Which are more performant, CTE or Temporary Tables?
It depends.
First of all
What is a Common Table Expression?
A (non-recursive) CTE is treated very similarly to other constructs that can also be used as inline table expressions in SQL Server: derived tables, views, and inline table valued functions. Note that whilst BOL says that a CTE "can be thought of as a temporary result set", this is a purely logical description. More often than not it is not materialized in its own right.
What is a temporary table?
This is a collection of rows stored on data pages in tempdb. The data pages may reside partially or entirely in memory. Additionally the temporary table may be indexed and have column statistics.
Test Data
CREATE TABLE T(A INT IDENTITY PRIMARY KEY, B INT , F CHAR(8000) NULL);
INSERT INTO T(B)
SELECT TOP (1000000) 0 + CAST(NEWID() AS BINARY(4))
FROM master..spt_values v1,
master..spt_values v2;
Example 1
WITH CTE1 AS
(
SELECT A,
ABS(B) AS Abs_B,
F
FROM T
)
SELECT *
FROM CTE1
WHERE A = 780
Notice that in the execution plan for this query there is no mention of CTE1. It just accesses the base tables directly and is treated the same as
SELECT A,
ABS(B) AS Abs_B,
F
FROM T
WHERE A = 780
Rewriting by materializing the CTE into an intermediate temporary table here would be massively counterproductive.
Materializing the CTE definition of
SELECT A,
ABS(B) AS Abs_B,
F
FROM T
would involve copying about 8GB of data into a temporary table, and then there is still the overhead of selecting from it too.
Example 2
WITH CTE2
AS (SELECT *,
ROW_NUMBER() OVER (ORDER BY A) AS RN
FROM T
WHERE B % 100000 = 0)
SELECT *
FROM CTE2 T1
CROSS APPLY (SELECT TOP (1) *
FROM CTE2 T2
WHERE T2.A > T1.A
ORDER BY T2.A) CA
The above example takes about 4 minutes on my machine.
Only 15 rows of the 1,000,000 randomly generated values match the predicate but the expensive table scan happens 16 times to locate these.
This would be a good candidate for materializing the intermediate result. The equivalent temp table rewrite took 25 seconds.
CREATE TABLE #T(A INT, B INT, F CHAR(8000), RN BIGINT);

INSERT INTO #T
SELECT *,
       ROW_NUMBER() OVER (ORDER BY A) AS RN
FROM T
WHERE B % 100000 = 0
SELECT *
FROM #T T1
CROSS APPLY (SELECT TOP (1) *
FROM #T T2
WHERE T2.A > T1.A
ORDER BY T2.A) CA
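Not part of the original timing, but worth noting: unlike the CTE, the temp table can also be indexed. A minimal sketch (the index name is made up); with only 15 rows the gain here would be marginal, but on larger intermediate results it can matter:
-- Optional: index the materialized result so each TOP (1) ... WHERE T2.A > T1.A
-- lookup in the CROSS APPLY can seek on A rather than scan #T.
CREATE CLUSTERED INDEX IX_TempT_A ON #T (A);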
Intermediate materialisation of part of a query into a temporary table can sometimes be useful even if it is only evaluated once - when it allows the rest of the query to be recompiled taking advantage of statistics on the materialized result. An example of this approach is in the SQL Cat article When To Break Down Complex Queries.
In some circumstances SQL Server will use a spool to cache an intermediate result, e.g. of a CTE, and avoid having to re-evaluate that sub-tree. This is discussed in the (migrated) Connect item Provide a hint to force intermediate materialization of CTEs or derived tables. However, no statistics are created on this, and even if the number of spooled rows were to be hugely different from the estimate, it is not possible for the in-progress execution plan to dynamically adapt in response (at least in current versions; Adaptive Query Plans may become possible in the future).
I'd say they are different concepts, but not so different as to be "chalk and cheese".
A temp table is good for re-use or to perform multiple processing passes on a set of data.
A CTE can be used either to recurse or to simply improve readability.
And, like a view or inline table valued function, it can also be treated like a macro to be expanded in the main query.
A temp table is another table with some rules around scope
I have stored procs where I use both (and table variables too)
CTEs have their uses - when the data in the CTE is small and there is a strong readability improvement, as with recursive queries. However, their performance is certainly no better than table variables, and when one is dealing with very large tables, temporary tables significantly outperform CTEs. This is because you cannot define indexes on a CTE (a CTE is simply expanded like a macro), which hurts when you have a large amount of data that requires joining with another table. If you are joining multiple tables with millions of rows of records in each, CTEs will perform significantly worse than temporary tables.
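A hedged sketch of the pattern this answer describes, with made-up table and column names: filter the large table into a temp table once, index it, and then join to it.
-- Materialize the expensive subset once (illustrative names).
SELECT  o.OrderID, o.CustomerID, o.Amount
INTO    #BigSubset
FROM    dbo.Orders AS o
WHERE   o.OrderDate >= DATEADD(YEAR, -1, GETDATE());

CREATE CLUSTERED INDEX IX_BigSubset_CustomerID ON #BigSubset (CustomerID);

-- The join can now use the index and the real row count / statistics.
SELECT  c.CustomerName, s.OrderID, s.Amount
FROM    #BigSubset AS s
        JOIN dbo.Customers AS c
          ON c.CustomerID = s.CustomerID;
The equivalent CTE would simply be expanded back into the outer query, with no index and no statistics on the intermediate result.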
Temp tables are backed by tempdb - so as long as your CTE can be held in memory, it would most likely be faster (like a table variable, too).
But then again, if the data load of your CTE (or table variable) gets too big, it will spill to disk too, so there's no big benefit.
In general, I prefer a CTE over a temp table since it's gone after I used it. I don't need to think about dropping it explicitly or anything.
So, no clear answer in the end, but personally, I would prefer CTE over temp tables.
So the query I was assigned to optimize was written with two CTEs in SQL Server. It was taking 28 seconds.
I spent two minutes converting them to temp tables and the query took 3 seconds.
I added an index to the temp table on the field it was being joined on and got it down to 2 seconds.
Three minutes of work and now it's running 12x faster, all by removing the CTEs. I personally will not use CTEs ever again; they are tougher to debug as well.
The crazy thing is the CTEs were both only used once, and still putting an index on them proved to be 50% faster.
I've used both, but in massive, complex procedures I have always found temp tables better to work with and more methodical. CTEs have their uses, but generally with small data.
For example, I've created sprocs that come back with the results of large calculations in 15 seconds, yet converting this code to run in a CTE I have seen it run in excess of 8 minutes to achieve the same results.
A CTE doesn't take any physical space. It is just a result set we can join to.
Temp tables are temporary. We can create indexes and constraints on them, just like on normal tables, but we need to define all the columns ourselves (see the sketch after the scope demo below).
A temp table's scope is limited to the session.
EX:
Open two SQL query windows.
create table #temp(empid int, empname varchar(50));
insert into #temp
select 101, 'xxx';
select * from #temp;
Run this query in the first window, then run the query below in the second window and you will see the difference (the second session cannot see #temp):
select * from #temp
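As a small illustration of the point above about indexes and constraints on temp tables (names are made up):
create table #emp
(
    empid   int         not null primary key,
    empname varchar(50) not null,
    salary  money       not null check (salary >= 0)
);
create index IX_emp_name on #emp (empname);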
Late to the party, but...
The environment I work in is highly constrained, supporting some vendor products and providing "value-added" services like reporting. Due to policy and contract limitations, I am not usually allowed the luxury of separate table/data space and/or the ability to create permanent code [it gets a little better, depending upon the application].
IOW, I can't usually develop a stored procedure or UDFs or temp tables, etc. I pretty much have to do everything through MY application interface (Crystal Reports - add/link tables, set where clauses from w/in CR, etc.). One SMALL saving grace is that Crystal allows me to use COMMANDS (as well as SQL Expressions). Some things that aren't efficient through the regular add/link tables capability can be done by defining a SQL Command. I use CTEs through that and have gotten very good results "remotely". CTEs also help w/ report maintenance, not requiring that code be developed, handed to a DBA to compile, encrypt, transfer, install, and then require multiple-level testing. I can do CTEs through the local interface.
The downside of using CTEs w/ CR is that each report is separate. Each CTE must be maintained for each report. Where I can do SPs and UDFs, I can develop something that can be used by multiple reports, requiring only linking to the SP and passing parameters as if you were working on a regular table. CR is not really good at handling parameters into SQL Commands, so that aspect of the CR/CTE combination can be lacking. In those cases, I usually try to define the CTE to return enough data (but not ALL data), and then use the record selection capabilities in CR to slice and dice that.
So... my vote is for CTEs (until I get my data space).
One use where I found CTEs excelled performance-wise was where I needed to join a relatively complex query onto a few tables which had a few million rows each.
I used the CTE to first select the subset based on the indexed columns, cutting these tables down to a few thousand relevant rows each, and then joined the CTE to my main query. This dramatically reduced the runtime of my query.
Whilst results for the CTE are not cached and table variables might have been a better choice, I really just wanted to try them out and found they fit the above scenario.
I just tested this - both the CTE and non-CTE versions (where the query was typed out for every union instance) took ~31 seconds. The CTE made the code much more readable though - it cut it down from 241 to 130 lines, which is very nice. A temp table, on the other hand, cut it down to 132 lines and took FIVE SECONDS to run. No joke. All of this testing was cached - the queries were all run multiple times before.
This is a really open-ended question, and it all depends on how it is being used and the type of temp table (table variable or traditional temp table).
A traditional temp table stores its data in tempdb, which adds some overhead; table variables are also backed by tempdb, but for small data sets they avoid some of that overhead (no statistics, fewer recompilations).
From my experience in SQL Server, I found one scenario where a CTE outperformed temp tables.
I needed to use a data set (~100,000 rows) from a complex query just ONCE in my stored procedure.
A temp table was causing overhead and my procedure was performing slowly (as temp tables are real materialized tables that exist in tempdb and persist for the life of the current procedure).
On the other hand, with a CTE, the CTE persists only until the following query is run. So a CTE is a handy structure with limited scope; CTEs don't use tempdb by default.
This is one scenario where CTEs can really help simplify your code and outperform temp tables.
I used 2 CTEs, something like:
WITH CTE1 (ID, Name, Display)
     AS (SELECT ID, Name, Display FROM Table1 WHERE <Some Condition>),
     CTE2 (ID, Name, <col3>)
     AS (SELECT ID, Name, <col3> FROM CTE1 INNER JOIN Table2 ON <Some Condition>)
SELECT CTE2.ID, CTE2.<col3>
FROM CTE2
GO
Related
Is it possible to use join hint with a cross join in T-SQL? If so what is the syntax?
select *
from tableA
cross ? join tableB
Based on your comments
I am trying to fix my execution plan, my estimated rows are very off
in the nested loop join. I have changed a cursor to a cross join...
The code is faster now with the cross join, but I want to make it even
faster. So I just want to experiment with a join hint...
I have 900 out of 2,000,000 as actual and estimated for the nested loop
join... And I think it is the step where the cross join is happening...
It is a table from ETL, so there is a lot of new data every day...
I have a few suggestions
Don't go straight for a cross join. If it's doing a nested loop join because of really bad cardinality estimation, try using a hash join hint instead
It definitely can help to have statistics up-to-date (research the 'Ascending Key Problem' for info). However, you may want to check if your statistics are set to auto-update and whether they get triggered (e.g., after the ETL, view the properties of the statistics to see when they were last updated etc)
Try to fix the bad cardinality estimate. One way is to split the bigger tasks into smaller tasks (e.g., into temporary tables).
On the chance you're using table variables (e.g., DECLARE @temptable TABLE) rather than temporary tables (e.g., CREATE TABLE #TempTable), then stop it. Variables (including table variables) don't have statistics; older versions often assume 1 row in table variables. SQL Server 2019 (as long as you're in the latest compatibility mode) changes this somewhat, but it still has some big issues.
When you get it down to the one operation that has the bad cardinality estimate, you can also do things like adding indexes etc. to help with that estimate (remember - you can put indexes and primary keys on temporary tables - they can speed up processing too if the table is accessed multiple times). A sketch of the table-variable/temp-table contrast follows this list.
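A hedged sketch of the contrast in those last two points (object names are illustrative):
-- Table variable: no statistics; older versions estimate 1 row regardless of contents.
DECLARE @Staging TABLE (ID INT PRIMARY KEY, Amount MONEY);

-- Temporary table: gets column statistics, can take extra indexes, and can trigger
-- recompiles so the plan reflects the real row count.
CREATE TABLE #Staging (ID INT PRIMARY KEY, Amount MONEY);
CREATE INDEX IX_Staging_Amount ON #Staging (Amount);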
I have a Snowflake query with multiple CTEs, inserting into a table using a Talend job. It takes more than 90 minutes to execute the query. It consists of multiple cascading CTEs, each one calling the next.
I want to improve the performance of the query. It is about 1000 lines of code and I can't paste it here. When I checked the query profile, it showed that the window functions and aggregate functions are what slow the query down.
For example, the slowest is
ROW_NUMBER() OVER (PARTITION BY LOWER(S.SUBSCRIPTIONID)
ORDER BY S.ISROWCURRENT DESC NULLS FIRST,
TO_NUMBER(S.STARTDATE) DESC NULLS FIRST,
IFF(S.ENDDATE IS NULL, '29991231', S.ENDDATE) DESC NULLS FIRST)
takes 7.3% of the time. Can you suggest an alternative way to improve the performance of the query please?
The problem is that 1000 lines are very hard for any query analyzer to optimize. It also makes troubleshooting a lot harder for you and for a future team member who inherits the code.
I recommend breaking the query up and these optimizations:
Use CREATE TEMPORARY TABLE AS instead of CTEs. Add ORDER BY as you create the table on the column that you will join or filter on. The temporary tables are easier for the optimizer to build and later use. The ORDER BY helps Snowflake know what to optimize for with subsequent joins to other tables. They're also easier for troubleshooting.
In your example, see if you can persist this data as permanent columns so that Snowflake can skip the conversion portion and have better statistics on it: TO_NUMBER(S.STARTDATE) and IFF(S.ENDDATE IS NULL, '29991231', S.ENDDATE).
As an alternative to step 2, instead of sorting by startDate and endDate, see if you can add an IDENTITY or SEQUENCE, or populate an INTEGER column, which you can use as the sort key. You can also literally name this new column sortKey. Sorting an integer will be significantly faster than running a function on a DATETIME and then ordering by it.
If any of the CTEs can be changed into materialized views, they will be pre-built and significantly faster.
Finally stage all of the data in a temporary table - ordered by the same columns that your target table was created in - before you insert it. This will make the insert step itself quicker and Snowflake will have an easier time handling a concurrent change to that table.
Notes:
To create a temporary table:
create or replace temporary table table1 as select * from dual;
After that, you refer to table1 in your code instead of the CTE.
Materialized views are documented here. They are an Enterprise edition feature. The syntax is: create materialized view mymv as select col1, col2 from mytable;
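A hedged sketch of how suggestions 1 and 3 might look together (the source table and column names here are illustrative, since the real query isn't posted):
CREATE OR REPLACE TEMPORARY TABLE subs_stage AS
SELECT  S.SUBSCRIPTIONID,
        S.ISROWCURRENT,
        S.STARTDATE,
        S.ENDDATE,
        -- an integer sort key so later steps don't re-run conversions on dates
        ROW_NUMBER() OVER (ORDER BY TO_NUMBER(S.STARTDATE) DESC NULLS FIRST) AS sortKey
FROM    subscriptions S
ORDER BY LOWER(S.SUBSCRIPTIONID);   -- order on the join/filter column, per suggestion 1
Each subsequent step then reads from subs_stage (and its sortKey) instead of re-evaluating the CTE chain.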
So I see this quite often in SQL Server 2012 and have also seen this SQL Server 2008 R2. Suppose I have a query:
Select *
From function1(@param1, @param2) f1
INNER JOIN function2(@param1, @param2) f2 ON
f1.key = f2.key
This will take about 5 minutes to run because I think (and maybe I'm dead wrong but it looks like) it evaluates function2 again when it gets a new row from function1. Now if I rewrite it as:
Select *
Into #f1
From function1(@param1, @param2)
Select *
Into #f2
From function2(@param1, @param2)
Select *
From #f1 f1
INNER JOIN #f2 f2 on
f1.key = f2.key
This will take 3 seconds to run. I don't understand why the optimizer decides to evaluate these scenarios differently. Is there a hint I can use so I don't have to do this workaround? Why is it happening?
You can't exactly compare the use of temporary tables to a function call. When you create the temporary table and then run a query, the compiler knows something important about the tables -- how big they are. This information is then used for execution.
I don't think the functions are called multiple times, even in the first case.
So, I suspect that the issue is the size of the tables and the join algorithm that is then used. In SQL Server 2014, you might try a memory optimized table.
You could also try CTEs, although I don't think that will help (because the CTEs are only evaluated after compilation, so the compiler doesn't know how big they are):
with f1 as (
Select *
From function1(@param1, @param2)
),
f2 as (
Select *
From function2(@param1, @param2)
)
Select *
From f1 INNER JOIN
f2
on f1.key = f2.key;
Another option is to use a query hint to force a hash or merge join.
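For reference, a hedged sketch of what such a hint could look like on the original query (parameter and function names come from the question):
SELECT *
FROM   function1(@param1, @param2) f1
       INNER JOIN function2(@param1, @param2) f2
         ON f1.[key] = f2.[key]
OPTION (HASH JOIN);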
There are a few things going on here.
First, if you were talking about a scalar UDF, or using CROSS APPLY / OUTER APPLY, then yes, it would run those for each row.
However, in the case of joining two TVFs, you need to consider the following:
Inline TVFs are really just Views that can accept parameters. Because of this, their definition, just like what happens with Views, gets inserted into the query that is using the TVF. This allows the end-result query to be optimizable, which is why they perform so much better than Multi-statement TVFs. Inline TVFs might JOIN OK in the way that you are using them.
Multi-statement TVFs:
cannot be optimized. They appear to the Query Optimizer as always returning a fixed number of rows (1 row in SQL Server 2012 and earlier; 100 rows starting with SQL Server 2014). This fixed estimate can really throw off the execution plan if it is much higher or lower than what really gets returned.
have no means of maintaining statistics on their fields. On the other hand, when you join on columns of temporary tables, SQL Server will generate statistics for those columns and use that info to come up with a more accurate execution plan.
What can you do? Well, I suspect based on the 5 minutes vs 3 seconds difference that your TVFs are Multi-statement instead of Inline. If at all possible, convert them to be Inline TVFs (you will be very glad that you did).
Outside of that, if you have a solution that works in 3 seconds compared to 5 minutes for the alternative, is there really a problem? You could also create the two temporary tables via CREATE TABLE rather than SELECT INTO, which might help a tiny bit.
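If the definitions allow it, an Inline TVF conversion has roughly the following shape (a hedged sketch with made-up body and column names, not the asker's actual function):
CREATE FUNCTION dbo.function1_inline (@param1 INT, @param2 INT)
RETURNS TABLE
AS
RETURN
(
    -- Single SELECT, no BEGIN/END or table variable: the definition is expanded
    -- into the calling query like a view, so the optimizer sees the base tables.
    SELECT t.[key], t.SomeColumn
    FROM   dbo.SomeBaseTable AS t
    WHERE  t.Filter1 = @param1
      AND  t.Filter2 = @param2
);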
I have a table with LocationID, Lat, and Long columns containing 280,000 records.
I want to insert every variation of matches into a new table.
For example, with records A, B, and C I would end up with AB, BC, and AC.
My TSQL Query is
INSERT INTO Distances (ID1, ID2, Distance)
SELECT a1.ID, a2.ID, 0
FROM Location a1
JOIN Location a2 ON a1.ID <> a2.ID
I then wish to run another query that will update the Distance column from 0 using a working scalar function and the lats and longs. However, just the insert statement is taking 40+ minutes to run.
I thought I could save the Locations table into a faster database (maybe JsonDB?) but have no experience with other databases and am not sure which would be fastest.
I am running Windows 10 and prefer a GUI.
The database for processing must allow for scalar style functions that can do math operations on the lat/longs.
Any suggestions?
Make sure you have no indexes defined and add the hint "WITH (TABLOCKX)" after the table name. That should give you minimal logging on the table and should be somewhat faster.
Also, do the calculation as part of the insert. An update on such a large table will give you a MASSIVE transaction log, and may even fail because of its size. When doing large updates on SQL Server, it can be more effective to create a new table than to update an existing one, because an insert can be persuaded to do minimal logging rather than full logging.
You can also halve the size of your table by realising that it is actually symmetrical; run the join as "<" rather than "<>". If you really need both directions you can create a view on top afterwards.
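Putting those three points together, a hedged sketch (assuming the asker's existing scalar function is called dbo.CalcDistance; the real name isn't given):
INSERT INTO Distances WITH (TABLOCKX) (ID1, ID2, Distance)
SELECT a1.ID,
       a2.ID,
       dbo.CalcDistance(a1.Lat, a1.Long, a2.Lat, a2.Long)  -- compute during the insert
FROM   Location a1
       JOIN Location a2
         ON a1.ID < a2.ID;   -- "<" halves the row count versus "<>"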
I am re-iterating the question asked by Mongus Pong Why would using a temp table be faster than a nested query? which doesn't have an answer that works for me.
Most of us at some point find that when a nested query reaches a certain complexity it needs to be broken into temp tables to keep it performant. It is absurd that this could ever be the most practical way forward, and it means these processes can no longer be made into a view. And often third-party BI apps will only play nicely with views, so this is crucial.
I am convinced there must be a simple queryplan setting to make the engine just spool each subquery in turn, working from the inside out. No second guessing how it can make the subquery more selective (which it sometimes does very successfully) and no possibility of correlated subqueries. Just the stack of data the programmer intended to be returned by the self-contained code between the brackets.
It is common for me to find that simply changing from a subquery to a #table takes the time from 120 seconds to 5. Essentially the optimiser is making a major mistake somewhere. Sure, there may be very time consuming ways I could coax the optimiser to look at tables in the right order but even this offers no guarantees. I'm not asking for the ideal 2 second execute time here, just the speed that temp tabling offers me within the flexibility of a view.
I've never posted on here before but I have been writing SQL for years and have read the comments of other experienced people who've also just come to accept this problem and now I would just like the appropriate genius to step forward and say the special hint is X...
There are a few possible explanations as to why you see this behavior. Some common ones are
The subquery or CTE may be being repeatedly re-evaluated.
Materialising partial results into a #temp table may force a more optimum join order for that part of the plan by removing some possible options from the equation.
Materialising partial results into a #temp table may improve the rest of the plan by correcting poor cardinality estimates.
The most reliable method is simply to use a #temp table and materialize it yourself.
Failing that, regarding point 1, see Provide a hint to force intermediate materialization of CTEs or derived tables. The use of TOP (large_number) ... ORDER BY can often encourage the result to be spooled rather than repeatedly re-evaluated.
Even if that works however there are no statistics on the spool.
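To make that concrete, a hedged sketch using the CTE2 example from Example 2 earlier on this page; there is no guarantee the optimizer will actually introduce a spool:
WITH CTE2 AS
(
    SELECT TOP (2147483647) *,
           ROW_NUMBER() OVER (ORDER BY A) AS RN
    FROM   T
    WHERE  B % 100000 = 0
    ORDER  BY A
)
SELECT *
FROM   CTE2 T1
       CROSS APPLY (SELECT TOP (1) *
                    FROM   CTE2 T2
                    WHERE  T2.A > T1.A
                    ORDER  BY T2.A) CA;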
For points 2 and 3 you would need to analyse why you weren't getting the desired plan. Possibly rewriting the query to use sargable predicates, or updating statistics might get a better plan. Failing that you could try using query hints to get the desired plan.
I do not believe there is a query hint that instructs the engine to spool each subquery in turn.
There is the OPTION (FORCE ORDER) query hint, which forces the engine to perform the JOINs in the order specified and could potentially coax it into achieving that result in some instances. This hint will sometimes result in a more efficient plan for a complex query where the engine keeps insisting on a sub-optimal plan. Of course, the optimizer should usually be trusted to determine the best plan.
Ideally there would be a query hint that would allow you to designate a CTE or subquery as "materialized" or "anonymous temp table", but there is not.
Another option (for future readers of this article) is to use a user-defined function. Multi-statement functions (as described in How to Share Data between Stored Procedures) appear to force SQL Server to materialize the results of your subquery. In addition, they allow you to specify primary keys and indexes on the resulting table to help the query optimizer. This function can then be used in a select statement as part of your view. For example:
CREATE FUNCTION SalesByStore (@storeid varchar(30))
RETURNS @t TABLE (title varchar(80) NOT NULL PRIMARY KEY,
                  qty smallint NOT NULL) AS
BEGIN
    INSERT @t (title, qty)
    SELECT t.title, s.qty
    FROM sales s
    JOIN titles t ON t.title_id = s.title_id
    WHERE s.stor_id = @storeid
    RETURN
END
GO

CREATE VIEW SalesData AS
SELECT * FROM SalesByStore('6380')
Having run into this problem, I found out that (in my case) SQL Server was evaluating the conditions in the wrong order, because I had an index that could be used (IDX_CreatedOn on TableFoo).
SELECT bar.*
FROM
(SELECT * FROM TableFoo WHERE Deleted = 1) foo
JOIN TableBar bar ON (bar.FooId = foo.Id)
WHERE
foo.CreatedOn > DATEADD(DAY, -7, GETUTCDATE())
I managed to work around it by forcing the subquery to use another index (i.e. one that would be used when the subquery was executed without the parent query). In my case I switched to PK, which was meaningless for the query, but allowed the conditions from the subquery to be evaluated first.
SELECT bar.*
FROM
(SELECT * FROM TableFoo WITH (INDEX([PK_Id])) WHERE Deleted = 1) foo
JOIN TableBar bar ON (bar.FooId = foo.Id)
WHERE
foo.CreatedOn > DATEADD(DAY, -7, GETUTCDATE())
Filtering by the Deleted column was really simple and filtering the few results by CreatedOn afterwards was even easier. I was able to figure it out by comparing the Actual Execution Plan of the subquery and the parent query.
A more hacky solution (and not really recommended) is to force the subquery to get executed first by limiting the results using TOP, however this could lead to weird problems in the future if the results of the subquery exceed the limit (you could always set the limit to something ridiculous). Unfortunately TOP 100 PERCENT can't be used for this purpose since SQL Server just ignores it.