I have seen this question quite a few times, but I couldn't find an answer that satisfied me. Basically, what people and books say is: "Although temporary tables are deleted when they go out of scope, you should explicitly delete them when they are no longer needed to reduce resource requirements on the server".
It is quite clear to me that when you are working in Management Studio and creating temporary tables, those tables consume resources until you close the window or disconnect, so logically it is better to drop them.
But when you work with a stored procedure, if you want to clean up temporary tables you will most probably do it at the very end (I am not talking about the situation where you drop a table as soon as it is no longer needed inside the procedure). So the workflow is something like this:
When you drop in SP:
Start of SP execution
Doing some stuff
Drop tables
End of execution
And, as far as I understand it, this is what happens when you do not drop:
Start of SP execution
Doing some stuff
End of execution
Drop tables
What's the difference here? I can only imagine that some resources are needed to identify the temporary tables. Any other thoughts?
UPDATE:
I ran a simple test with two stored procedures:
create procedure test as
begin
create table #temp (a int)
insert into #temp values (1);
drop table #temp;
end
and another one without the drop statement. I enabled client statistics and ran the tests:
declare @i int = 0;
while @i < 10000
begin
exec test;
set @i = @i + 1;
end
This is what I got (trials 1-3 drop the table in the SP, trials 4-6 do not):
As the picture shows, all stats are the same or decrease slightly when I do not drop the temporary table.
UPDATE2:
I ran this test a second time, now with 100k calls, and also added SET NOCOUNT ON. These are the results:
The second run confirmed that if you do not drop the table in the SP, you actually save some user time, because the cleanup is done by an internal process outside of the user time.
You can read more about this in Paul White's article, Temporary Tables in Stored Procedures:
CREATE and DROP, Don’t
I will talk about this in much more detail in my next post, but the
key point is that CREATE TABLE and DROP TABLE do not create and drop
temporary tables in a stored procedure, if the temporary object can be
cached. The temporary object is renamed to an internal form when DROP
TABLE is executed, and renamed back to the same user-visible name when
CREATE TABLE is encountered on the next execution. In addition, any
statistics that were auto-created on the temporary table are also
cached. This means that statistics from a previous execution remain
when the procedure is next called.
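One rough way to see this caching on your own server (a sketch; it assumes the standard counter name exposed by sys.dm_os_performance_counters) is to read the cumulative temp-table creation counter before and after running the test loop above. If the value barely moves across thousands of executions, the temporary object is being cached rather than created each time:

-- Cumulative count of temp tables created since the instance started
-- (the counter name is padded with trailing spaces in the DMV, hence LIKE).
SELECT cntr_value AS temp_tables_created
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE 'Temp Tables Creation Rate%';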
Technically, a locally scoped temp table (one with a single # before it) automatically goes out of scope when your SPID is closed. There are some very odd cases where you get a temp table definition cached somewhere with no real way to remove it. Usually that happens when you have a nested stored procedure call that contains a temp table with the same name.
It's a good habit to drop your tables when you're done with them, but unless something unexpected happens, they should go out of scope anyway once the proc finishes.
Related
I've created a stored procedure to add data to a table. In mock fashion the steps are:
truncate original table
Select data into the original table
The query that selects data into the original table is quite long (it can take almost a minute to complete), which means that the table is then empty of data for over a minute.
To fix this empty table I changed the stored procedure to:
select data into #temp table
truncate Original table
insert * from #temp into Original
While the stored procedure was running, I did a select * on the original table and it was empty (refreshing, it stayed empty until the stored procedure completed).
Does the truncate happen at the beginning of the procedure no matter where it actually is in the code? If so, is there something else I can do to control when the data is deleted?
A very interesting method to move data into a table very quickly is to use partition switching.
Create two staging tables, myStaging1 and myStaging2, with the new data in myStaging2. They must be in the same DB and the same filegroup (so not temp tables or table variables), with the EXACT same columns, PKs, FKs and indexes.
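For illustration, a minimal sketch of that setup (the table and column definitions here are invented; the real staging tables must mirror myTargetTable exactly, including constraints and indexes):

-- Hypothetical target plus two staging tables with identical definitions.
CREATE TABLE dbo.myTargetTable (id INT NOT NULL PRIMARY KEY, payload VARCHAR(100) NOT NULL);
CREATE TABLE dbo.myStaging1    (id INT NOT NULL PRIMARY KEY, payload VARCHAR(100) NOT NULL);
CREATE TABLE dbo.myStaging2    (id INT NOT NULL PRIMARY KEY, payload VARCHAR(100) NOT NULL);

-- Load the fresh data into myStaging2 before running the switch below.
INSERT INTO dbo.myStaging2 (id, payload)
SELECT id, payload
FROM dbo.mySourceData;   -- placeholder for the long-running SELECT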
Then run this:
SET XACT_ABORT, NOCOUNT ON; -- force immediate rollback if session is killed
BEGIN TRAN;
ALTER TABLE myTargetTable SWITCH TO myStaging1
WITH ( WAIT_AT_LOW_PRIORITY ( MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = BLOCKERS ));
-- not strictly necessary to use WAIT_AT_LOW_PRIORITY but better for blocking
-- use SELF instead of BLOCKERS to kill your own session
ALTER TABLE myStaging2 SWITCH TO myTargetTable
WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = BLOCKERS));
-- force blockers off immediately
COMMIT TRAN;
TRUNCATE TABLE myStaging1;
This is extremely fast, as it's just a metadata change.
You may ask: partitioning is only supported on Enterprise Edition (or Developer), so how does that help?
Switching non-partitioned tables between each other is still allowed even in Standard or Express Editions.
See this article by Kendra Little for further info on this technique.
The SP is being called by code in an HTTP GET, so I didn't want the table to be empty for over a minute during the refresh. When I asked the question I was using a SELECT * from the table to test, but just now I tested by hitting the endpoint in Postman and never received an empty response. So it appears that putting the TRUNCATE later in the SP did work.
In MS SQL Server, I'm using a global temp table to store session-related information passed by the client, and then I use that information inside triggers.
Since the same global temp table can be used in different sessions, and it may or may not exist when I want to write into it (depending on whether all the previous sessions which used it are closed), I check whether the global temp table exists and create it if it doesn't before writing into it:
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
    CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
        session_id    smallint,
        login_time    datetime,
        HstryUserName VDT_USERNAME,
        HstryTaskName VDT_TASKNAME
    )

MERGE ##VTT_CONTEXT_INFO_USER_TASK AS target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) AS source (session_id, HstryUserName, HstryTaskName)
    ON (target.session_id = source.session_id)
WHEN MATCHED THEN
    UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
    INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
The problem is that between my check for the table's existence and the MERGE statement, SQL Server may drop the temp table if all the sessions which were using it happen to close at that exact instant (this actually happened in my tests).
Is there a best practice for avoiding this kind of concurrency issue, so that the table is not dropped between the check for its existence and its subsequent use?
The notion of "global temporary table" and "trigger" just do not click. Tables are permanent data stores, as are their attributes -- including triggers. Temporary tables are dropped when the server is re-started. Why would anyone design a system where a permanent block of code (trigger) depends on a temporary shared storage mechanism? It seems like a recipe for failure.
Instead of a global temporary table, use a real table. If you like, put a helpful prefix such as temp_ in front of the name. If the table is being shared by databases, then put it in a database where all code has access.
Create the table once and leave it there (deleting the rows is fine) so the trigger code can access it.
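A minimal sketch of that idea (column types here are placeholders; the asker's user-defined types VDT_USERNAME and VDT_TASKNAME would replace the plain varchar columns):

-- Permanent context table, created once and shared by all sessions.
CREATE TABLE dbo.temp_context_info_user_task (
    session_id    smallint     NOT NULL PRIMARY KEY,
    login_time    datetime     NOT NULL,
    HstryUserName varchar(128) NOT NULL,
    HstryTaskName varchar(128) NOT NULL
);

-- A trigger can then read the current session's context like this:
-- SELECT HstryUserName, HstryTaskName
-- FROM dbo.temp_context_info_user_task
-- WHERE session_id = @@SPID;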
I'll start by saying that, in the long term, I will follow Gordon's advice, i.e. I will take the necessary steps to introduce a normal table in the database to store client application information which needs to be accessible in the triggers.
But since this was not really possible right now because of time constraints (it takes weeks to get the necessary formal approvals for a new normal table), I came up with a solution for preventing SQL Server from dropping the global temp table between the check for its existence and the MERGE statement.
There is some information out there about when a global temp table is dropped by SQL Server; my personal tests showed that SQL Server drops a global temp table the moment the session which created it is closed and any other transactions started in other sessions which changed data in that table are finished.
My solution was to fake data changes on the global temp table even before I check for its existence. If the table exists at that moment, SQL Server will then know that it needs to keep it until the current transaction finishes, so it cannot be dropped after the check for its existence. The code now looks like this (properly commented, since it is kind of a hack):
-- Faking a delete on the table ensures that SQL Server will keep the table until the end of the transaction
-- Since ##VTT_CONTEXT_INFO_USER_TASK may actually not exist, we need to fake the delete inside TRY .. CATCH
-- FUTURE 2016, Feb 03: A cleaner solution would use a real table instead of a global temp table.
BEGIN TRY
-- Because schema errors are detected at compile time they cannot be caught by TRY directly;
-- wrapping the query in sp_executesql works around this
DECLARE @QueryText NVARCHAR(100) = 'DELETE ##VTT_CONTEXT_INFO_USER_TASK WHERE 0 = 1'
EXEC sp_executesql @QueryText
END TRY
BEGIN CATCH
-- nothing to do here (see comment above)
END CATCH
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
    CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
        session_id    smallint,
        login_time    datetime,
        HstryUserName VDT_USERNAME,
        HstryTaskName VDT_TASKNAME
    )

MERGE ##VTT_CONTEXT_INFO_USER_TASK AS target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) AS source (session_id, HstryUserName, HstryTaskName)
    ON (target.session_id = source.session_id)
WHEN MATCHED THEN
    UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
    INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
Although I would call it a "use it at your own risk" solution, it does prevent the use of the global temp table in other sessions from affecting its use in the current one, which was the concern that made me start this thread.
Thanks all for your time! (from text formatting edits to replies)
I have a table with about 300,000 rows and 30 columns. How can I quickly clear it out? If I run a DELETE FROM MyTable query, it takes a long time. I'm trying the following stored procedure to basically make a copy of the table with no data, drop the original table, and rename the new table as the original:
USE [myDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[ClearTheTable]
AS
BEGIN
SET NOCOUNT ON;
SELECT * INTO tempMyTable FROM MyTable WHERE 1 = 0;
DROP TABLE MyTable
EXEC sp_rename tempMyTable, MyTable;
END
This took nearly 7 minutes to run. Is there a faster way to do it? I don't need any logging, rollback or anything of that nature.
If it matters, I'm calling the stored procedure from a C# app. I guess I could write some code to recreate the table from scratch after doing a DROP TABLE, but I didn't want to have to recompile the app any time a column is added or changed.
Thanks!
EDIT
Here's my updated stored procedure:
USE [myDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[ClearTheTable]
AS
BEGIN
SET NOCOUNT ON;
ALTER DATABASE myDatabase
SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE
TRUNCATE TABLE MyTable
ALTER DATABASE myDatabase
SET MULTI_USER
END
The best way to clear a table is with TRUNCATE.
Since you are creating and dropping the table, I'll assume you have no constraints.
TRUNCATE TABLE <target table>
Some advantages:
Less transaction log space is used.
The DELETE statement removes rows one at a time and records an entry
in the transaction log for each deleted row. TRUNCATE TABLE removes
the data by deallocating the data pages used to store the table data
and records only the page deallocations in the transaction log.
Fewer locks are typically used.
When the DELETE statement is executed using a row lock, each row in
the table is locked for deletion. TRUNCATE TABLE always locks the
table (including a schema (SCH-M) lock) and page but not each row.
Without exception, zero pages are left in the table.
After a DELETE statement is executed, the table can still contain
empty pages. For example, empty pages in a heap cannot be deallocated
without at least an exclusive (LCK_M_X) table lock. If the delete
operation does not use a table lock, the table (heap) will contain
many empty pages. For indexes, the delete operation can leave empty
pages behind, although these pages will be deallocated quickly by a
background cleanup process.
In a script used for interactive analysis of subsets of data, it is often useful to store the results of queries into temporary tables for further analysis.
Many of my analysis scripts contain this structure:
CREATE TABLE #Results (
a INT NOT NULL,
b INT NOT NULL,
c INT NOT NULL
);
INSERT INTO #Results (a, b, c)
SELECT a, b, c
FROM ...
SELECT *
FROM #Results;
In SQL Server, temporary tables are connection-scoped, so the query results persist after the initial query execution. When the subset of data I want to analyze is expensive to calculate, I use this method instead of using a table variable because the subset persists across different batches of queries.
The setup part of the script is run once, and following queries (SELECT * FROM #Results is a placeholder here) are run as often as necessary.
Occasionally, I want to refresh the subset of data in the temporary table, so I run the entire script again. One way to do this would be to create a new connection by copying the script to a new query window in Management Studio, but I find this difficult to manage.
Instead, my usual workaround is to precede the create statement with a conditional drop statement like this:
IF OBJECT_ID(N'tempdb.dbo.#Results', 'U') IS NOT NULL
BEGIN
DROP TABLE #Results;
END;
This statement correctly handles two situations:
On the first run when the table does not exist: do nothing.
On subsequent runs when the table does exist: drop the table.
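(Aside, not part of the original scripts: on SQL Server 2016 and later, the same conditional drop can be written in a single statement.)

-- Equivalent one-liner on SQL Server 2016+; does nothing if the table is absent.
DROP TABLE IF EXISTS #Results;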
Production scripts written by me would always use the OBJECT_ID method because it raises no errors in the two expected situations.
Some equivalent scripts written by my fellow developers sometimes handle these two situations using exception handling:
BEGIN TRY DROP TABLE #Results END TRY BEGIN CATCH END CATCH
I believe in the database world it is better always to ask permission than seek forgiveness, so this method makes me uneasy.
The second method swallows an error while taking no action to handle non-exceptional behavior (table does not exist). Also, it is possible that an error would be raised for a reason other than that the table does not exist.
The Wise Owl warns about the same thing:
Of the two methods, the [OBJECT_ID method] is more difficult to understand but
probably better: with the [BEGIN TRY method], you run the risk of trapping
the wrong error!
But it does not explain what the practical risks are.
In practice, the BEGIN TRY method has never caused problems in systems I maintain, so I'm happy for it to stay there.
What possible dangers are there in managing temporary table existence using BEGIN TRY method? What unexpected errors are likely to be concealed by the empty catch block?
What possible dangers? What unexpected errors are likely to be concealed?
If the TRY...CATCH block is inside a transaction, it will cause a failure.
BEGIN
BEGIN TRANSACTION t1;
SELECT 1
BEGIN TRY DROP TABLE #Results END TRY BEGIN CATCH END CATCH
COMMIT TRANSACTION t1;
END
This batch will fail with an error like this:
Msg 3930, Level 16, State 1, Line 7
The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
Msg 3998, Level 16, State 1, Line 1
Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
Books Online documents this behavior:
Uncommittable Transactions and XACT_STATE
If an error generated in a TRY block causes the state of the current transaction to be invalidated, the transaction is classified as an uncommittable transaction. An error that ordinarily ends a transaction outside a TRY block causes a transaction to enter an uncommittable state when the error occurs inside a TRY block. An uncommittable transaction can only perform read operations or a ROLLBACK TRANSACTION. The transaction cannot execute any Transact-SQL statements that would generate a write operation or a COMMIT TRANSACTION.
Now replace the TRY/CATCH with the OBJECT_ID test method:
BEGIN
BEGIN TRANSACTION t1;
SELECT 1
IF OBJECT_ID(N'tempdb.dbo.#Results', 'U') IS NOT NULL
BEGIN
DROP TABLE #Results;
END;
COMMIT TRANSACTION t1;
END
and run it again. The transaction will commit without any error.
A better solution may be to use a table variable rather than a temporary table, i.e.:
declare @results table (
a INT NOT NULL,
b INT NOT NULL,
c INT NOT NULL
);
I also think that a TRY block is dangerous because it can hide an unexpected problem. Some programming languages can catch only selected errors and let unexpected ones propagate; if your programming language has this functionality, use it (T-SQL cannot catch a specific error).
In your scenario, I write code exactly like yours, with this TRY/CATCH block.
The desirable behavior would be:
begin try
drop table #my_temp_table
end try
begin catch __table_dont_exists_error__
end catch
But this doesn't exist! So you can write something like:
begin try
drop table #my_temp_table
end try
begin catch
declare @err_n int, @err_d varchar(MAX)
SELECT
    @err_n = ERROR_NUMBER(),
    @err_d = ERROR_MESSAGE();
IF @err_n <> 3701
    raiserror( @err_d, 16, 1 )
end catch
This will re-raise the error whenever the error from dropping the table is anything other than 'table does not exist' (error 3701).
Note that for your issue all this code is not worth it, but it can be useful in other situations. For your problem, the elegant solution is to drop the table only if it exists, or to use a table variable.
Not in your question, but possibly overlooked, are the resources used by the temp table. I always drop the table at the end of the script so it does not tie up resources. What if you put a million rows in the table? I also test for the table at the start of the script to handle the case where there was an error in the last run and the table was not dropped. If you want to reuse the temp table, at least clear out the rows.
A table variable is another option. It is lighter weight but has limitations. Avoid a table variable if you are going to use it in a query join, as the query optimizer does not handle a table variable as well as it does a temp table.
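A small sketch of why that matters (object names are arbitrary): a table variable has no statistics, so when it is joined the optimizer typically works from a very low row estimate, which can produce a poor plan, while the equivalent #temp table gets real statistics.

-- Table variable: no statistics, so the optimizer guesses the row count.
DECLARE @ids TABLE (id INT PRIMARY KEY);
INSERT INTO @ids SELECT object_id FROM sys.objects;

SELECT o.name
FROM sys.objects AS o
JOIN @ids AS i ON i.id = o.object_id;   -- estimate may be far below the real row count

-- Temp table: statistics are created, so the join estimate is much closer.
SELECT object_id AS id INTO #ids FROM sys.objects;

SELECT o.name
FROM sys.objects AS o
JOIN #ids AS i ON i.id = o.object_id;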
SQL documentation:
If more than one temporary table is created inside a single stored procedure or batch, they must have different names.
If a local temporary table is created in a stored procedure or application that can be executed at the same time by several users, the Database Engine must be able to distinguish the tables created by the different users. The Database Engine does this by internally appending a numeric suffix to each local temporary table name. The full name of a temporary table as stored in the sysobjects table in tempdb is made up of the table name specified in the CREATE TABLE statement and the system-generated numeric suffix. To allow for the suffix, table_name specified for a local temporary name cannot exceed 116 characters.
Temporary tables are automatically dropped when they go out of scope, unless explicitly dropped by using DROP TABLE:
A local temporary table created in a stored procedure is dropped automatically when the stored procedure is finished. The table can be referenced by any nested stored procedures executed by the stored procedure that created the table. The table cannot be referenced by the process that called the stored procedure that created the table.
All other local temporary tables are dropped automatically at the end of the current session.
Global temporary tables are automatically dropped when the session that created the table ends and all other tasks have stopped referencing them. The association between a task and a table is maintained only for the life of a single Transact-SQL statement. This means that a global temporary table is dropped at the completion of the last Transact-SQL statement that was actively referencing the table when the creating session ended.
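A small sketch (procedure names invented) illustrating the nested-procedure scoping rule quoted above:

CREATE PROCEDURE dbo.inner_proc AS
BEGIN
    INSERT INTO #scoped VALUES (1);   -- works when called from outer_proc: nested procs see the caller's temp table
END;
GO
CREATE PROCEDURE dbo.outer_proc AS
BEGIN
    CREATE TABLE #scoped (a int);
    EXEC dbo.inner_proc;              -- the nested call can reference #scoped
END;
GO
EXEC dbo.outer_proc;                  -- succeeds
SELECT * FROM #scoped;                -- fails: the table was dropped when outer_proc finished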
I sometimes have a problem when running a script. The problem occurs when using an application (that I didn't write and therefore cannot debug) that launches the scripts. This app doesn't return the full error from SQL Server, just the error description, so I don't know exactly where the error comes from.
I get the error only when using this tool (it sends the queries directly to SQL Server, using a DAC component); if I run the query manually in Management Studio I don't get the error. (Moreover, this error occurs only on a particular database.)
My query is something like:
SELECT * INTO #TEMP_TABLE
FROM ANOTHER_TABLE
GO
--some other commands here
GO
INSERT INTO SOME_OTHER_TABLE(FIELD1,FIELD2)
SELECT FIELDA, FIELDB
FROM #TEMP_TABLE
GO
DROP TABLE #TEMP_TABLE
GO
The error I get is #TEMP_TABLE is not a valid object
So somehow I suspect that the DROP statement is executed before the INSERT statement.
But AFAIK, when a GO is there, the next batch is not executed until the previous one has completed.
Now I suspect that this is not true with temp tables... Or do you have other ideas?
Your problem is most likely caused either by the session ending before the DROP TABLE, causing SQL Server to automatically drop the table, or by the DROP TABLE being executed in a different session than the other code (that created and used the temporary table), so the table is not visible.
I am assuming that stored procedures are not involved here, because it looks like you are just executing batches, since local temporary tables are also dropped when a stored proc is exited.
There is a good description of local temporary table behavior in this article on Temporary Tables in SQL Server:
You get housekeeping with Local Temporary tables; they are
automatically dropped when they go out of scope, unless explicitly
dropped by using DROP TABLE. Their scope is more generous than a table
Variable so you don't have problems referencing them within batches or
in dynamic SQL. Local temporary tables are dropped automatically at
the end of the current session or procedure. Dropping it at the end of
the procedure that created it can cause head-scratching: a local
temporary table that is created within a stored procedure or session
is dropped when it is finished so it cannot be referenced by the
process that called the stored procedure that created the table. It
can, however, be referenced by any nested stored procedures executed
by the stored procedure that created the table. If the nested
procedure references a temporary table and two temporary tables with
the same name exist at that time, which table is the query resolved
against?
I would start up SQL Profiler and verify if your tool uses one connection to execute all batches, or if it disconnects/reconnects. Also it could be using a connection pool.
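If Profiler isn't handy, a rough alternative (a sketch; the program_name value depends on what the tool puts in its connection string) is to watch the active sessions while the tool runs and see whether the session_id stays the same across batches:

-- List current user sessions; run this repeatedly while the tool executes the script.
SELECT session_id, login_time, program_name, host_name
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
ORDER BY login_time DESC;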
Anyway, executing SQL batches from a file is so simple that you might develop your own tool very quickly and be better off.