Another developer and I are discussing which type of table would be more appropriate for our task. It's basically going to be a cache that we truncate at the end of the day. Personally, I don't see any reason to use anything other than a normal table for this, but he wants to use a global temp table.
Are there any advantages to one or the other?
Use a normal table in tempdb if this is just transient data that you can afford to lose on service restart, or a table in a user database if the data is not that transient.
tempdb is slightly more efficient in terms of logging requirements.
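A minimal sketch of the first option, assuming a hypothetical cache table named DailyCache with made-up columns (none of these names come from the question). Since tempdb is rebuilt on service restart, the existence check would have to run again after each restart:

```sql
-- Hypothetical cache table; the name and columns are illustrative only.
IF OBJECT_ID(N'tempdb.dbo.DailyCache', N'U') IS NULL
    CREATE TABLE tempdb.dbo.DailyCache
    (
        cache_key   varchar(100) NOT NULL PRIMARY KEY,
        cache_value varchar(400) NULL
    );

-- End-of-day cleanup: empty the cache but keep the table definition.
TRUNCATE TABLE tempdb.dbo.DailyCache;
```

Unlike a ##table, this table is not tied to the lifetime of the connection that created it.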
Global temp tables get dropped once the connection that created the table is closed.
Edit: Following @cyberkiwi's edit. BOL does definitely explicitly say:
"Global temporary tables are visible to any user and any connection after they are created, and are deleted when all users that are referencing the table disconnect from the instance of SQL Server."
In my test I wasn't able to reproduce this behaviour either, though.
Connection 1
CREATE TABLE ##T (i int)
INSERT INTO ##T values (1)
SET CONTEXT_INFO 0x01
Connection 2
INSERT INTO ##T VALUES(4)
WAITFOR DELAY '00:01'
INSERT INTO ##T VALUES(5)
Connection 3
SELECT OBJECT_ID('tempdb..##T')
declare @killspid varchar(10) = (select 'kill ' + cast(spid as varchar(5)) from sysprocesses where context_info=0x01)
exec (@killspid)
SELECT OBJECT_ID('tempdb..##T') /*NULL - But 2 is still running, let alone disconnected!*/
Global temp table
-ve: As soon as the connection that created the table goes out of scope, it takes the table with it. This is damaging if you use connection pooling, which can swap connections constantly and possibly reset them
-ve: You need to keep checking to see if the table already exists (after restart) and create it if not
+ve: Simple logging in tempdb reduces I/O and CPU activity
Normal table
+ve: Normal logging keeps your cache with your main db. If your "cache" is maintained but is still mission critical, this keeps it consistent together with the db
-ve: Following from the above, more logging
+ve: The table is always around, and for all connections
If the cache is something like a quick lookup summary for business-critical data, even if it is reset/truncated at the end of the day, I would prefer to keep it as a normal table in the db proper.
Related
I have seen this question quite a few times, but I couldn't find an answer that satisfied me. Basically, what people and books say is "Although temporary tables are deleted when they go out of scope, you should explicitly delete them when they are no longer needed to reduce resource requirements on the server".
It is quite clear to me that when you are working in Management Studio and creating tables, those tables use some resources until you close your window or disconnect, so it is logical that it is better to drop them.
But when you work with a procedure and want to clean up tables, you will most probably do that at the very end of it (I am not talking about the situation where you drop the table as soon as the procedure no longer needs it). So the workflow is something like this:
When you drop in SP:
Start of SP execution
Doing some stuff
Drop tables
End of execution
And, as far as I understand, this is how it works when you do not drop:
Start of SP execution
Doing some stuff
End of execution
Drop tables
What's the difference here? I can only imagine that some resources are needed to identify the temporary tables. Any other thoughts?
UPDATE:
I ran simple test with 2 SP:
create procedure test as
begin
create table #temp (a int)
insert into #temp values (1);
drop table #temp;
end
and another one without drop statements. I've enabled user statistics and ran the tests:
declare @i int = 0;
while @i < 10000
begin
exec test;
SET @i = @i + 1;
end
This is what I got (trials 1-3 drop the table in the SP, trials 4-6 do not):
The picture shows that all stats are the same or decrease a bit when I do not drop the temporary table.
UPDATE2:
I ran this test 2nd time but now with 100k calls and also added SET NOCOUNT ON. These are the results:
The 2nd run confirmed that if you do not drop the table in the SP, you actually save some user time, because the drop is done by some other internal process outside of the user time.
You can read more about it in Paul White's article: Temporary Tables in Stored Procedures
CREATE and DROP, Don’t
I will talk about this in much more detail in my next post, but the key point is that CREATE TABLE and DROP TABLE do not create and drop temporary tables in a stored procedure, if the temporary object can be cached. The temporary object is renamed to an internal form when DROP TABLE is executed, and renamed back to the same user-visible name when CREATE TABLE is encountered on the next execution. In addition, any statistics that were auto-created on the temporary table are also cached. This means that statistics from a previous execution remain when the procedure is next called.
Technically, a locally scoped temp table (one with a single hashtag before it) will automatically drop out of scope after your SPID is closed. There are some very odd cases where you get a temp table definition cached somewhere and then no real way to remove it. Usually that happens when you have a stored procedure call which is nested and contains a temp table by the same name.
It's a good habit to drop your tables when you're done with them, but unless something unexpected happens, they should be de-scoped anyway once the proc finishes.
In MS SQL Server, I'm using a global temp table to store session related information passed by the client and then I use that information inside triggers.
Since the same global temp table can be used in different sessions and it may or may not exist when I want to write into it (depending on whether all the previous sessions which used it are closed), I check for the global temp table's existence and create it if necessary before I write into it.
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
session_id smallint,
login_time datetime,
HstryUserName VDT_USERNAME,
HstryTaskName VDT_TASKNAME,
)
MERGE ##VTT_CONTEXT_INFO_USER_TASK As target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) as source (session_id, HstryUserName, HstryTaskName)
ON (target.session_id = source.session_id)
WHEN MATCHED THEN
UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
The problem is that between my check for the table's existence and the MERGE statement, SQL Server may drop the temp table if all the sessions which were using it happen to close at that exact instant (this actually happened in my tests).
Is there a best practice for avoiding this kind of concurrency issue, so that a table is not dropped between the check for its existence and its subsequent use?
The notion of "global temporary table" and "trigger" just do not click. Tables are permanent data stores, as are their attributes -- including triggers. Temporary tables are dropped when the server is re-started. Why would anyone design a system where a permanent block of code (trigger) depends on a temporary shared storage mechanism? It seems like a recipe for failure.
Instead of a global temporary table, use a real table. If you like, put a helpful prefix such as temp_ in front of the name. If the table is being shared by databases, then put it in a database where all code has access.
Create the table once and leave it there (deleting the rows is fine) so the trigger code can access it.
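A rough sketch of what that could look like, assuming a hypothetical table name (temp_context_info_user_task) and using nvarchar(128) as a stand-in for the VDT_USERNAME and VDT_TASKNAME types, whose definitions are not shown in the question:

```sql
-- Hypothetical permanent replacement for the global temp table.
-- nvarchar(128) is an assumption standing in for VDT_USERNAME / VDT_TASKNAME.
CREATE TABLE dbo.temp_context_info_user_task
(
    session_id    smallint      NOT NULL PRIMARY KEY,
    login_time    datetime      NOT NULL,
    HstryUserName nvarchar(128) NULL,
    HstryTaskName nvarchar(128) NULL
);
GO

-- Inside a trigger, the row for the current session can then be read
-- without any existence check on the table itself.
SELECT HstryUserName, HstryTaskName
FROM dbo.temp_context_info_user_task
WHERE session_id = @@SPID;
```

Because session ids are reused over time, stale rows would still need to be cleaned up or told apart, for example by comparing login_time.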
I'll start by saying that, in the long term, I will follow Gordon's advice, i.e. I will take the necessary steps to introduce a normal table in the database to store client application information which needs to be accessible in the triggers.
But since this was not really possible now because of time constraints (it takes weeks to get the necessary formal approvals for a new normal table), I came up with a solution for preventing SQL Server from dropping the global temp table between the check for its existence and the MERGE statement.
There is some information out there about when a global temp table is dropped by SQL Server; my personal tests showed that SQL Server drops a global temp table the moment the session which created it is closed and any other transactions started in other sessions which changed data in that table are finished.
My solution was to fake data changes on the global temp table even before I check for its existence. If the table exists at that moment, SQL Server will then know that it needs to keep it until the current transaction finishes, and it cannot be dropped anymore after the check for its existence. The code looks now like this (properly commented, since it is kind of a hack):
-- Faking a delete on the table ensures that SQL Server will keep the table until the end of the transaction
-- Since ##VTT_CONTEXT_INFO_USER_TASK may actually not exist, we need to fake the delete inside TRY .. CATCH
-- FUTURE 2016, Feb 03: A cleaner solution would use a real table instead of a global temp table.
BEGIN TRY
-- Because schema errors are checked during compile, they cannot be caught using TRY, this can be done by wrapping the query in sp_executesql
DECLARE @QueryText NVARCHAR(100) = 'DELETE ##VTT_CONTEXT_INFO_USER_TASK WHERE 0 = 1'
EXEC sp_executesql @QueryText
END TRY
BEGIN CATCH
-- nothing to do here (see comment above)
END CATCH
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
session_id smallint,
login_time datetime,
HstryUserName VDT_USERNAME,
HstryTaskName VDT_TASKNAME,
)
MERGE ##VTT_CONTEXT_INFO_USER_TASK As target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) as source (session_id, HstryUserName, HstryTaskName)
ON (target.session_id = source.session_id)
WHEN MATCHED THEN
UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
Although I would call it a "use it at your own risk" solution, it does prevent the use of the global temp table in other sessions from affecting its use in the current one, which was the concern that made me start this thread.
Thanks all for your time! (from text formatting edits to replies)
We're currently working on the following process, whose goal is to move data between 2 sets of database servers while maintaining FKs and handling the fact that the destination tables already have rows with overlapping identity column values:
1. Extract a set of rows from a "root" table and all of its children tables' FK associated data n-levels deep along with related rows that may reside in other databases on the same instance from the source database server.
2. Place that extracted data set into a set of staging tables on the destination database server.
3. Rekey the data in the staging tables by reserving a block of identities for the destination tables and update all related child staging tables (each of these staging tables will have the same schema as the source/destination table with the addition of a "lNewIdentityID" column).
4. Insert the data with its new identity into the destination tables in the correct order (option SET IDENTITY_INSERT 'desttable' ON will be used, obviously).
I'm struggling with the block reservation portion of this process (#3). Our system is pretty much a 24 hour system except for a short weekly maintenance window. Management needs this process to NOT have to wait each week for the maintenance window to migrate data between servers. That being said, I may have 100 insert transactions competing with our migration process while it is on #3. Below is my wag at an attempt to reserve the block of identities, but I'm worried that between "SET @newIdent..." and "DBCC CHECKIDENT..." an insert transaction will complete and the migration process won't have a "clean" block of identities in a known range that it can use to rekey the staging data.
I essentially need to lock the table, get the current identity, increase the identity, and then unlock the table. I don't know how to do that in T-SQL and am looking for ideas. Thank you.
IF EXISTS (SELECT TOP 1 1 FROM sys.procedures WHERE [name]='DataMigration_ReserveBlock')
DROP PROC DataMigration_ReserveBlock
GO
CREATE PROC DataMigration_ReserveBlock (
@tableName varchar(100),
@blockSize int
)
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
DECLARE @newIdent bigint;
SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
SELECT @newIdent AS NewIdentity;
END
GO
DataMigration_ReserveBlock 'tblAddress', 1234
You could wrap it in a transaction
BEGIN TRANSACTION
...
COMMIT
It should be fast enough to not cause problems with your other insert processes, though it would be a good idea to include try / catch logic so you can roll back if problems do occur.
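A sketch of how the transaction and try / catch could fit together, under several assumptions of mine: the table name is hard-coded to dbo.tblAddress (taken from the question's example call), an exclusive table lock via the TABLOCKX and HOLDLOCK hints is used to hold off competing inserts, and THROW is available (SQL Server 2012 or later). Whether the reseed itself is undone by a rollback is something to verify on your version rather than assume.

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Take an exclusive table lock and hold it until COMMIT/ROLLBACK,
    -- so no other session can insert while the block is reserved.
    DECLARE @dummy int;
    SELECT TOP (1) @dummy = 1
    FROM dbo.tblAddress WITH (TABLOCKX, HOLDLOCK);

    DECLARE @blockSize int = 1234;
    DECLARE @newIdent bigint = IDENT_CURRENT('dbo.tblAddress') + @blockSize;

    -- Move the identity seed past the reserved block.
    DBCC CHECKIDENT ('dbo.tblAddress', RESEED, @newIdent);

    COMMIT TRANSACTION;

    SELECT @newIdent AS NewIdentity;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;  -- re-raise the original error to the caller
END CATCH;
```

The reserved range then runs from the old current identity plus one up to NewIdentity; the lock is only held for the few statements inside the transaction.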
Question 1: I am using a global temp table in SQL Server 2008, but once my connection is closed the temp table is dropped. Is there any way to disable the auto drop?
Question 2: If two connections are accessing the same global temp table and another connection is trying to delete that global temp table, does SQL Server handle this synchronization properly?
You can create your global temp tables in a stored procedure and mark it with the startup option.
SQL Server maintains a reference count greater than zero for all global temporary tables created within startup procedures.
Some example code:
CREATE PROC dbo.CreateGlobalTempTables
AS
CREATE TABLE ##my_temp_table
(
fld1 INT NOT NULL PRIMARY KEY,
fld2 INT NULL
);
GO
EXEC dbo.sp_procoption 'dbo.CreateGlobalTempTables', 'startup', 'true';
The global temporary table will be created automatically at startup and persist until someone explicitly drops it.
If you need a table to persist beyond the death of the connection that created it, you should just create a regular table instead of a temporary one. It can still be created in tempdb directly (getting you the benefits of simple logging and auto destruction on server restart), but its name wouldn't be prefixed by ##.
DROP TABLE is a transactional statement that will block if there are any active connections using that table (with locks).
When the connection that created the ##GlobalTempTable ends, the table will be dropped, unless there is a lock against it.
You could run something like this from the other process to keep the table from being dropped:
BEGIN TRANSACTION
SELECT TOP 1 * FROM ##GlobalTempTable WITH (UPDLOCK, HOLDLOCK)
...COMMIT/ROLLBACK
However, when the transaction ends, the table will be dropped. If you can't use a transaction like this, then you should use a permanent table using the Process-Keyed Table method.
The following (sanitized) code sometimes produces these errors:
Cannot drop the table 'database.dbo.Table', because it does not exist or you do not have permission.
There is already an object named 'Table' in the database.
begin transaction
if exists (select 1 from database.Sys.Tables where name ='Table')
begin drop table database.dbo.Table end
Select top 3000 *
into database.dbo.Table
from OtherTable
commit
select * from database.dbo.Table
The code can be run multiple times simultaneously. Anyone know why it breaks?
Can I ask why you're doing this in the first place? You should really consider using temporary tables or come up with another solution.
I'm not positive that DDL statements behave the same way in transactions as DML statements, and I have seen a blog post describing weird behavior with creating stored procedures within a DDL.
Aside from that, you might want to verify your transaction isolation level and set it to Serializable.
Edit
Based on a quick test, I ran the same sql in two different connections, and when I created the table but didn't commit the transaction, the second transaction blocked. So it looks like this should work. I would still caution against this type of design.
In what part of the code are you preventing multiple accesses to this resource?
begin transaction
if exists (select 1 from database.Sys.Tables where name ='Table')
begin drop table database.dbo.Table end
Select top 3000 *
into database.dbo.Table
from OtherTable
commit
Begin transaction isn't doing it. It's only setting up for a commit/rollback scenario on any rows added to tables.
The (if exists, drop) is a race condition, along with the re-creation of the table with (select..into). Multiple people dropping into that code all at once will most certainly cause all kinds of errors. Some creating tables that others have just destroyed, others dropping tables that don't exist anymore, and others dropping tables that some are busy inserting into. UGH!
Consider the temp table suggestions of others, or using an application lock to block others from entering this code at all if the critical resource is busy. Transactions on drop/create are not what you want.
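A minimal sketch of the application lock idea, assuming a made-up lock resource name (N'RebuildTable') and a 10 second timeout; the table names are the sanitized ones from the question, placed in the dbo schema of the current database:

```sql
BEGIN TRANSACTION;

DECLARE @rc int;

-- Serialize entry into the drop/re-create section.
-- The lock is owned by the transaction and released at COMMIT/ROLLBACK.
EXEC @rc = sp_getapplock
     @Resource    = N'RebuildTable',   -- hypothetical lock name
     @LockMode    = N'Exclusive',
     @LockOwner   = N'Transaction',
     @LockTimeout = 10000;             -- milliseconds

IF @rc < 0
BEGIN
    -- Negative return codes mean the lock was not granted.
    ROLLBACK TRANSACTION;
    RAISERROR(N'Could not acquire the application lock.', 16, 1);
    RETURN;
END;

IF OBJECT_ID(N'dbo.[Table]', N'U') IS NOT NULL
    DROP TABLE dbo.[Table];

SELECT TOP (3000) *
INTO dbo.[Table]
FROM dbo.OtherTable;

COMMIT TRANSACTION;
```

Only one session at a time gets past sp_getapplock, so the existence check, the drop, and the select..into run as one unit with respect to other callers of this code.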
If you are just using this table during this process, I would suggest using a temp table or, depending on how much data there is, a ram table. I use ram tables frequently to avoid any transaction costs and save on disk activity.