Stored procedure - truncate table - sql-server

I've created a stored procedure to add data to a table. In mock fashion the steps are:
truncate original table
Select data into the original table
The query that selects data into the original table is quite long (it can take almost a minute to complete), which means that the table is then empty of data for over a minute.
To fix this empty table I changed the stored procedure to:
select data into #temp table
truncate Original table
insert * from #temp into Original
While the stored procedure was running, I did a select * on the original table and it was empty (refreshing, it stayed empty until the stored procedure completed).
Does the truncate happen at the beginning of the procedure no matter where it actually is in the code? If so is there something else I can do to control when the data is deleted?
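For reference, a minimal sketch of the revised pattern described above (dbo.Original, dbo.SourceData and the procedure name are placeholder names, not from the question) keeps the long-running query outside the transaction and wraps only the truncate-and-reload step, so the table is empty for as short a time as possible:
-- Sketch only: object names are illustrative
CREATE PROCEDURE dbo.RefreshOriginal
AS
BEGIN
    SET XACT_ABORT, NOCOUNT ON;

    -- The slow query runs first, while dbo.Original still serves the old data
    SELECT s.*
    INTO #staging
    FROM dbo.SourceData AS s;

    -- Only this short transaction touches dbo.Original
    BEGIN TRAN;
        TRUNCATE TABLE dbo.Original;   -- TRUNCATE is transactional and rolls back if the insert fails
        INSERT INTO dbo.Original
        SELECT * FROM #staging;
    COMMIT TRAN;
END;
Readers are still blocked for the duration of that final transaction (TRUNCATE takes a schema modification lock), but they should no longer observe an empty table.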

A very interesting method to move data into a table very quickly is to use partition switching.
Create two staging tables, myStaging1 and myStaging2, with the new data in myStaging2. They must be in the same DB and the same filegroup (so not temp tables or table variables), with the EXACT same columns, PKs, FKs and indexes.
Then run this:
SET XACT_ABORT, NOCOUNT ON; -- force immediate rollback if session is killed
BEGIN TRAN;
ALTER TABLE myTargetTable SWITCH TO myStaging1
WITH ( WAIT_AT_LOW_PRIORITY ( MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = BLOCKERS ));
-- not strictly necessary to use WAIT_AT_LOW_PRIORITY but better for blocking
-- use SELF instead of BLOCKERS to kill your own session
ALTER TABLE myStaging2 SWITCH TO myTargetTable
WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = BLOCKERS));
-- force blockers off immediately
COMMIT TRAN;
TRUNCATE TABLE myStaging1;
This is extremely fast, as it's just a metadata change.
You might ask: partitioning is only supported on Enterprise Edition (or Developer), so how does that help?
Switching non-partitioned tables between each other is still allowed even in Standard or Express Editions.
See this article by Kendra Little for further info on this technique.
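For illustration, the staging setup might look like this (a sketch; the column definitions are placeholders, and in practice they must mirror myTargetTable exactly, including constraints and indexes):
-- Both staging tables live in the same database and filegroup as myTargetTable
CREATE TABLE dbo.myStaging1 (ID int NOT NULL PRIMARY KEY, Payload nvarchar(100) NOT NULL);
CREATE TABLE dbo.myStaging2 (ID int NOT NULL PRIMARY KEY, Payload nvarchar(100) NOT NULL);

-- Load the new data into myStaging2 while myTargetTable keeps serving reads
INSERT INTO dbo.myStaging2 (ID, Payload)
SELECT ID, Payload
FROM dbo.SourceData;   -- placeholder source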

The stored procedure is called from code handling an HTTP GET, so I didn't want the table to be empty for over a minute during the refresh. When I asked the question I was testing with a select * from the table, but I have just now tested by hitting the endpoint in Postman and never received an empty response. So it appears that putting the truncate later in the stored procedure did work.

Related

Drop or not drop temporary tables in stored procedures

I have seen this question quite a few times, but I couldn't find an answer that satisfied me. Basically what people and books say is "Although temporary tables are deleted when they go out of scope, you should explicitly delete them when they are no longer needed to reduce resource requirements on the server".
It is quite clear to me that when you are working in Management Studio and creating temp tables, you hold some resources for those tables until you close the window or disconnect, so it makes sense that it is better to drop them.
But when you work with a procedure, if you want to clean up the tables you will most probably do it at the very end (I am not talking about the situation where you drop a table as soon as it is genuinely no longer needed within the procedure). So the workflow is something like this:
When you drop in SP:
Start of SP execution
Doing some stuff
Drop tables
End of execution
And, as far as I understand it, this is how it works when you do not drop:
Start of SP execution
Doing some stuff
End of execution
Drop tables
What's the difference here? I can only imagine that some resources are needed to identify the temporary tables. Any other thoughts?
UPDATE:
I ran simple test with 2 SP:
create procedure test as
begin
create table #temp (a int)
insert into #temp values (1);
drop table #temp;
end
and another one without the drop statement. I enabled user statistics and ran the tests:
declare @i int = 0;
while @i < 10000
begin
    exec test;
    SET @i = @i + 1;
end
This is what I got (trials 1-3 drop the table in the SP, trials 4-6 do not):
As the picture shows, all stats are the same or slightly lower when I do not drop the temporary table.
UPDATE2:
I ran the test a second time, now with 100k calls and with SET NOCOUNT ON added. These are the results:
The second run confirmed that if you do not drop the table in the SP, you actually save some user time, since the cleanup is done by an internal process outside of user time.
You can read more about it in this article by Paul White: Temporary Tables in Stored Procedures
CREATE and DROP, Don’t
I will talk about this in much more detail in my next post, but the
key point is that CREATE TABLE and DROP TABLE do not create and drop
temporary tables in a stored procedure, if the temporary object can be
cached. The temporary object is renamed to an internal form when DROP
TABLE is executed, and renamed back to the same user-visible name when
CREATE TABLE is encountered on the next execution. In addition, any
statistics that were auto-created on the temporary table are also
cached. This means that statistics from a previous execution remain
when the procedure is next called.
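A quick way to observe this caching behaviour is to look in tempdb's catalog after the procedure has run (a sketch; the internal hex-style name in the comment is only illustrative):
-- After a procedure that creates #temp has executed, the cached object remains
-- in tempdb under an internal name such as '#A1B2C3D4' until it is reused or evicted.
SELECT name, create_date
FROM tempdb.sys.objects
WHERE type = 'U'
  AND name LIKE '#%';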
Technically, a locally scoped temp table (one with a single hash mark (#) before its name) is automatically dropped once it goes out of scope and your SPID closes. There are some very odd cases where a temp table definition gets cached somewhere with no real way to remove it; usually that happens when a nested stored procedure call contains a temp table with the same name.
It's a good habit to drop your tables when you're done with them, but unless something unexpected happens, they should go out of scope anyway once the proc finishes.

ROWLOCK in Stored Procedure with Composite Key in SQL Server

EDITED: I have a table with a composite key which is used by multiple Windows services deployed on multiple servers.
Columns:
UserId (int) [CompositeKey],
CheckinTimestamp (bigint) [CompositeKey],
Status (tinyint)
Rows are inserted into this table continuously. I want my Windows service to select the top 10000 rows and do some processing while locking only those 10000 rows. I am using ROWLOCK for this in the stored procedure below:
ALTER PROCEDURE LockMonitoringSession
AS
BEGIN
    BEGIN TRANSACTION

    SELECT TOP 10000 * INTO #TempMonitoringSession FROM dbo.MonitoringSession WITH (ROWLOCK) WHERE [Status] = 0 ORDER BY UserId

    DECLARE @UserId INT
    DECLARE @CheckinTimestamp BIGINT
    DECLARE SessionCursor CURSOR FOR SELECT UserId, CheckinTimestamp FROM #TempMonitoringSession

    OPEN SessionCursor
    FETCH NEXT FROM SessionCursor INTO @UserId, @CheckinTimestamp
    WHILE @@FETCH_STATUS = 0
    BEGIN
        UPDATE dbo.MonitoringSession SET [Status] = 1 WHERE UserId = @UserId AND CheckinTimestamp = @CheckinTimestamp
        FETCH NEXT FROM SessionCursor INTO @UserId, @CheckinTimestamp
    END
    CLOSE SessionCursor
    DEALLOCATE SessionCursor

    SELECT * FROM #TempMonitoringSession
    DROP TABLE #TempMonitoringSession

    COMMIT TRANSACTION
END
But by doing so, dbo.MonitoringSession stays locked until the stored procedure ends. I am not sure what I am doing wrong here.
The only purpose of this stored procedure is to select and update 10000 recent rows (there is no single-column primary key to batch on) while ensuring the whole table is not locked, because multiple Windows services access it.
Thanks in advance for any help.
(not an answer, but too long for a comment)
The purpose description should explain why, and for what, you are updating the whole table. Your SP updates all rows with Status = 0 to Status = 1, so the moment one of your services decides to run it, all of those rows become non-relevant. I mean, the logical event that causes the status change has already occurred; you just need some time to physically record it in the database. So why would you want other services to read non-relevant rows? OK, perhaps you need them to read only the rows that are still available (not yet changed), but that is not clear either, because you are updating the whole table.
You can use the READPAST hint to skip locked rows, and you need row locks for that.
OK, but even if you process only the top N rows, updating those N rows with one statement would be much faster than looping through them; you are doing the same job, just manually (see the sketch after the link below).
Check out this example of combining UPDLOCK + READPAST to process the same queue with parallel processes: https://www.mssqltips.com/sqlservertip/1257/processing-data-queues-in-sql-server-with-readpast-and-updlock/
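A sketch of that single-statement approach applied to the question's table (the hint combination is the important part; each worker claims a disjoint batch and gets the claimed keys back from the OUTPUT clause):
WITH batch AS
(
    SELECT TOP (10000) UserId, CheckinTimestamp, [Status]
    FROM dbo.MonitoringSession WITH (UPDLOCK, READPAST, ROWLOCK)
    WHERE [Status] = 0
    ORDER BY UserId
)
UPDATE batch
SET [Status] = 1
OUTPUT inserted.UserId, inserted.CheckinTimestamp;   -- the rows this worker claimed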
A small hint: a STATIC, READ_ONLY, FORWARD_ONLY cursor would do the same thing as storing the rows in a temp table, because a static cursor materializes its result set in tempdb. Review the STATIC option:
https://msdn.microsoft.com/en-us/library/ms180169.aspx
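For illustration, that cursor variant might look like this (a sketch reusing the question's query; STATIC materializes the rows in tempdb, much like the explicit temp table does):
DECLARE SessionCursor CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR
    SELECT TOP (10000) UserId, CheckinTimestamp
    FROM dbo.MonitoringSession
    WHERE [Status] = 0
    ORDER BY UserId;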
Another suggestion is to consider RCSI (read committed snapshot isolation). This will certainly stop other services from being blocked, but it is a database-level option, so you'll have to test all your functionality. Most of it will work the same as before, but some scenarios need testing (concurrent transactions will no longer be blocked in situations where they were blocked before).
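For reference, enabling RCSI is a one-line, database-level change (a sketch; the database name is a placeholder, and ROLLBACK IMMEDIATE rolls back open transactions so the switch can complete):
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;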
Not clear to me:
what is the percentage of 10000 out of the total number of rows?
is there a clustered index, or is this a heap?
what is the actual execution plan for the select and the update?
are the concurrent transactions inserts or selects?
By the way, I found a similar question:
why the entire table is locked while "with (rowlock)" is used in an update statement

MS SQL Server - safe concurrent use of global temp table?

In MS SQL Server, I'm using a global temp table to store session related information passed by the client and then I use that information inside triggers.
Since the same global temp table can be used in different sessions, and it may or may not exist when I want to write into it (depending on whether all the previous sessions which used it are closed), I check whether it exists and create it if it does not, before writing into it.
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
    CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
        session_id smallint,
        login_time datetime,
        HstryUserName VDT_USERNAME,
        HstryTaskName VDT_TASKNAME
    );

MERGE ##VTT_CONTEXT_INFO_USER_TASK AS target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) AS source (session_id, HstryUserName, HstryTaskName)
ON (target.session_id = source.session_id)
WHEN MATCHED THEN
    UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
    INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
The problem is that between my check for the table's existence and the MERGE statement, SQL Server may drop the temp table if all the sessions which were using it happen to close in that exact instant (this actually happened in my tests).
Is there a best practice on how to avoid this kind of concurrency issues, that a table is not dropped between the check for its existence and its subsequent use?
The notions of "global temporary table" and "trigger" just do not click together. Tables are permanent data stores, as are their attributes -- including triggers. Temporary tables are dropped when the server is restarted. Why would anyone design a system where a permanent block of code (a trigger) depends on a temporary shared storage mechanism? It seems like a recipe for failure.
Instead of a global temporary table, use a real table. If you like, put a helpful prefix such as temp_ in front of the name. If the table is being shared by databases, then put it in a database where all code has access.
Create the table once and leave it there (deleting the rows is fine) so the trigger code can access it.
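A minimal sketch of that permanent table (names and types are illustrative; the question's VDT_USERNAME / VDT_TASKNAME user-defined types could be reused instead of nvarchar):
CREATE TABLE dbo.temp_ContextInfoUserTask
(
    session_id    smallint      NOT NULL PRIMARY KEY,
    login_time    datetime      NOT NULL,
    HstryUserName nvarchar(128) NOT NULL,
    HstryTaskName nvarchar(128) NOT NULL
);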
I'll start by saying that, in the long term, I will follow Gordon's advice, i.e. I will take the necessary steps to introduce a normal table in the database to store the client application information which needs to be accessible in the triggers.
But since this was not really possible right now because of time constraints (it takes weeks to get the necessary formal approvals for a new normal table), I came up with a solution for preventing SQL Server from dropping the global temp table between the check for its existence and the MERGE statement.
There is some information out there about when a global temp table is dropped by SQL Server; my personal tests showed that SQL Server drops a global temp table the moment the session which created it is closed and any transactions started in other sessions which changed data in that table have finished.
My solution was to fake data changes on the global temp table even before I check for its existence. If the table exists at that moment, SQL Server will then know that it needs to keep it until the current transaction finishes, and it cannot be dropped anymore after the check for its existence. The code looks now like this (properly commented, since it is kind of a hack):
-- Faking a delete on the table ensures that SQL Server will keep the table until the end of the transaction
-- Since ##VTT_CONTEXT_INFO_USER_TASK may actually not exist, we need to fake the delete inside TRY .. CATCH
-- FUTURE 2016, Feb 03: A cleaner solution would use a real table instead of a global temp table.
BEGIN TRY
-- Because schema errors are checked during compile, they cannot be caught using TRY, this can be done by wrapping the query in sp_executesql
DECLARE @QueryText NVARCHAR(100) = 'DELETE ##VTT_CONTEXT_INFO_USER_TASK WHERE 0 = 1'
EXEC sp_executesql @QueryText
END TRY
BEGIN CATCH
-- nothing to do here (see comment above)
END CATCH
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
    CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
        session_id smallint,
        login_time datetime,
        HstryUserName VDT_USERNAME,
        HstryTaskName VDT_TASKNAME
    );

MERGE ##VTT_CONTEXT_INFO_USER_TASK AS target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) AS source (session_id, HstryUserName, HstryTaskName)
ON (target.session_id = source.session_id)
WHEN MATCHED THEN
    UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
    INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
Although I would call it a "use at your own risk" solution, it does prevent the use of the global temp table in other sessions from affecting its use in the current one, which was the concern that made me start this thread.
Thanks all for your time! (from text formatting edits to replies)

What is the fastest way to clear a SQL table?

I have a table with about 300,000 rows and 30 columns. How can I quickly clear it out? If I do a DELETE FROM MyTable query, it takes a long time to run. I'm trying the following stored procedure to basically make a copy of the table with no data, drop the original table, and rename the new table as the original:
USE [myDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[ClearTheTable]
AS
BEGIN
SET NOCOUNT ON;
SELECT * INTO tempMyTable FROM MyTable WHERE 1 = 0;
DROP TABLE MyTable
EXEC sp_rename tempMyTable, MyTable;
END
This took nearly 7 minutes to run. Is there a faster way to do it? I don't need any logging, rollback or anything of that nature.
If it matters, I'm calling the stored procedure from a C# app. I guess I could write some code to recreate the table from scratch after doing a DROP TABLE, but I didn't want to have to recompile the app any time a column is added or changed.
Thanks!
EDIT
Here's my updated stored procedure:
USE [myDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[ClearTheTable]
AS
BEGIN
SET NOCOUNT ON;
ALTER DATABASE myDatabase
SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE
TRUNCATE TABLE MyTable
ALTER DATABASE myDatabase
SET MULTI_USER
END
The best way to clear a table is with TRUNCATE.
Since you are creating and dropping the table, I'll assume you have no constraints.
TRUNCATE TABLE <target table>
Some advantages:
Less transaction log space is used.
The DELETE statement removes rows one at a time and records an entry
in the transaction log for each deleted row. TRUNCATE TABLE removes
the data by deallocating the data pages used to store the table data
and records only the page deallocations in the transaction log.
Fewer locks are typically used.
When the DELETE statement is executed using a row lock, each row in
the table is locked for deletion. TRUNCATE TABLE always locks the
table (including a schema (SCH-M) lock) and page but not each row.
Without exception, zero pages are left in the table.
After a DELETE statement is executed, the table can still contain
empty pages. For example, empty pages in a heap cannot be deallocated
without at least an exclusive (LCK_M_X) table lock. If the delete
operation does not use a table lock, the table (heap) will contain
many empty pages. For indexes, the delete operation can leave empty
pages behind, although these pages will be deallocated quickly by a
background cleanup process.

TSQL logging inside transaction

I'm trying to write to a log file inside a transaction so that the log survives even if the transaction is rolled back.
--start code
begin tran
insert [something] into dbo.logtable
[[main code here]]
rollback
commit
-- end code
You could say "just do the log before the transaction starts", but that is not as easy, because the transaction starts before this S-Proc is run (i.e. the code is part of a bigger transaction).
So, in short, is there a way to write a special statement inside a transaction that is not part of the transaction? I hope my question makes sense.
Use a table variable (declared with @, not a #temp table) to hold the log info. Table variables survive a transaction rollback.
See this article.
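A minimal sketch of the idea (dbo.logtable is the table from the question; the message column name and text are assumptions):
DECLARE @log TABLE (msg nvarchar(4000) NOT NULL);

BEGIN TRAN;
    INSERT INTO @log (msg) VALUES (N'about to run the main code');
    -- [[main code here]]
ROLLBACK TRAN;

-- The table variable keeps its rows after the rollback,
-- so they can now be copied into the permanent log table.
INSERT INTO dbo.logtable (message)
SELECT msg FROM @log;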
I do this one of two ways, depending on my needs at the time. Both involve using variables, which retain their values following a rollback.
1) Declare a @Log varchar(max) variable and use SET @Log = ISNULL(@Log + '; ', '') + 'Your new log info here'. Keep appending to it as you go through the transaction, then insert it into the log table after the commit or the rollback as necessary. I usually only insert the @Log value into the real log table when there is an error (in the CATCH block) or when I'm trying to debug a problem.
2) Declare a @LogTable table (RowID int identity(1,1) primary key, RowValue varchar(5000)) variable and insert into it as you progress through the transaction. I like using the OUTPUT clause to capture the actual IDs (and other columns with messages, like 'DELETE item 1234') of rows used in the transaction into this table. I then insert its contents into the actual log table after the commit or the rollback as necessary; a sketch of this approach follows.
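A sketch of approach 2 (dbo.SomeItems, ItemId and dbo.RealLogTable are hypothetical names used only for this example):
DECLARE @LogTable table (RowID int identity(1,1) primary key, RowValue varchar(5000));

BEGIN TRAN;
    DELETE FROM dbo.SomeItems
    OUTPUT 'DELETE item ' + CAST(deleted.ItemId AS varchar(20))
    INTO @LogTable (RowValue)
    WHERE ItemId = 1234;
ROLLBACK TRAN;   -- or COMMIT; either way @LogTable keeps its rows

INSERT INTO dbo.RealLogTable (Message)
SELECT RowValue FROM @LogTable;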
If the parent transaction rolls back, the logging data will roll back as well - SQL Server does not support proper nested transactions. One possibility is to use a CLR stored procedure to do the logging. This can open its own connection to the database outside the transaction and insert and commit the log data.
Log output to a table, use a time delay, and use WITH(NOLOCK) to see it.
It looks like @arvid wanted to debug the operation of the stored procedure, and is able to alter the stored proc.
The C# code starts a transaction, then calls an s-proc, and at the end it commits or rolls back the transaction. I only have easy access to the s-proc
I had a similar situation. So I modified the stored procedure to log my desired output to a table. Then I put a time delay at the end of the stored procedure
WAITFOR DELAY '00:00:12'; -- 12 second delay, adjust as desired
and in another SSMS window, quickly read the table at the READ UNCOMMITTED isolation level (the WITH (NOLOCK) hint below):
SELECT * FROM dbo.NicksLogTable WITH(NOLOCK);
It's not the solution you want if you need a permanent record of the logs (edit: including where transactions get rolled back), but it suits my purpose to be able to debug the code in a temporary fashion, especially when linked servers, xp_cmdshell, and creating file tables are all disabled :-(
Apologies for bumping a 12-year-old thread, but Microsoft deserves an equal caning for not implementing nested transactions or autonomous transactions in that time period.
If you want to emulate nested-transaction behaviour you can use a named transaction together with a savepoint:
begin transaction a
create table #a (i int)
select * from #a

save transaction b          -- savepoint
create table #b (i int)
select * from #a
select * from #b

rollback transaction b      -- rolls back to the savepoint: #b is gone, #a survives
select * from #a

rollback transaction a      -- rolls back the whole outer transaction
In SQL Server, if you want a 'sub-transaction' you should use SAVE TRANSACTION xxxx, which works like an Oracle savepoint.
