What is the fastest way to clear a SQL table? - sql-server

I have a table with about 300,000 rows and 30 columns. How can I quickly clear it out? If I do a DELETE FROM MyTable query, it takes a long time to run. I'm trying the following stored procedure to basically make a copy of the table with no data, drop the original table, and rename the new table as the original:
USE [myDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[ClearTheTable]
AS
BEGIN
    SET NOCOUNT ON;
    SELECT * INTO tempMyTable FROM MyTable WHERE 1 = 0;
    DROP TABLE MyTable;
    EXEC sp_rename 'tempMyTable', 'MyTable';
END
This took nearly 7 minutes to run. Is there a faster way to do it? I don't need any logging, rollback or anything of that nature.
If it matters, I'm calling the stored procedure from a C# app. I guess I could write some code to recreate the table from scratch after doing a DROP TABLE, but I didn't want to have to recompile the app any time a column is added or changed.
Thanks!
EDIT
Here's my updated stored procedure:
USE [myDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[ClearTheTable]
AS
BEGIN
    SET NOCOUNT ON;
    ALTER DATABASE myDatabase
        SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;
    TRUNCATE TABLE MyTable;
    ALTER DATABASE myDatabase
        SET MULTI_USER;
END
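For reference, a quick smoke test from SSMS, assuming the default dbo schema; note that RESTRICTED_USER applies to the whole database, not just this table, so other non-privileged sessions are rolled back while the proc runs:
-- Run the procedure, then confirm the table is empty.
EXEC dbo.ClearTheTable;

SELECT COUNT(*) AS RemainingRows FROM dbo.MyTable;   -- expect 0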

Best way to clear a table is with TRUNCATE.
Since you are creating and dropping the table, I'll assume you have no constraints.
TRUNCATE TABLE <target table>
Some advantages:
Less transaction log space is used.
The DELETE statement removes rows one at a time and records an entry
in the transaction log for each deleted row. TRUNCATE TABLE removes
the data by deallocating the data pages used to store the table data
and records only the page deallocations in the transaction log.
Fewer locks are typically used.
When the DELETE statement is executed using a row lock, each row in
the table is locked for deletion. TRUNCATE TABLE always locks the
table (including a schema (SCH-M) lock) and page but not each row.
Without exception, zero pages are left in the table.
After a DELETE statement is executed, the table can still contain
empty pages. For example, empty pages in a heap cannot be deallocated
without at least an exclusive (LCK_M_X) table lock. If the delete
operation does not use a table lock, the table (heap) will contain
many empty pages. For indexes, the delete operation can leave empty
pages behind, although these pages will be deallocated quickly by a
background cleanup process.
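One caveat worth illustrating: TRUNCATE TABLE is refused if the table is referenced by a foreign key. A minimal sketch of the usual workaround, with a hypothetical child table and constraint name:
-- Hypothetical names: dbo.ChildTable has FK_ChildTable_MyTable referencing dbo.MyTable.
-- Drop the constraint, truncate, then recreate it (clear the child rows first,
-- or the WITH CHECK re-add will fail).
ALTER TABLE dbo.ChildTable DROP CONSTRAINT FK_ChildTable_MyTable;

TRUNCATE TABLE dbo.MyTable;

ALTER TABLE dbo.ChildTable WITH CHECK
    ADD CONSTRAINT FK_ChildTable_MyTable FOREIGN KEY (MyTableID)
    REFERENCES dbo.MyTable (MyTableID);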

Related

Stored procedure - truncate table

I've created a stored procedure to add data to a table. In mock fashion the steps are:
truncate original table
Select data into the original table
The query that selects data into the original table is quite long (it can take almost a minute to complete), which means that the table is then empty of data for over a minute.
To fix this empty table I changed the stored procedure to:
select data into #temp table
truncate Original table
insert * from #temp into Original
While the stored procedure was running, I did a select * on the original table and it was empty (refreshing, it stayed empty until the stored procedure completed).
Does the truncate happen at the beginning of the procedure no matter where it actually is in the code? If so is there something else I can do to control when the data is deleted?
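A minimal sketch of that revised flow, with hypothetical source and column names standing in for the minute-long query; wrapping the truncate and reload in one transaction keeps the window where the table looks empty to the insert itself:
-- Hypothetical names; the SELECT stands in for the long-running query.
SELECT s.Col1, s.Col2
INTO #staged
FROM dbo.SourceTable AS s;              -- slow part, runs before the swap

BEGIN TRAN;
    TRUNCATE TABLE dbo.OriginalTable;   -- empty only inside this short transaction
    INSERT INTO dbo.OriginalTable (Col1, Col2)
    SELECT Col1, Col2 FROM #staged;
COMMIT TRAN;

DROP TABLE #staged;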
A very interesting method to move data into a table very quickly is to use partition switching.
Create two staging tables, myStaging1 and myStaging2, with the new data in myStaging2. They must be in the same DB and the same filegroup (so not temp tables or table variables), with the EXACT same columns, PKs, FKs and indexes.
Then run this:
SET XACT_ABORT, NOCOUNT ON; -- force immediate rollback if session is killed
BEGIN TRAN;
ALTER TABLE myTargetTable SWITCH TO myStaging1
WITH ( WAIT_AT_LOW_PRIORITY ( MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = BLOCKERS ));
-- not strictly necessary to use WAIT_AT_LOW_PRIORITY but better for blocking
-- use SELF instead of BLOCKERS to kill your own session
ALTER TABLE myStaging2 SWITCH TO myTargetTable
WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = BLOCKERS));
-- force blockers off immediately
COMMIT TRAN;
TRUNCATE TABLE myStaging1;
This is extremely fast, as it's just a metadata change.
You might ask: partitioning is only supported on Enterprise Edition (or Developer), so how does that help?
Switching non-partitioned tables between each other is still allowed even in Standard or Express Editions.
See this article by Kendra Little for further info on this technique.
The sp is being called by code in an HTTP Get, so I didn't want the table to be empty for over a minute during refresh. When I asked the question I was using a select * from the table to test, but just now I tested by hitting the endpoint in postman and I never received an empty response. So it appears that putting the truncate later in the sp did work.

Drop or not drop temporary tables in stored procedures

I saw this question quite a many times but I couldn't get the answer that would satisfy me. Basically what people and books say is "Although temporary tables are deleted when they go out of scope, you should explicitly delete them when they are no longer needed to reduce resource requirements on the server".
It is quite clear to me that when you are working in Management Studio and creating tables, then until you close your window or disconnect you will use some resources for those tables, so it is logical that it is better to drop them.
But when you work with a procedure, if you want to clean up tables you will most probably do it at the very end (I am not talking about the situation where you drop the table as soon as you no longer need it within the procedure). So the workflow is something like this:
When you drop in SP:
Start of SP execution
Doing some stuff
Drop tables
End of execution
And as far as I understand it, this is what happens when you do not drop:
Start of SP execution
Doing some stuff
End of execution
Drop tables
What's the difference here? I can only imagine that some resources are needed to identify the temporary tables. Any other thoughts?
UPDATE:
I ran simple test with 2 SP:
create procedure test as
begin
    create table #temp (a int)
    insert into #temp values (1);
    drop table #temp;
end
and another one without drop statements. I've enabled user statistics and ran the tests:
declare @i int = 0;
while @i < 10000
begin
    exec test;
    SET @i = @i + 1;
end
That's what I've got (trials 1-3 drop the table in the SP, trials 4-6 do not):
As the picture shows, all stats are the same or slightly lower when I do not drop the temporary table.
UPDATE2:
I ran this test a 2nd time, but now with 100k calls and with SET NOCOUNT ON added. These are the results:
The 2nd run confirmed that if you do not drop the table in the SP, you actually save some user time, because the cleanup is handled by an internal process outside of the user's time.
You can read more about it in this Paul White article: Temporary Tables in Stored Procedures
CREATE and DROP, Don’t
I will talk about this in much more detail in my next post, but the
key point is that CREATE TABLE and DROP TABLE do not create and drop
temporary tables in a stored procedure, if the temporary object can be
cached. The temporary object is renamed to an internal form when DROP
TABLE is executed, and renamed back to the same user-visible name when
CREATE TABLE is encountered on the next execution. In addition, any
statistics that were auto-created on the temporary table are also
cached. This means that statistics from a previous execution remain
when the procedure is next called.
Technically, a locally scoped temp table (one with a single hashtag before it) will automatically drop out of scope after your SPID is closed. There are some very odd cases where you get a temp table definition cached somewhere and then no real way to remove it. Usually that happens when you have a stored procedure call which is nested and contains a temp table by the same name.
It's good habit to get into dropping your tables when you're done with them but unless something unexpected happens, they should be de-scoped anyway once the proc finishes.
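To make the nested same-name case concrete, here is a small sketch with hypothetical procedure names; each scope gets its own #results, which is exactly the situation where the caching oddities mentioned above tend to appear:
CREATE PROCEDURE dbo.InnerProc
AS
BEGIN
    -- Same name as the caller's temp table; this creates a separate,
    -- shadowing copy that only exists for the duration of this proc.
    CREATE TABLE #results (id int);
    INSERT INTO #results VALUES (99);
    SELECT * FROM #results;     -- sees only its own copy
END
GO

CREATE PROCEDURE dbo.OuterProc
AS
BEGIN
    CREATE TABLE #results (id int);
    INSERT INTO #results VALUES (1);

    EXEC dbo.InnerProc;         -- inner proc creates and drops its own #results

    SELECT * FROM #results;     -- back in this scope: still the single outer row
END
GO

EXEC dbo.OuterProc;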

How to prevent deadlock of table in SQL Server

I have a table where values can be altered by different users, with about 100k rows of records.
I made a stored procedure which has a begin tran at the start and, at the end,
either commits or rolls back the changes depending on the situation.
The problem we're encountering is a lock on that table. For example, while the 1st user is executing the stored procedure through the system, the other users can't select from the table or execute the stored procedure themselves, because the table is locked.
So is there any way I can avoid the lock other than using a dirty read? Or a way to roll back the changes made without using begin tran, since that is the main reason the table is locked up?
Yes, you can at least (quick & dirty) enable the SNAPSHOT isolation level for transactions. That will stop readers from being blocked by the locks writers hold inside their transactions.
ALTER DATABASE MyDatabase
SET ALLOW_SNAPSHOT_ISOLATION ON
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
See the SQL Server documentation on snapshot isolation for details.
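For the reading side, a rough sketch of how a session opts into snapshot isolation once ALLOW_SNAPSHOT_ISOLATION is enabled (the table name is a placeholder); with READ_COMMITTED_SNAPSHOT on, ordinary reads get row versioning without any code change:
-- Explicitly opt this session into snapshot isolation.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRAN;
    -- Reads row versions as of the start of the transaction,
    -- so it neither blocks writers nor waits on their locks.
    SELECT *
    FROM dbo.MySharedTable;     -- hypothetical table name
COMMIT TRAN;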

Reserving clean block of identity values in T-SQL for data migration

We're currently working on the following process whose goal is to move data between 2 sets of database servers while maintaining FK's and handling the fact that the destination tables already have rows with overlapping identity column values:
Extract a set of rows from a "root" table and all of its children tables' FK associated data n-levels deep along with related rows that may reside in other databases on the same instance from the source database server.
Place that extracted data set into a set of staging tables on the destination database server.
Rekey the data in the staging tables by reserving block of identities for the destination tables and update all related child staging tables (each of these staging tables will have the same schema as the source/destination table with the addition of a "lNewIdentityID" column).
Insert the data with its new identity into the destination tables in correct order (SET IDENTITY_INSERT desttable ON will be used, obviously).
I'm struggling with the block reservation portion of this process (#3). Our system is pretty much a 24 hour system except for a short weekly maintenance window. Management needs this process to NOT have to wait each week for the maintenance window to migrate data between servers. That being said, I may have 100 insert transactions competing with our migration process while it is on #3. Below is my wag at an attempt to reserve the block of identities, but I'm worried that between "SET @newIdent..." and "DBCC CHECKIDENT..." an insert transaction will complete and the migration process won't have a "clean" block of identities in a known range that it can use to rekey the staging data.
I essentially need to lock the table, get the current identity, increase the identity, and then unlock the table. I don't know how to do that in T-SQL and am looking for ideas. Thank you.
IF EXISTS (SELECT TOP 1 1 FROM sys.procedures WHERE [name] = 'DataMigration_ReserveBlock')
    DROP PROC DataMigration_ReserveBlock
GO
CREATE PROC DataMigration_ReserveBlock (
    @tableName varchar(100),
    @blockSize int
)
AS
BEGIN
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
    DECLARE @newIdent bigint;
    SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
    DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
    SELECT @newIdent AS NewIdentity;
END
GO
DataMigration_ReserveBlock 'tblAddress', 1234
You could wrap it in a transaction
BEGIN TRANSACTION
...
COMMIT
It should be fast enough to not cause problems with your other insert processes. Though it would be a good idea to include try / catch logic so you could rollback if problems do occur.
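A minimal sketch of what that wrapping might look like, keeping the body from the question and adding the TRY/CATCH the answer suggests; this is one way to do it, not the only one:
ALTER PROC DataMigration_ReserveBlock (
    @tableName varchar(100),
    @blockSize int
)
AS
BEGIN
    SET XACT_ABORT ON;  -- ensure the whole reservation rolls back on error
    DECLARE @newIdent bigint;

    BEGIN TRY
        BEGIN TRANSACTION;
            SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
            DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
        COMMIT TRANSACTION;

        SELECT @newIdent AS NewIdentity;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        THROW;  -- surface the original error to the caller
    END CATCH
END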

TSQL logging inside transaction

I'm trying to write to a log file inside a transaction so that the log survives even if the transaction is rolled back.
--start code
begin tran
insert [something] into dbo.logtable
[[main code here]]
rollback
commit
-- end code
You could say just do the log before the transaction starts but that is not as easy because the transaction starts before this S-Proc is run (i.e. the code is part of a bigger transaction)
So, in short, is there a way to write a special statement inside a transaction that is not part of the transaction? I hope my question makes sense.
Use a table variable (declared with @, not a #temp table) to hold the log info. Table variables survive a transaction rollback.
See this article.
I do this one of two ways, depending on my needs at the time. Both involve using a variable, which retains its value following a rollback.
1) Declare a @Log varchar(max) variable and use SET @Log = ISNULL(@Log + '; ', '') + 'Your new log info here'. Keep appending to this as you go through the transaction. I'll insert this into the log after the commit or the rollback as necessary. I'll usually only insert the @Log value into the real log table when there is an error (in the CATCH block) or if I'm trying to debug a problem.
2) Declare a @LogTable table (RowID int identity(1,1) primary key, RowValue varchar(5000)) variable. I insert into this as I progress through the transaction. I like using the OUTPUT clause to insert the actual IDs (and other columns with messages, like 'DELETE item 1234') of rows used in the transaction into this table. I will insert this table into the actual log table after the commit or the rollback as necessary.
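A minimal sketch of the second approach, with hypothetical table and column names; the OUTPUT clause records what the DELETE touched into the table variable, which keeps its rows after a rollback:
DECLARE @LogTable table (RowID int identity(1,1) primary key, RowValue varchar(5000));

BEGIN TRY
    BEGIN TRAN;

    -- Hypothetical work: delete items and capture what was deleted.
    DELETE FROM dbo.Items
    OUTPUT 'DELETE item ' + CAST(deleted.ItemID AS varchar(20))
        INTO @LogTable (RowValue)
    WHERE ExpiryDate < GETDATE();

    COMMIT TRAN;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRAN;  -- the table variable keeps its rows even after this
END CATCH;

-- Persist the captured log outside the (possibly rolled-back) transaction.
INSERT INTO dbo.RealLogTable (Message)
SELECT RowValue FROM @LogTable;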
If the parent transaction rolls back, the logging data will roll back as well - SQL Server does not support proper nested transactions. One possibility is to use a CLR stored procedure to do the logging. This can open its own connection to the database outside the transaction and enter and commit the log data.
Log output to a table, use a time delay, and use WITH(NOLOCK) to see it.
It looks like @arvid wanted to debug the operation of the stored procedure, and is able to alter the stored proc.
The c# code starts a transaction, then calls a s-proc, and at the end it commits or rolls back the transaction. I only have easy access to the s-proc
I had a similar situation. So I modified the stored procedure to log my desired output to a table. Then I put a time delay at the end of the stored procedure
WAITFOR DELAY '00:00:12'; -- 12 second delay, adjust as desired
and in another SSMS window, quickly read the table with the READ UNCOMMITTED isolation level (the WITH(NOLOCK) below):
SELECT * FROM dbo.NicksLogTable WITH(NOLOCK);
It's not the solution you want if you need a permanent record of the logs (edit: including where transactions get rolled back), but it suits my purpose to be able to debug the code in a temporary fashion, especially when linked servers, xp_cmdshell, and creating file tables are all disabled :-(
Apologies for bumping a 12-year old thread, but Microsoft deserves an equal caning for not implementing nested transactions or autonomous transactions in that time period.
If you want to emulate nested transaction behaviour you can use named transactions:
begin transaction a
create table #a (i int)
select * from #a
save transaction b
create table #b (i int)
select * from #a
select * from #b
rollback transaction b
select * from #a
rollback transaction a
In SQL Server, if you want a 'sub-transaction' you should use save transaction xxxx, which works like an Oracle savepoint.
