I have a table in a SQL Server 2008 R2 instance, and a scheduled process runs against it nightly. The table can have upwards of 500K records in it at any one time. After processing this table I need to remove all rows from it, so I am wondering which of the following methods would produce the least overhead (i.e., excessive transaction log entries):
Truncate Table
Drop and recreate the table
Deleting the contents of the table row by row is out, due to the time it takes and the extra transaction log entries it creates.
The consensus seems to be truncation. Thanks, everyone!
TRUNCATE TABLE is your best bet. From MSDN:
Removes all rows from a table without logging the individual row deletes.
So that means it won't bloat your transaction log. Dropping and creating the table not only requires more complex SQL, but also additional permissions. Any settings attached to the table (triggers, GRANT or DENY, etc.) will also have to be re-built.
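As a hedged illustration of that difference, here is a minimal sketch; the table, column, and role names are hypothetical:

-- Option 1: TRUNCATE keeps the table definition, indexes, triggers and permissions intact
TRUNCATE TABLE dbo.NightlyStaging;

-- Option 2: DROP/CREATE means every index, trigger and GRANT has to be re-issued
DROP TABLE dbo.NightlyStaging;
CREATE TABLE dbo.NightlyStaging
(
    Id INT IDENTITY(1,1) PRIMARY KEY,
    Payload NVARCHAR(200) NULL
);
GRANT SELECT, INSERT ON dbo.NightlyStaging TO ReportingRole;  -- hypothetical role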
Truncating the table does not leave row-by-row entries in the transaction log - so neither solution will clutter up your logs too much. If it were me, I'd truncate over having to drop and create each time.
I would go for TRUNCATE TABLE. You can incur overhead when indexes, triggers, etc. get dropped. Plus, you will lose permissions, which will have to be re-created along with any other objects required for that table.
Also, the DROP TABLE entry on MSDN (quoted below) mentions a little gotcha if you execute DROP TABLE and CREATE TABLE in the same batch:
DROP TABLE and CREATE TABLE should not be executed on the same table in the same batch. Otherwise an unexpected error may occur.
Dropping the table will destroy any associated objects (indexes, triggers) and may make procedures or views invalid. I would go with truncate, since it won't blow up your log and causes none of the possible issues a drop and create does.
A recent employee of our company had a stored procedure go haywire and cause mass inserts into a debug table of his. The table is unindexed, is now at close to 1.7 billion rows, and is taking up so much space that the backup no longer fits on the backup drive (backups now reach close to 250GB).
I haven't really seen anything like this, so I'm seeking advice from the MSSQL Gurus out here.
I know I could nibble away at the table, but being unindexed, the DELETE FROM [TABLE] WHERE ID IN (SELECT TOP 10000 [ID] FROM [TABLE]) nearly locks up the server searching for them.
I also don't want my log file to get massive; it's currently sitting at 480GB on a 1TB drive. If I delete this table, will I be able to shrink it back down? (My recovery model is simple.)
We could index the ID field on the table, though we only have around 9 hours of downtime a day, and during business hours we can't be locking up the database.
Just looking for advice here, and a point in the right direction.
Thanks.
You may want to consider TRUNCATE
MSDN reference: http://technet.microsoft.com/en-us/library/aa260621(v=sql.80).aspx
Removes all rows from a table without logging the individual row deletes.
Syntax:
TRUNCATE TABLE [YOUR_TABLE]
As @Rahul suggests in the comments, you could also use DROP TABLE [YOUR_TABLE] if you no longer plan to use the table in question. The TRUNCATE option would simply empty the table but leave it in place if you wanted to continue to use it.
With regards to the space issue, both of these operations will be comparatively quick and the space will be reclaimed, but it won't happen instantly. When using TRUNCATE, the data still has to be deleted, but SQL Server will simply deallocate the data pages used by the table and use a background process to actually perform the clean up afterwards.
This post should provide some useful information.
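On the shrink question specifically, a minimal sketch assuming the table is truncated first; the table name, logical log file name, and target size are assumptions you would replace with your own:

TRUNCATE TABLE dbo.DebugTable;               -- hypothetical table name

-- with SIMPLE recovery the freed log space becomes reusable, after which the file can be shrunk
DBCC SHRINKFILE (N'MyDatabase_log', 10240);  -- target size in MB (roughly 10 GB)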
One suggestion would be to take a backup of just that 1.7-billion-row table (probably to a tape drive or somewhere with enough space) and then drop the table with DROP TABLE table_name.
That way, if the debug table data is ever needed in the future, you have a copy and can restore it from the backup.
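A hedged sketch of that approach using a copy into a separate archive database (all names are assumptions; bcp out to a file would serve the same purpose):

SELECT *
INTO ArchiveDb.dbo.DebugTable_copy    -- copy the rows somewhere with room to spare
FROM dbo.DebugTable;

DROP TABLE dbo.DebugTable;            -- then drop the original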
I would remove the logging for this table and launch a delete stored procedure that would commit every 1000 rows.
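SQL Server has no switch to turn off logging for a single table, so the practical version of this is a batched delete that keeps each transaction small. A rough sketch, with a hypothetical table name and batch size:

WHILE 1 = 1
BEGIN
    DELETE TOP (1000) FROM dbo.DebugTable;   -- small, quickly committed batches
    IF @@ROWCOUNT = 0 BREAK;                 -- stop once the table is empty
END;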
I just discovered that "truncate {table}" statements are not caught by most database triggers.
Am I taking any risks in terms of performance by replacing such statements with "delete from {table}"?
I am particularly interested in MSSQL and Sybase.
DELETE will be a lot slower and heavier on resources as proper transaction logs are created. But you can make it part of a bigger transaction, and, as you say, triggers and constraints will be activated, too.
TRUNCATE is considered a DDL operation, just like DROP TABLE. It is very fast, but in Sybase it cannot be done as part of a transaction (i.e. you cannot roll it back); in SQL Server it can be rolled back if it is run inside an explicit transaction.
Which one you need depends on your requirements. Do you have triggers that need to run? If so, can you do something before or after the TRUNCATE to compensate for them not running?
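For example, if the only trigger logic is an audit entry, a SQL Server flavoured sketch of compensating by hand before the truncate (the audit table and columns are hypothetical):

BEGIN TRAN;

-- do the trigger's work manually, since TRUNCATE will not fire it
INSERT INTO dbo.AuditLog (TableName, RowsRemoved, RemovedAt)
SELECT 'dbo.StagingTable', COUNT(*), GETDATE()
FROM dbo.StagingTable;

TRUNCATE TABLE dbo.StagingTable;

COMMIT;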
TRUNCATE can be rolled back in SQL Server but not in Sybase. If other tables have foreign key references to this table, then you cannot use TRUNCATE TABLE.
If you have logic in a trigger, say to audit the data or to delete data from other tables based on this delete, you should stick with DELETE. If you do not have any triggers, the best way is TRUNCATE TABLE.
I have a huge database with complicated relations. How can I delete all the tables' contents without violating foreign key constraints? Is there a way to do that?
Note that I am writing a SQL script file that deletes from the tables, as in the following example:
delete from A
delete from B
delete from C
delete from D
delete from E
but I don't know which table I should start with.
In SQL Server, there is no native way to do what you're asking. You do have a few options depending on your particular environment limitations:
Figure out the relationships between the tables and delete rows in the appropriate order, from child (referencing) tables up to parent tables. This may be time-consuming for a large number of objects, but it is the "safest" approach in terms of least destruction.
Drop the foreign key constraints and TRUNCATE TABLE (note that merely disabling a foreign key is not enough; SQL Server will not truncate a table that is still referenced by one, so the constraints have to be dropped and re-created, or you fall back to DELETE). This will be a bit faster if you're dealing with lots of data, but you still have to know where all your relationships are. Not too terrible if you're working with fewer tables, though option 1 becomes just as viable.
Script out the database objects and DROP DATABASE/CREATE DATABASE (a rough sketch appears after this answer). If you don't mind a raw teardown of the database, this is another option; however, you'll still need to be aware of object precedence for creation. SQL Server, as well as third-party tools, offers ways to script object DROP/CREATE. If you decide to go this route, the upside is that you have a scripted backup of all the objects (which I like to keep "just in case") and future teardowns are nearly instantaneous as long as you keep your scripts synchronized with any changes.
As you can see, it's not a terribly simple process because you're trying to subvert the very reason for the existence of the constraints.
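For option 3, a rough sketch of the teardown/rebuild route; the database name and script file are hypothetical, and it assumes you already keep a complete object-creation script:

DROP DATABASE MyAppDb;
CREATE DATABASE MyAppDb;
-- then re-run the saved object-creation script, e.g. from the command line:
-- sqlcmd -S MyServer -d MyAppDb -i create_objects.sql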
Steps can be:
disable all the constraints in all the tables
delete all the records from all the tables
re-enable the constraints (see the sketch just below)
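A minimal sketch of those steps, assuming the undocumented (but widely available) sp_MSforeachtable procedure; the final step re-validates the constraints and will fail if any orphaned rows remain:

EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL';           -- 1. disable constraints
EXEC sp_MSforeachtable 'DELETE FROM ?';                                  -- 2. empty every table
EXEC sp_MSforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL';  -- 3. re-enable and re-check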
Also see this discussion: SQL: delete all the data from all available tables
TRUNCATE TABLE tableName
Removes all rows from a table without logging the individual row deletions. TRUNCATE TABLE is similar to the DELETE statement with no WHERE clause; however, TRUNCATE TABLE is faster and uses fewer system and transaction log resources.
TRUNCATE TABLE (Transact-SQL)
Dude, taking your question at face value... that you want to COMPLETELY recreate the schema with NO data... forget the individual queries (too slow)... just destroydb, and then createdb (or whatever your RDBMS's equivalent is)... and you might want to hire a competent DBA.
I am moving a system from a VB/Access app to SQL Server. One common pattern in the Access database is using tables to hold data while it is being calculated and then reporting off that data.
e.g.
delete from treporttable
insert into treporttable (.... this thing and that thing)
update treporttable set x = x * price where (...etc)
and then the report runs from treporttable
I have heard that SQL Server does not like it when all records are deleted from a table, as it creates huge logs, etc. I tried temp SQL tables, but they don't persist long enough for the report, which runs in a different process, to report off of.
There are a number of places where this is done to different report tables in the application. The reports can be run many times a day and have a large number of records created in the report tables.
Can anyone tell me if there is a best practice for this, or if my information about the logs is incorrect and this code will be fine in SQL Server?
If you do not need to log the deletion activity you can use the truncate table command.
From books online:
TRUNCATE TABLE is functionally identical to DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE.
http://msdn.microsoft.com/en-us/library/aa260621(SQL.80).aspx
delete from sometable
is going to allow you to roll back the change, so if your table is very large this can use up a lot of transaction log space and time.
However, if you have no fear of failure then:
truncate table sometable
will perform nearly instantly, with minimal logging. There is no rollback though.
To Nathan Feger:
You can rollback from TRUNCATE. See for yourself:
CREATE TABLE dbo.Test(i INT);
GO
INSERT dbo.Test(i) SELECT 1;
GO
BEGIN TRAN
TRUNCATE TABLE dbo.Test;
SELECT i FROM dbo.Test;
ROLLBACK
GO
SELECT i FROM dbo.Test;
GO
i
(0 row(s) affected)
i
1
(1 row(s) affected)
You could also DROP the table, and recreate it...if there are no relationships.
In SQL Server, both the DROP TABLE and TRUNCATE TABLE statements are transactional and can be rolled back inside an explicit transaction, as the example above shows.
So it depends on your schema which direction you want to go!!
Also, use SQL Profiler to analyze your execution times. Test it out and see which is best!!
The answer depends on the recovery model of your database. If you are in full recovery mode, then you have transaction logs that could become very large when you delete a lot of data. However, if you're backing up transaction logs on a regular basis to free the space, this might not be a concern for you.
Generally speaking, if the transaction logging doesn't matter to you at all, you should TRUNCATE the table instead. Be mindful, though, of any identity columns, because TRUNCATE will reset the identity seed.
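A quick illustration of that reseeding behaviour, with a hypothetical table:

-- alternative 1: the next insert starts again at the original identity seed
TRUNCATE TABLE dbo.Widgets;

-- alternative 2: identity values keep counting up from where they left off
DELETE FROM dbo.Widgets;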
EDIT: Note that even if the recovery model is set to Simple, your transaction log will grow during a mass delete. It will just be cleared afterward (without releasing the space). The point is that DELETE creates one big transaction, even if only temporarily.
Consider using temporary tables. Their names start with # and they are dropped automatically when the session that created them ends. Example:
create table #myreport (
    id int identity(1,1),
    col1 int,   -- placeholder column; use your real names and types
    ...
)
Temporary tables are made to be thrown away, and that happens very efficiently.
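A hedged sketch of staging report data in a temp table; the source table and column list are assumptions:

SELECT x, price, x * price AS extended
INTO #myreport_stage
FROM dbo.SourceData;

-- ... run the report query against #myreport_stage ...

DROP TABLE #myreport_stage;  -- optional; it is dropped automatically when the session ends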
Another option is using TRUNCATE TABLE instead of DELETE. A truncate is minimally logged, so it will not grow the log file the way a large DELETE does.
I think your example has a possible concurrency issue. What if multiple processes are using the table at the same time? Adding a JOB_ID column (or something like that) would allow you to clear the relevant entries in this table without clobbering the data being used by another process.
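A sketch of that idea; the job_id column, the @JobId variable, and the source table are hypothetical additions:

DECLARE @JobId UNIQUEIDENTIFIER;
SET @JobId = NEWID();

-- populate only this run's rows, tagged with the job id
INSERT INTO treporttable (job_id, x, price)
SELECT @JobId, s.x, s.price
FROM dbo.SourceData AS s;

-- the report reads only its own rows
SELECT x, price FROM treporttable WHERE job_id = @JobId;

-- clean up just this run when finished
DELETE FROM treporttable WHERE job_id = @JobId;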
Actually, tables such as treporttable do not need to be recoverable to a point in time. As such, they can live in a separate database using the simple recovery model. That eases the burden of logging.
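A minimal sketch of that setup; the database name is hypothetical:

CREATE DATABASE ReportStaging;
ALTER DATABASE ReportStaging SET RECOVERY SIMPLE;
-- create treporttable (and friends) inside ReportStaging and point the reports there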
There are a number of ways to handle this. First, you can move the creation of the data into the running of the report itself. This, I feel, is the best way to handle it; you can then use temp tables to temporarily stage your data, and no one will have concurrency issues if multiple people try to run the report at the same time. Depending on how many reports we are talking about, it could take some time to do this, so you may need another short-term solution as well.
Second, you could move all your reporting tables to a different database that is set to simple recovery and truncate them before running your queries to populate them. This is closest to your current process, but it could be an issue if multiple users try to run the same report at once.
Third, you could set up a job to populate the tables (still in a separate database set to simple recovery) once a day (truncating at that time). Then anyone running a report that day will see the same data and there will be no concurrency issues. However, the data will not be up to the minute. You could also set up a reporting data warehouse, but that is probably overkill in your case.
I erroneously deleted all the rows from a MS SQL 2000 table that is used in merge replication (the table is on the publisher). I then compounded the issue by using a DTS operation to retrieve the rows from a backup database and repopulate the table.
This has created the following issue:
The delete operation marked the rows for deletion on the clients but the DTS operation bypasses the replication triggers so the imported rows are not marked for insertion on the subscribers. In effect the subscribers lose the data although it is on the publisher.
So I thought "no worries" I will just delete the rows again and then add them correctly via an insert statement and they will then be marked for insertion on the subscribers.
This is my problem:
I cannot delete the DTSed rows because I get a "Cannot insert duplicate key row in object 'MSmerge_tombstone' with unique index 'uc1MSmerge_tombstone'." error. What I would like to do is somehow delete the rows from the table bypassing the merge replication trigger. Is this possible? I don't want to remove and redo the replication because the subscribers are 50+ windows mobile devices.
Edit: I have tried the Truncate Table command. This gives the following error "Cannot truncate table xxxx because it is published for replication"
Have you tried truncating the table?
You may have to truncate the table and reset the ID field back to 0 if you need the inserted rows to have the same ID. If not, just truncate and it should be fine.
You also could look into temporarily dropping the unique index and adding it back when you're done.
Look into sp_mergedummyupdate
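For reference, a hedged example of calling it for a single row against the publication database; the table name and rowguid value are placeholders:

EXEC sp_mergedummyupdate
     @source_object = N'dbo.MyPublishedTable',
     @rowguid = '00000000-0000-0000-0000-000000000000';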
Would creating a second table be an option? You could create a second table, populate it with the needed data, add the constraints/indexes, then drop the first table and rename your second table. This should give you the data with the right keys, and it should all consist of SQL statements that are allowed to trickle down the replication. It just probably isn't the best for performance, and it definitely would impose some risk.
I haven't tried this first hand in a replicated environment...but it may be at least worth trying out.
Thanks for the tips...I eventually found a solution:
I deleted the merge delete trigger from the table
Deleted the DTSed rows
Recreated the merge delete trigger
Added my rows correctly using an insert statement.
I was a little worried about fiddling with the merge triggers, but everything appears to be working correctly.
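A hedged sketch of the same idea without dropping the trigger, using ALTER TABLE ... DISABLE TRIGGER; the table name and the predicate that identifies the DTSed rows are hypothetical:

ALTER TABLE dbo.MyPublishedTable DISABLE TRIGGER ALL;   -- or name just the merge delete trigger

DELETE FROM dbo.MyPublishedTable
WHERE imported_by_dts = 1;                              -- hypothetical way to identify the DTSed rows

ALTER TABLE dbo.MyPublishedTable ENABLE TRIGGER ALL;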