Delete all rows in a table - SQL Server

Normally I would do a DELETE FROM XXX, but on this table that's very slow; it normally has about 500k to 1m rows in it (one column is a varbinary(MAX), if that matters).
Basically I'm wondering if there is a quick way to empty the table of all content. It's actually quicker to drop and recreate it than to delete the content via the DELETE statement.
The reason I don't want to recreate the table is that it's heavily used, and I assume dropping and recreating it will destroy the indexes and statistics SQL Server has gathered.
I'm also hoping there is a way to do this because there is a "clever" way to get the row count via sys.sysindexes, so I'm hoping there is an equally clever way to delete the content.
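For context, the "clever" row count trick I'm referring to is roughly this (XXX is just a placeholder for the real table name):
SELECT rows FROM sys.sysindexes WHERE id = OBJECT_ID('XXX') AND indid < 2  -- reads the cached count, no table scan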

TRUNCATE TABLE is faster than DELETE FROM XXX. DELETE is slow because it works one row at a time. There are a few situations where TRUNCATE doesn't work, which you can read about on MSDN.
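For the table in the question that would just be:
TRUNCATE TABLE XXX   -- deallocates whole data pages rather than logging each row
in place of
DELETE FROM XXX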

As others have said, TRUNCATE TABLE is far quicker, but it does have some restrictions (taken from here):
You cannot use TRUNCATE TABLE on tables that:
- Are referenced by a FOREIGN KEY constraint. (You can truncate a table that has a foreign key that references itself.)
- Participate in an indexed view.
- Are published by using transactional replication or merge replication.
For tables with one or more of these characteristics, use the DELETE statement instead.
The biggest drawback is that if the table you are trying to empty has foreign keys pointing to it, then the truncate call will fail.
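If a foreign key referencing the table is what's blocking you, one workaround is to drop the referencing constraint, truncate, then re-create the constraint. A rough sketch - the table, column and constraint names here are made up for illustration:
ALTER TABLE dbo.Child DROP CONSTRAINT FK_Child_Parent
TRUNCATE TABLE dbo.Parent
ALTER TABLE dbo.Child ADD CONSTRAINT FK_Child_Parent FOREIGN KEY (ParentId) REFERENCES dbo.Parent (Id)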

You can rename the table in question, create a table with an identical schema, and then drop the original table at your leisure.
See the MySQL 5.1 Reference Manual for the RENAME TABLE and CREATE TABLE commands.
RENAME TABLE tbl TO tbl_old;
CREATE TABLE tbl LIKE tbl_old;
DROP TABLE tbl_old; -- at your leisure
This approach can help minimize application downtime.
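Since the question is about SQL Server, the rough equivalent there would use sp_rename. Note that SELECT ... INTO only copies the column definitions, not indexes, constraints or defaults, so treat this purely as a sketch:
EXEC sp_rename 'tbl', 'tbl_old';
SELECT * INTO tbl FROM tbl_old WHERE 1 = 0;  -- copies columns only
DROP TABLE tbl_old;  -- at your leisure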

I would suggest using TRUNCATE TABLE; it's quicker and uses fewer resources than DELETE FROM xxx.
Here's the related MSDN article

Truncate table in MS SQL Server
Truncate table in MySQL

Related

Partition existing tables using PostgreSQL 10

I have gone through a bunch of documentation for PostgreSQL 10 partitioning, but I am still not clear on whether existing tables can be partitioned. Most of the posts mention partitioning existing tables using PostgreSQL 9.
Also, the official PostgreSQL documentation at https://www.postgresql.org/docs/current/static/ddl-partitioning.html says 'It is not possible to turn a regular table into a partitioned table or vice versa'.
So, my question is: can existing tables be partitioned in PostgreSQL 10?
If the answer is YES, my plan is:
1) Create the partitions.
2) Alter the existing table to include the range so new data goes into the new partition. Once that is done, write a script which loops over the master table and moves the data into the right partitions.
3) Then, truncate the master table and enforce that nothing can be inserted into it.
If the answer is NO, my plan is to make the existing table the first partition:
1) Create a new parent table and children (partitions).
2) Perform a light transaction which renames the existing table to a partition table name and the new parent to the actual table name.
Are there better ways to partition existing tables in PostgreSQL 10/9?
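To make the second plan concrete, the rough shape I have in mind is below; the table and column names (events, created_at, payload) and the date ranges are invented purely for illustration:
-- Rename the existing table out of the way
ALTER TABLE events RENAME TO events_legacy;
-- Create a partitioned parent under the original name (PostgreSQL 10 syntax)
CREATE TABLE events (
    id bigint,
    created_at date NOT NULL,
    payload text
) PARTITION BY RANGE (created_at);
-- Attach the old table as the partition covering historical data
ALTER TABLE events ATTACH PARTITION events_legacy
    FOR VALUES FROM ('2000-01-01') TO ('2018-01-01');
-- Add a partition for new data going forward
CREATE TABLE events_2018 PARTITION OF events
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');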

Are there any options for doing bulk insert into multiple related tables with Entity Framework (SQL Server 2008 R2 target)?

There are existing options for doing bulk insert into a single table with EF entities. Specifically, this SO question and using this class from David Browne.
In the case of trying to bulk insert rows into both a parent and child table, however, nothing jumps out as an option at that same level of convenience.
The 'hacks' I can think of (but I'm hoping there's at least one better option out there) include:
generate the PK's and set the FK's before insert (in this scenario, we know nothing else is inserting at the same time), then do the bulk inserts of both (turning off IDENTITY_INSERT during the parent insert if necessary)
bulk insert (using the linked SO question's approach) the parent rows, select them (enough columns to identify which parent row is which), generate child rows, bulk insert those
generate the SQL necessary to insert all the rows in a single batch, doing each parent and then all related children, using @@IDENTITY to fill in the FK for the child inserts
The 'pregenerate PK values' approach (I haven't actually tried it) seems fine, but it is fragile (it requires that nothing else inserts into at least the parent table during the operation) and depends on either an empty table or selecting max(pk)+1 beforehand.
Since SqlBulkCopy seems to be built around inserting one table at a time (like bcp), anything that still lets SQL Server generate the PK/identity column would seem to require 'dropping down' to ADO.NET and building the SQL.
Is there an option outside of 'generate the tons of SQL' that I'm missing? If not, is there something out there that already generates the SQL for mass-insert into related tables?
Thanks!!
The first rule of any foreign key constraint is that the referenced value must exist, as a primary key or unique constraint, in the other table before it is inserted into the foreign key table.
This works great when you are adding a few rows at a time (a traditional transaction-processing environment). However, you are trying to bulk insert into both at the same time; I'd term this batch processing. Basically, the bulk update lock on the parent table is going to block the child table from reading it to check that the FK linkage is valid.
I'd say your two options are: 1) leave the FK out entirely, or 2) set the FK to NOCHECK before the bulk insert, then turn the check back on with an ALTER TABLE after the bulk insert is complete.
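A rough T-SQL sketch of option 2 - the table and constraint names are placeholders:
ALTER TABLE dbo.Child NOCHECK CONSTRAINT FK_Child_Parent           -- stop checking the FK
-- ... bulk insert into Parent, then into Child, here ...
ALTER TABLE dbo.Child WITH CHECK CHECK CONSTRAINT FK_Child_Parent  -- re-enable and re-validate existing rows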

SQL Server Mgmt Studio messing up my Database!

This has effectively ruined my day. I have a large number of tables with many FK relationships between them. One of the tables (let's call it table A) has a computed column, which is computed via a UDF with schemabinding and is also full-text indexed.
If I edit any table (let's call it table B) that is in any way related (e.g. via FK) to the table with the full-text indexed computed column (table A), and I save it, the following happens:
Changes to the table (table B) are saved
I get the error "Column 'abcd' is no fulltext indexed." regarding table A, which I didn't even edit, and then "User canceled out of save dialog"
All FK relationships from table B to ALL TABLES are DELETED
What is going on? Can someone explain to me how this can happen?
I've had the same kind of problem. As Will A said, Management Studio will do the following steps to update a table and its foreign keys:
Create a new table called temp_
Copy contents from old table into new
Drop all constraints, indexes and foreign keys
Drop old table
Rename the new table to the old table's name
Recreate the foreign keys, indexes and constraints
I may have the first 3 in the wrong order but you get the idea.
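Very roughly, the generated change script follows a pattern like the one below; the table, column and constraint names are simplified stand-ins, and the real script also copies permissions, indexes and the remaining constraints:
CREATE TABLE dbo.Tmp_TableB (Id int NOT NULL, TableAId int NOT NULL)           -- new definition
INSERT INTO dbo.Tmp_TableB (Id, TableAId) SELECT Id, TableAId FROM dbo.TableB  -- copy the contents
DROP TABLE dbo.TableB                                                           -- existing FKs go with it
EXECUTE sp_rename 'dbo.Tmp_TableB', 'TableB'
ALTER TABLE dbo.TableB ADD CONSTRAINT FK_TableB_TableA                          -- FKs are re-created last;
    FOREIGN KEY (TableAId) REFERENCES dbo.TableA (Id)                           -- if the script dies before this step, they are simply gone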
In my case I've lost entire tables, not just the foreign keys. Personally I don't like the way it does it, as it can be VERY time consuming to have to recreate indexes on a table with lots of data in it. If it's a small change I usually do it myself in T-SQL.
Review the change script before it executes; make sure it looks sensible.
@OMGPonies, why can't you drop a foreign key if there is data in the table? Of course you can. There are only restrictions on creating foreign keys on tables with data, and only if the data breaks the constraint. Even that can be avoided by using the WITH NOCHECK option when creating the key - though yes, I know it'll break when you try to update a row that violates the constraint.

Making primary key and identity column after data has been loaded

I have a quick question for you SQL gurus. I have existing tables without a primary key column, and Identity is not set. Now I am trying to modify those tables by making an existing integer column the primary key and adding identity values for that column. My question is: should I first copy all the records from the table to a temp table before making those changes? Do I lose all the previous records if I run the T-SQL command to make the primary key and add the identity column on those tables? What approaches should I take, such as:
1) Create temp table to copy all the records from the table to be modified
2) Load all the records into the temp table
3) Make changes on the table schema
4) Finally load the records from the temp table to the original table.
Or
are there better ways than this? I really appreciate your help.
Thanks
Tools>Options>Designers>Table and Database Designers
Uncheck "Prevent saving changes that require table re-creation"
[Edit] I've tried this with populated tables and I didn't lose data, but I don't really know much about this.
Hopefully you don't have too many records in the table. What happens if you use Management Studio to change an existing field to an identity is that it creates another table with the identity field set, turns identity insert on, inserts the records from the original table, then turns identity insert off. Then it drops the old table and renames the table it just created. This can be quite a lengthy process if you have many records, so if that's the case I would script it out and run it in a job during off hours, because the table will be completely locked while you do this.
Just do all of your changes in Management Studio and copy/paste the generated script into a file - DON'T SAVE CHANGES at this point. Look over and edit that script as necessary; it will probably do almost exactly what you are thinking (it will drop the original table and rename the temp one to the original's name), but it will handle all constraints and FKs as well.
If your existing integer column is unique and suitable, there should be no problem converting it to a PK.
Another alternative, if you don't want to use the existing column: you can add a new PK column to the main table, populate and seed it, then run update statements to update all other tables with the new PK.
Whichever way you do it, make sure you do a backup first!!
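If the existing column does turn out to be suitable, the PK part by itself is just the following (table and column names are examples; note this does not make the column an IDENTITY - that still needs the table-rebuild approach described above):
ALTER TABLE dbo.MyTable ALTER COLUMN ExistingId int NOT NULL                -- a PK column must be NOT NULL
ALTER TABLE dbo.MyTable ADD CONSTRAINT PK_MyTable PRIMARY KEY (ExistingId)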
You can always add the IDENTITY column after you have finished copying your data around. You can also then reset the IDENTITY seed to the max integer + 1. That should solve your problems.
DBCC CHECKIDENT ('MyTable', RESEED, n)
Where n is the number you want the identity to start at.

Fastest way to delete all the data in a large table

I had to delete all the rows from a log table that contained about 5 million rows. My initial try was to issue the following command in Query Analyzer:
delete from client_log
which took a very long time.
Check out TRUNCATE TABLE, which is a lot faster.
I discovered TRUNCATE TABLE in the MSDN Transact-SQL reference. For all interested, here are the remarks:
TRUNCATE TABLE is functionally identical to DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE.
The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row. TRUNCATE TABLE removes the data by deallocating the data pages used to store the table's data, and only the page deallocations are recorded in the transaction log.
TRUNCATE TABLE removes all rows from a table, but the table structure and its columns, constraints, indexes and so on remain. The counter used by an identity for new rows is reset to the seed for the column. If you want to retain the identity counter, use DELETE instead. If you want to remove table definition and its data, use the DROP TABLE statement.
You cannot use TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint; instead, use DELETE statement without a WHERE clause. Because TRUNCATE TABLE is not logged, it cannot activate a trigger.
TRUNCATE TABLE may not be used on tables participating in an indexed view.
There is a common myth that TRUNCATE somehow skips the transaction log.
This is a misunderstanding, and it is clearly explained in MSDN.
This myth is invoked in several comments here. Let's eradicate it together ;)
For reference, TRUNCATE TABLE also works on MySQL.
I use the following method to zero out tables, with the added bonus that it leaves me with an archive copy of the table.
CREATE TABLE `new_table` LIKE `table`;
RENAME TABLE `table` TO `old_table`, `new_table` TO `table`;
Forget TRUNCATE and DELETE: maintain your table definition (in case you want to recreate it) and just use DROP TABLE.
truncate table client_log
is your best bet; TRUNCATE kills all content in the table and indexes, and resets any seeds you've got too.
TRUNCATE TABLE is not platform-independent: its behavior and restrictions differ between database systems. If you suspect that you might ever change database providers, be wary of using it.
On SQL Server you can use the TRUNCATE TABLE command, which is faster than a regular delete and also uses fewer resources. It will reset any identity fields back to the seed value as well.
The drawbacks of TRUNCATE are that it can't be used on tables that are referenced by foreign keys and it won't fire any triggers. Also, you won't be able to roll back the data if anything goes wrong.
Note that TRUNCATE will also reset any auto-incrementing keys, if you are using those.
If you do not wish to lose your auto-incrementing keys, you can speed up the delete by deleting in sets (e.g., DELETE FROM table WHERE id > 1 AND id < 10000). It will speed things up significantly and in some cases prevent data from being locked up.
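A minimal sketch of the same idea as a batched loop on SQL Server, using DELETE TOP instead of explicit id ranges (the batch size is arbitrary; client_log is the table from the question):
WHILE 1 = 1
BEGIN
    DELETE TOP (10000) FROM client_log   -- remove one batch
    IF @@ROWCOUNT = 0 BREAK               -- stop when the table is empty
END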
Yes, well, deleting 5 million rows is probably going to take a long time. The only potentially faster way I can think of would be to drop the table, and re-create it. That only works, of course, if you want to delete ALL data in the table.
The suggestion of "Drop and recreate the table" is probably not a good one because that goofs up your foreign keys.
You ARE using foreign keys, right?
If you cannot use TRUNCATE TABLE because of foreign keys and/or triggers, you can consider the following:
drop all indexes;
do the usual DELETE;
re-create all indexes.
This may speed up DELETE somewhat.
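Sketched out with a made-up index and column name, that would look like:
DROP INDEX IX_client_log_date ON client_log                -- drop the nonclustered indexes first
DELETE FROM client_log                                      -- the delete itself
CREATE INDEX IX_client_log_date ON client_log (log_date)   -- then rebuild the indexes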
I am revising my earlier statement:
You should understand that by using TRUNCATE the data will be cleared but nothing will be logged to the transaction log. Writing to the log is why DELETE will take forever on 5 million rows. I use TRUNCATE often during development, but you should be wary about using it on a production database because you will not be able to roll back your changes. You should immediately make a full database backup after doing a TRUNCATE to establish a new basis for restoration.
The above statement was intended to prompt you to be sure that you understand there is a difference between the two. Unfortunately, it is poorly written and makes unsupported statements, as I have not actually done any testing myself between the two. It is based on statements that I have heard from others.
From MSDN:
The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row. TRUNCATE TABLE removes the data by deallocating the data pages used to store the table's data, and only the page deallocations are recorded in the transaction log.
I just wanted to say that there is a fundamental difference between the two, and because there is a difference there will be applications where one or the other is inappropriate.
DELETE FROM table_name;
Premature optimization may be dangerous. Optimizing may mean doing something weird, but if it works you may want to take advantage of it.
SELECT DbVendor_SuperFastDeleteAllFunction(tablename, BOZO_BIT) FROM dummy;
For speed I think it depends on...
The underlying database: Oracle, Microsoft, MySQL, PostgreSQL, others, custom...
The table, its content, and related tables:
There may be deletion rules. Is there an existing procedure to delete all content in the table? Can this be optimized for the specific underlying database engine? How much do we care about breaking things / related data? Performing a DELETE may be the 'safest' way, assuming that other related tables do not depend on this table. Are there other tables and queries that are related to or depend on the data within this table? If we don't care much about this table being around, using DROP might be a fast method, again depending on the underlying database.
DROP TABLE table_name;
How many rows are being deleted? Is there other information that is quickly gleaned that will optimize the deletion? For example, can we tell if the table is already empty? Can we tell if there are hundreds, thousands, millions, billions of rows?
