Suppose I execute a procedure that drops a table and then recreates it using SELECT INTO.
If that procedure raises an exception after dropping the table, does the drop still take effect or not?
Unless you wrap the statements in a transaction, the table will stay dropped, since each statement runs as its own implicit transaction.
Below are some tests.
create table t1
(
id int not null primary key
)
create table t11 (id int)  -- the table that will be dropped
drop table t11
insert into t1
select 1 union all select 1  -- raises a primary key violation
Table t11 will be dropped, even though the insert raises an exception.
One more example:
drop table orderstest
print 'dropped table'
waitfor delay '00:00:05'
select * into orderstest
from Orders
Now, after 2 seconds, kill the session; you can still see that orderstest has been dropped.
I checked with some statements other than SELECT INTO, and I don't see a reason why SELECT INTO would behave differently; this applies even if you wrap the statements in a stored proc.
If you want to roll back everything, use a transaction, or better still use SET XACT_ABORT ON.
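As a minimal sketch of that, using the same orderstest/Orders example as above, wrapping the work in a transaction with XACT_ABORT on makes the whole batch succeed or fail together:

SET XACT_ABORT ON  -- any run-time error aborts and rolls back the open transaction

BEGIN TRANSACTION

drop table orderstest

select * into orderstest
from Orders

COMMIT TRANSACTION

-- if the SELECT INTO raises an error, the DROP TABLE is rolled back as well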
Yes, the dropped table will be gone. I have had this issue when I script a new primary key. Depending on the table, it saves all the data to a table variable in memory, drops the table, creates a new one with the new pk, then loads the data. If the data violates the new pk, the statement fails and the table variable is dropped leaving me with a new table and no data.
My practice is to create the new table with a slightly different name, load the data, swap the two table names, and then, once all the data is confirmed loaded, drop the original table.
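A rough sketch of that swap, with hypothetical names (MyTable / MyTable_new); sp_rename only renames one object at a time, so the "swap" is two renames back to back:

-- new table with the revised structure (columns are placeholders)
CREATE TABLE dbo.MyTable_new (Id INT NOT NULL PRIMARY KEY, Data NVARCHAR(100) NULL)

-- load and verify the data
INSERT INTO dbo.MyTable_new (Id, Data)
SELECT Id, Data FROM dbo.MyTable

-- swap the names
EXEC sp_rename 'dbo.MyTable',     'MyTable_old'
EXEC sp_rename 'dbo.MyTable_new', 'MyTable'

-- once everything checks out, drop the original
DROP TABLE dbo.MyTable_old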
Suppose a table in SQL Server with this structure:
TABLE t (Id INT PRIMARY KEY)
Then I have a stored procedure, which is constantly being called, that among other things inserts data into this table:
BEGIN TRAN
DECLARE @Id INT = (SELECT MAX(Id) + 1 FROM t)
INSERT t VALUES (@Id)
...
-- Stuff that takes a long time to complete
...
COMMIT
The problem with this approach is that sometimes I get a primary key violation, because two or more procedure calls read and try to insert the same Id into the table.
I have been able to solve this by adding a TABLOCK hint to the SELECT:
DECLARE @Id INT = (SELECT MAX(Id) + 1 FROM t WITH (TABLOCK))
The problem now is that successive calls to the procedure must wait for the currently executing transaction to complete before starting their work, allowing only one call to run at a time.
Is there any advice or trick to hold the lock only during the execution of the SELECT and INSERT statements?
Thanks.
TABLOCK is a terrible idea, since you're serialising all the calls (no concurrency).
Note that with an SP you will retain all the locks granted over the run until the SP completes.
So you want to minimise locks except for where you really need them.
Unless you have a special case, use an internally generated id:
CREATE TABLE t (Id INT IDENTITY PRIMARY KEY)
Improved performance, concurrency etc. since you are not dependent on external tables to manage the id.
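As a sketch of how the proc body might then look (assuming the only thing you need back is the generated Id; SCOPE_IDENTITY() returns the value generated by the last insert in the current scope):

BEGIN TRAN

-- the IDENTITY column generates the value; no MAX(Id) + 1 read is needed
INSERT t DEFAULT VALUES

DECLARE @Id INT = SCOPE_IDENTITY()

...
-- Stuff that takes a long time to complete
...

COMMIT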
If you have existing data you can (re)set the start value using DBCC
DBCC CHECKIDENT ('t', RESEED, 100)
If you need to inject rows with a value preassigned, use:
SET IDENTITY_INSERT t ON
(and off again afterwards, resetting the seed as required).
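For example (a sketch against the same table t; the explicit value 500 is arbitrary):

SET IDENTITY_INSERT t ON

INSERT t (Id) VALUES (500)  -- an explicit column list is required while IDENTITY_INSERT is on

SET IDENTITY_INSERT t OFF

-- reseed so the next generated value follows on from the highest existing Id
DBCC CHECKIDENT ('t', RESEED)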
[Consider whether you want this value to be the primary key, or simply unique.
In many cases where you need to reference a table's PK as an FK you'll want it as the PK for simplicity of joins, but having a business-readable value (e.g. Accounting Code, or OrderNo+OrderLine) is completely valid: that's just modelling.]
I have this trigger in SQL Server:
CREATE TRIGGER MyTrigger ON [dbo].[practiseduplicates]
AFTER INSERT
AS
IF EXISTS (SELECT *
FROM [practiseduplicates] t
INNER JOIN inserted i ON i.[money] = t.[money]
AND i.[Name] = t.[Name]
AND i.[year month] = t.[year month])
BEGIN
ROLLBACK
RAISERROR ('Duplicated Data', 16, 1);
END
I then insert these values (which are already in the data table):
insert into [practiseduplicates]
values ('2017-02', 'buzzlightyear', '10.09')
When I click execute I expected the error message to pop up... which it did, however when I change the values to information that I know is not in the data table
e.g.
'2056-12', 'mr potato head', '12345.09'
The error message still pops up, when in actual fact it should have just inserted the data into the table. Does anyone know why this is the case?
I suspect it's to do with my inner join statement, but I am not sure.
Quoted from your question
When I click execute I expected the error message to pop up... which it did, however when I change the values to information that I know is not in the data table
That italic part of the statement is not quite accurate, because even if those values were not in the table before, after you run the insert they are in there, and the trigger will fire.
In short, you are creating an AFTER INSERT trigger and checking whether the data inserted into the table is already in the table (after the insert has run). Of course the trigger will fire every time, because if the data is in the inserted pseudo-table, it is in the table (it was just inserted).
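If you want to stay with a trigger, one way around that (a sketch, reusing the column names from the question) is to count the matching rows and only roll back when there is more than one, since each inserted row always matches itself:

ALTER TRIGGER MyTrigger ON [dbo].[practiseduplicates]
AFTER INSERT
AS
IF EXISTS (SELECT 1
           FROM [practiseduplicates] t
           INNER JOIN inserted i ON i.[money] = t.[money]
                                AND i.[Name] = t.[Name]
                                AND i.[year month] = t.[year month]
           GROUP BY i.[money], i.[Name], i.[year month]
           HAVING COUNT(*) > 1)  -- each inserted row matches itself once; more than one match means a duplicate
BEGIN
    ROLLBACK
    RAISERROR ('Duplicated Data', 16, 1);
END

That said, the unique constraint described below is the simpler and more reliable fix.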
So basically I used a constraint to solve this rather than a trigger statement.
ALTER TABLE [dbo].[practiseduplicates]
ADD CONSTRAINT [constraintforduplicates] UNIQUE NONCLUSTERED
( [column name],
[column name], etc etc
)
This stops any duplicates: the columns you list in the constraint are the ones compared, and an insert that would duplicate them is rejected instead of being written to the table.
Note that the index key behind the constraint is limited to 900 bytes, so e.g. if you had a varchar(max) column, the constraint could not be created. In my script I used varchar(800).
When using temp tables in SQL Server stored procs, is the preferred practice to:
1) Create the temp table, populate it, use it, then drop it
CREATE TABLE #MyTable ( ... )
-- Do stuff
DROP TABLE #MyTable
2) Check if it exists, drop it if it does, then create and use it
IF object_id('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable ( ... )
3) Create it and let SQL Server clean it up when it goes out of scope
CREATE TABLE #MyTable ( ... )
-- Do Stuff
I read in this answer and its associated comments that this can be useful in situations where the temp table is reused: SQL Server will truncate the table but keep the structure, to save time.
My stored proc is likely to be called pretty frequently, but it only contains a few columns, so I don't know how advantageous this really is in my situation.
You could test and see if one method outperforms another in your scenario. I've heard about this reuse benefit but I haven't performed any extensive tests myself. (My gut instinct is to explicitly drop any #temp objects I've created.)
In a single stored procedure you should never have to check if the table exists - unless it is also possible that the procedure is being called from another procedure that might have created a table with the same name. This is why it is good practice to name #temp tables meaningfully instead of using #t, #x, #y etc.
I follow this approach:
IF object_id('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable ( ... )
-- Do stuff
IF object_id('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
Reason: if an error occurs in the sproc and the temp table it created is not dropped, then the next call will fail on CREATE TABLE with an error that the table already exists, and the sproc will never execute successfully until that leftover table is dropped. So always check for the existence of an object before creating it.
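As an aside (not part of the original answer): on SQL Server 2016 and later the existence check can be written more compactly:

DROP TABLE IF EXISTS #MyTable  -- does nothing if the table is not there
CREATE TABLE #MyTable ( ... )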
When using temp tables my preferred practice is actually a combination of 1 and 2.
IF object_id('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable ( ... )
-- Do stuff
IF object_id('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
To add a NOT NULL column to a table with many records, a DEFAULT constraint needs to be applied. This constraint causes the entire ALTER TABLE command to take a long time to run if the table is very large. This is because:
Assumptions:
The DEFAULT constraint modifies existing records. This means that the db needs to increase the size of each record, which causes it to shift records on full data-pages to other data-pages and that takes time.
The DEFAULT update executes as an atomic transaction. This means that the transaction log will need to be grown so that a roll-back can be executed if necessary.
The transaction log keeps track of the entire record. Therefore, even though only a single field is modified, the space needed by the log will be based on the size of the entire record multiplied by the # of existing records. This means that adding a column to a table with small records will be faster than adding a column to a table with large records, even if the total # of records is the same for both tables.
Possible solutions:
Suck it up and wait for the process to complete. Just make sure to set the timeout period to be very long. The problem with this is that it may take hours or days to do depending on the # of records.
Add the column but allow NULL. Afterward, run an UPDATE query to set the DEFAULT value for existing rows. Do not update every row in a single statement; update batches of records at a time or you'll end up with the same problem as solution #1. The problem with this approach is that you end up with a column that allows NULL when you know that this is an unnecessary option. I believe there are some best-practice documents out there that say you should not have columns that allow NULL unless it's necessary.
Create a new table with the same schema. Add the column to that schema. Transfer the data over from the original table. Drop the original table and rename the new table. I'm not certain how this is any better than #1.
Questions:
Are my assumptions correct?
Are these my only solutions? If so, which one is the best? If not, what else could I do?
I ran into this problem for my work also, and my solution is along the lines of #2.
Here are my steps (I am using SQL Server 2005):
1) Add the column to the table with a default value:
ALTER TABLE MyTable ADD MyColumn varchar(40) DEFAULT('')
2) Add a NOT NULL constraint with the NOCHECK option. The NOCHECK does not enforce on existing values:
ALTER TABLE MyTable WITH NOCHECK
ADD CONSTRAINT MyColumn_NOTNULL CHECK (MyColumn IS NOT NULL)
3) Update the values incrementally in table:
GO
UPDATE TOP(3000) MyTable SET MyColumn = '' WHERE MyColumn IS NULL
GO 1000
The update statement will only update a maximum of 3000 records at a time. This allows saving a chunk of data at a time. I have to use "MyColumn IS NULL" because my table does not have a sequential primary key.
GO 1000 will execute the previous statement 1000 times. This will update 3 million records; if you need more, just increase this number. It will continue to execute until SQL Server returns 0 records for the UPDATE statement.
Here's what I would try:
Do a full backup of the database.
Add the new column, allowing nulls - don't set a default.
Set SIMPLE recovery, which truncates the tran log as soon as each batch is committed.
The SQL is: ALTER DATABASE XXX SET RECOVERY SIMPLE
Run the update in batches as you discussed above, committing after each one.
Reset the new column to no longer allow nulls.
Go back to the normal FULL recovery.
The SQL is: ALTER DATABASE XXX SET RECOVERY FULL
Backup the database again.
The use of the SIMPLE recovery model doesn't stop logging, but it significantly reduces its impact. This is because the server discards the recovery information after every commit.
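Putting those steps together as a rough sketch (the table, column, default value of 0, batch size, and database name XXX are all placeholders for whatever applies in your case):

ALTER DATABASE XXX SET RECOVERY SIMPLE

-- add the column as nullable, with no default
ALTER TABLE MyTable ADD MyNewColumn INT NULL

-- backfill in batches so each commit stays small
DECLARE @rows INT = 1
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) MyTable
    SET MyNewColumn = 0
    WHERE MyNewColumn IS NULL

    SET @rows = @@ROWCOUNT
END

-- now that every row has a value, disallow NULLs
ALTER TABLE MyTable ALTER COLUMN MyNewColumn INT NOT NULL

ALTER DATABASE XXX SET RECOVERY FULL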
You could:
Start a transaction.
Grab a write lock on your original table so no one writes to it.
Create a shadow table with the new schema.
Transfer all the data from the original table.
Execute sp_rename to rename the old table out.
Execute sp_rename to rename the new table in.
Finally, you commit the transaction.
The advantage of this approach is that your readers will be able to access the table during the long process and that you can perform any kind of schema change in the background.
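A rough sketch of that sequence (all names are placeholders; TABLOCK with HOLDLOCK takes a shared table lock held for the transaction, which keeps writers out while still allowing readers):

BEGIN TRANSACTION

-- hold a table-level shared lock so nothing can write to the original
SELECT COUNT(*) FROM dbo.MyTable WITH (TABLOCK, HOLDLOCK)

-- shadow table with the new schema (hypothetical columns)
CREATE TABLE dbo.MyTable_shadow
(
    Id INT NOT NULL PRIMARY KEY,
    MyNewColumn INT NOT NULL DEFAULT (0)
)

-- copy the data across, filling in the new column
INSERT INTO dbo.MyTable_shadow (Id, MyNewColumn)
SELECT Id, 0 FROM dbo.MyTable

-- rename the old table out and the new one in
EXEC sp_rename 'dbo.MyTable',        'MyTable_old'
EXEC sp_rename 'dbo.MyTable_shadow', 'MyTable'

COMMIT TRANSACTION

-- dbo.MyTable_old can be dropped once everything checks out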
Just to update this with the latest information.
In SQL Server 2012 this can now be carried out as an online operation in the following circumstances:
Enterprise Edition only
The default must be a runtime constant
For the second requirement examples might be a literal constant or a function such as GETDATE() that evaluates to the same value for all rows. A default of NEWID() would not qualify and would still end up updating all rows there and then.
For defaults that qualify SQL Server evaluates them and stores the result as the default value in the column metadata so this is independent of the default constraint which is created (which can even be dropped if no longer required). This is viewable in sys.system_internals_partition_columns. The value doesn't get written out to the rows until next time they happen to get updated.
More details about this here: online non-null with values column add in sql server 2012
Admittedly, this is an old question. My colleague recently told me that he was able to do it in one single ALTER TABLE statement on a table with 13.6M rows. It finished within a second on SQL Server 2012. I was able to confirm the same on a table with 8M rows. Did something change in a later version of SQL Server?
Alter table mytable add mycolumn char(1) not null default('N');
I think this depends on the SQL flavor you are using, but what if you took option 2, and at the very end altered the column to NOT NULL with the default value?
Would it be fast, since it sees all the values are not null?
If you want the column in the same table, you'll just have to do it. Now, option 3 is potentially the best for this because you can still have the database "live" while this operation is going on. If you use option 1, the table is locked while the operation happens and then you're really stuck.
If you don't really care whether the column is in the same table, then I suppose a segmented approach is the next best. Though I really try to avoid that (to the point that I don't do it), because then, like Charles Bretana says, you'll have to make sure to find all the places that update/insert that table and modify those. Ugh!
I had a similar problem, and went for your option #2.
It takes 20 minutes this way, as opposed to 32 hours the other way!!! Huge difference, thanks for the tip.
I wrote a full blog entry about it, but here's the important sql:
Alter table MyTable
Add MyNewColumn char(10) null default '?';
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 0 and 1000000
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 1000000 and 2000000
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 2000000 and 3000000
go
..etc..
Alter table MyTable
Alter column MyNewColumn char(10) not null;
And the blog entry if you're interested:
http://splinter.com.au/adding-a-column-to-a-massive-sql-server-table
I had a similar problem and went with a modified #3 approach. In my case the database was in SIMPLE recovery mode and the table to which the column was to be added was not referenced by any FK constraints.
Instead of creating a new table with the same schema and copying the contents of the original table, I used the SELECT...INTO syntax.
According to Microsoft (http://technet.microsoft.com/en-us/library/ms188029(v=sql.105).aspx)
The amount of logging for SELECT...INTO depends on the recovery model in effect for the database. Under the simple recovery model or bulk-logged recovery model, bulk operations are minimally logged. With minimal logging, using the SELECT...INTO statement can be more efficient than creating a table and then populating the table with an INSERT statement. For more information, see Operations That Can Be Minimally Logged.
The sequence of steps:
1. Move data from the old table to the new one while adding the new column with its default:
SELECT table.*, CAST('default' AS nvarchar(256)) AS new_column
INTO table_copy
FROM table
2. Drop the old table:
DROP TABLE table
3. Rename the newly created table:
EXEC sp_rename 'table_copy', 'table'
4. Create the necessary constraints and indexes on the new table.
In my case the table had more than 100 million rows and this approach completed faster than approach #2 and log space growth was minimal.
1) Add the column to the table with a default value:
ALTER TABLE MyTable ADD MyColumn int default 0
2) Update the values incrementally in the table (same effect as accepted answer). Adjust the number of records being updated to your environment, to avoid blocking other users/processes.
declare @rowcount int = 1
while (@rowcount > 0)
begin
    UPDATE TOP(10000) MyTable SET MyColumn = 0 WHERE MyColumn IS NULL
    set @rowcount = @@ROWCOUNT
end
3) Alter the column definition to require not null. Run the following at a moment when the table is not in use (or schedule a few minutes of downtime). I have successfully used this for tables with millions of records.
ALTER TABLE MyTable ALTER COLUMN MyColumn int NOT NULL
I would use a CURSOR instead of a single UPDATE. The cursor updates the matching records one at a time -- it takes time but does not lock the table.
If you want to avoid locks, use WAITFOR.
Also, I am not sure that a DEFAULT constraint changes existing rows.
Probably it is the NOT NULL constraint used together with DEFAULT that causes the case described by the author.
If it does, add the NOT NULL constraint at the end.
So the pseudocode will look like this:
-- without the NOT NULL constraint -- we will add it at the end
ALTER TABLE table ADD new_column INT DEFAULT 0

DECLARE fillNullColumn CURSOR LOCAL FAST_FORWARD FOR
    SELECT
        key
    FROM
        table WITH (NOLOCK)
    WHERE
        new_column IS NULL

OPEN fillNullColumn

DECLARE
    @key INT

FETCH NEXT FROM fillNullColumn INTO @key

WHILE @@FETCH_STATUS = 0 BEGIN
    UPDATE
        table WITH (ROWLOCK)
    SET
        new_column = 0 -- default value
    WHERE
        key = @key

    WAITFOR DELAY '00:00:05' -- wait 5 seconds; keep in mind this means updating only 12 rows per minute

    FETCH NEXT FROM fillNullColumn INTO @key
END

CLOSE fillNullColumn
DEALLOCATE fillNullColumn

-- finally add the NOT NULL constraint
ALTER TABLE table ALTER COLUMN new_column INT NOT NULL
There may still be some syntax errors, but I hope this helps to solve your problem.
Good luck!
Vertically segment the table. This means you will have two tables, with the same primary key and exactly the same number of records... One will be the one you already have; the other will have just the key and the new non-null column (with its default value).
Modify all insert, update, and delete code so it keeps the two tables in sync... If you want, you can create a view that "joins" the two tables together, producing a single logical combination of the two that appears like one table to client SELECT statements...
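A sketch of that layout and the covering view (all names here are hypothetical; the first table stands in for the one you already have):

-- stand-in for the existing table
CREATE TABLE dbo.Orders (OrderId INT NOT NULL PRIMARY KEY, Amount MONEY NULL)
GO

-- side table holding only the key and the new non-null column
CREATE TABLE dbo.Orders_Extra
(
    OrderId     INT NOT NULL PRIMARY KEY REFERENCES dbo.Orders (OrderId),
    MyNewColumn INT NOT NULL DEFAULT (0)
)
GO

-- view that presents the two tables as one for client SELECTs
CREATE VIEW dbo.Orders_v
AS
SELECT o.*, e.MyNewColumn
FROM dbo.Orders AS o
JOIN dbo.Orders_Extra AS e ON e.OrderId = o.OrderId
GO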
Is there an easy way to remove an identity from a table in SQL Server 2005?
When I use Management Studio, it generates a script that creates a mirror table without the identity, copies the data, drops the table, then renames the mirror table, etc. This script has 5231 lines in it because this table/column has many FK relations.
I'd feel much more comfortable running a simple alter/drop. Any ideas?
EDIT
I think I'm just going to go with the 5,231-line script from Enterprise Manager. However, I'm going to break it up into smaller parts which I can run and control better. This table "behaves" strangely: if you try to delete 1 row (even one you just inserted, which is not in any other FK table), you get this error:
delete MyTable where MyPrimaryKey=1234
Msg 8621, Level 17, State 2, Line 1
The query processor ran out of stack space during query optimization. Please simplify the query.
No doubt, all the FKs. We will halt all access to our application and run in single user mode when we make these schema and related application changes. However, we need this to run fast, and I need an idea of how long it will take. I guess that I'll just have to test, test, test.
If you are on SQL Server 2005 or later, you can do this as a simple metadata change (NB: doesn't require an edition supporting partitioning as I originally stated).
Example code pilfered shamelessly from the workaround by Paul White on this Microsoft Connect Item.
USE tempdb;
GO
-- A table with an identity column
CREATE TABLE dbo.Source
(row_id INTEGER IDENTITY PRIMARY KEY NOT NULL, data SQL_VARIANT NULL);
GO
-- Some sample data
INSERT dbo.Source (data)
VALUES (CONVERT(SQL_VARIANT, 4)),
(CONVERT(SQL_VARIANT, 'X')),
(CONVERT(SQL_VARIANT, {d '2009-11-07'})),
(CONVERT(SQL_VARIANT, N'áéíóú'));
GO
-- Remove the identity property
BEGIN TRY;
-- All or nothing
BEGIN TRANSACTION;
-- A table with the same structure as the one with the identity column,
-- but without the identity property
CREATE TABLE dbo.Destination
(row_id INTEGER PRIMARY KEY NOT NULL, data SQL_VARIANT NULL);
-- Metadata switch
ALTER TABLE dbo.Source SWITCH TO dbo.Destination;
-- Drop the old object, which now contains no data
DROP TABLE dbo.Source;
-- Rename the new object to make it look like the old one
EXECUTE sp_rename N'dbo.Destination', N'Source', 'OBJECT';
-- Success
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
-- Bugger!
IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
PRINT ERROR_MESSAGE();
END CATCH;
GO
-- Test that the identity property has indeed gone
INSERT dbo.Source (row_id, data)
VALUES (5, CONVERT(SQL_VARIANT, N'This works!'))
SELECT row_id,
data
FROM dbo.Source;
GO
-- Tidy up
DROP TABLE dbo.Source;
I don't believe you can directly drop the IDENTITY part of the column. Your best bet is probably to:
add another non-identity column to the table
copy the identity values to that column
drop the original identity column
rename the new column to replace the original column
If the identity column is part of a key or other constraint, you will need to drop those constraints and re-create them after the above operations are complete.
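A minimal sketch of those steps, assuming a hypothetical table dbo.MyTable whose identity column is called Id (any keys or constraints on Id would be dropped first and re-created at the end):

-- 1. add a non-identity column
ALTER TABLE dbo.MyTable ADD Id_new INT NULL
GO

-- 2. copy the identity values to the new column
UPDATE dbo.MyTable SET Id_new = Id
GO

-- 3. drop the original identity column
ALTER TABLE dbo.MyTable DROP COLUMN Id
GO

-- 4. rename the new column to replace the original
EXEC sp_rename 'dbo.MyTable.Id_new', 'Id', 'COLUMN'
GO

-- re-create the NOT NULL / key constraints on Id as needed
ALTER TABLE dbo.MyTable ALTER COLUMN Id INT NOT NULL
ALTER TABLE dbo.MyTable ADD CONSTRAINT PK_MyTable PRIMARY KEY (Id)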
You could add a column to the table that is not an identity column, copy the data, drop the original column, and rename the new column to the old column and recreate the indexes.
Here is a link that shows an example. Still not a simple alter, but it is certainly better than 5231 lines.