In SQL Server Data Tools (SSDT) you have the deployment option "Block incremental deployment if data loss might occur", which I'd wager is a best practice to keep checked.
Let's say we have a table foo, and a column bar which is now redundant: it has no dependencies, foreign keys etc., and we have already removed references to this column in our data layer and stored procedures as it's simply not used. In other words, we are satisfied that dropping this column will have no adverse effects.
There are a couple of flies in the ointment:
The column has data in it
The database is published to hundreds of distributed clients, and it could take months for the change to ripple out to all clients
As the column is populated, publishing will fail unless we change the "Block incremental deployment if data loss might occur" option. This option is at the database level, not table level however, and so due to the distributed nature of the clients, we'd have to turn off the "data loss" option for months before all databases were updated, and turn it back on once all clients have updated (our databases have version numbers set by our build).
You may think we could solve this with a pre-deployment script such as
if exists (select * from information_schema.columns where table_name = 'foo' and column_name = 'bar') BEGIN
alter table foo drop constraint DF_foo_bar
alter table foo drop column bar
END
But again this fails unless we turn the "data loss could occur" option off.
I'm simply interested as to what others have done in this scenario as I'd like to have granularity which doesn't currently seem possible.
So I've been accomplishing this task via the following steps:
1) Since we are going to make table #Foo, make sure to drop that table before moving forward if it exists.
2) In a pre-deployment script: If the column exists, create a temporary table #Foo and select all rows from Foo into #Foo.
3) Remove the column from #Foo
4) Delete all rows in Foo (now there will be no data loss since no data exists)
5) In a post-deployment script: If #Foo exists, select all rows from #Foo into Foo
6) Drop table #Foo
And code:
pre-deployment script
if(Object_ID('TempDB..#Foo') is not null)
begin
drop table #Foo
end
if exists (
select *
from sys.columns
where Name = 'Bar'
and Object_ID = Object_ID('Foo')
)
begin
select * into #Foo
from Foo
alter table #Foo drop column Bar
-- Now that we've made a complete backup of Foo, we can delete all its data
delete Foo
end
post-deployment script
if(Object_ID('TempDB..#Foo') is not null)
begin
insert into Foo
select * from #Foo
drop table #Foo
end
Caveat: Depending on your environment, it might be wiser to depend on versions rather than column & temp table existence in your conditionals
The PreDeployment script doesn't work the way you are hoping to use it because of the order of operations for SSDT:
Schema Comparison
Script generation for schema difference
Execute PreDeployment
Execute generated script
Execute PostDeployment.
So of course, the schema difference is identified as part of #2 and appropriate SQL is generated to drop the column (including the check to block on data loss), before your manual pre-deployment script can 'get rid of it'.
If you take a look at the script generated behind the scenes to detect (and therefore block) on possible data loss, it checks to see if there are any rows by running something along the lines of this:
IF EXISTS (select top 1 1 from [dbo].[Table]) RAISERROR ('Rows were detected. The schema update is terminating because data loss might occur.', 16, 127)
This means the simple existence of rows will stop the column being dropped. We haven't found any way around this other than manually dealing with the problem outside (and before) the SSDT deployment, using conditional deployment steps based on version numbers.
You mention distributed clients, which implies you have some sort of automated publication/update mechanism. You also mention version numbers as part of the database - could you include in your deploy (before the sqlpackage.exe command I assume you are running) a manual SQL script? This is akin to what we do (ours is in Powershell, but you get the gist):
IF VersionNumber < 2.8
BEGIN
ALTER TABLE X DROP COLUMN Y
END
Disclaimer: in no way is that valid SQL, it's simply pseudo code to imply an idea!
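To make the idea concrete, here is a minimal T-SQL sketch of such a version-gated step, meant to run before the sqlpackage.exe publish. The dbo.SchemaVersion table and its Major/Minor columns are hypothetical stand-ins for wherever your build stamps the version; foo and bar are the table and column from the question. Treat it as a starting point rather than a drop-in script.

```sql
-- Assumption: dbo.SchemaVersion(Major int, Minor int) is a hypothetical
-- single-row table holding the version number set by the build.
IF EXISTS (SELECT 1
           FROM dbo.SchemaVersion
           WHERE Major < 2 OR (Major = 2 AND Minor < 8))
   AND COL_LENGTH('dbo.foo', 'bar') IS NOT NULL
BEGIN
    -- A default constraint on the column must be dropped first.
    -- Look it up by column so the constraint name does not matter.
    DECLARE @sql nvarchar(max);

    SELECT @sql = N'ALTER TABLE dbo.foo DROP CONSTRAINT '
                  + QUOTENAME(dc.name) + N';'
    FROM sys.default_constraints AS dc
    JOIN sys.columns AS c
      ON c.object_id = dc.parent_object_id
     AND c.column_id = dc.parent_column_id
    WHERE dc.parent_object_id = OBJECT_ID(N'dbo.foo')
      AND c.name = N'bar';

    IF @sql IS NOT NULL
        EXEC sys.sp_executesql @sql;

    -- By the time SSDT compares schemas, the column is already gone,
    -- so no data-loss check is generated for it.
    ALTER TABLE dbo.foo DROP COLUMN bar;
END
```

The COL_LENGTH check keeps the script safe to re-run against clients that have already been upgraded.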
I have seen this question many times, but I have not found an answer that satisfies me. Basically, what people and books say is "Although temporary tables are deleted when they go out of scope, you should explicitly delete them when they are no longer needed to reduce resource requirements on the server".
It is quite clear to me that when you are working in Management Studio and creating tables, you will use some resources for each table until you close your window or disconnect, so it is logical that it is better to drop them.
But when you work with a procedure, if you want to clean up tables you will most probably do that at the very end of it (I am not talking about the situation where you drop the table as soon as you no longer need it in the procedure). So the workflow is something like this:
When you drop in SP:
Start of SP execution
Doing some stuff
Drop tables
End of execution
And, as far as I understand, this is how it works when you do not drop:
Start of SP execution
Doing some stuff
End of execution
Drop tables
What's the difference here? I can only imagine that some resources are needed to identify the temporary tables. Any other thoughts?
UPDATE:
I ran simple test with 2 SP:
create procedure test as
begin
create table #temp (a int)
insert into #temp values (1);
drop table #temp;
end
and another one without drop statements. I've enabled user statistics and ran the tests:
declare @i int = 0;
while @i < 10000
begin
exec test;
SET @i = @i + 1;
end
That's what I got (trials 1-3 drop the table in the SP, trials 4-6 do not):
The picture shows that all stats are the same or slightly lower when I do not drop the temporary table.
UPDATE2:
I ran this test a second time, now with 100k calls and with SET NOCOUNT ON added. These are the results:
The second run confirmed that if you do not drop the table in the SP, you actually save some user time: the drop is still done by an internal process, but outside of the user time.
You can read more about it in this article by Paul White: Temporary Tables in Stored Procedures
CREATE and DROP, Don’t
I will talk about this in much more detail in my next post, but the
key point is that CREATE TABLE and DROP TABLE do not create and drop
temporary tables in a stored procedure, if the temporary object can be
cached. The temporary object is renamed to an internal form when DROP
TABLE is executed, and renamed back to the same user-visible name when
CREATE TABLE is encountered on the next execution. In addition, any
statistics that were auto-created on the temporary table are also
cached. This means that statistics from a previous execution remain
when the procedure is next called.
Technically, a locally scoped temp table (one with a single hashtag before it) will automatically drop out of scope after your SPID is closed. There are some very odd cases where you get a temp table definition cached somewhere and then no real way to remove it. Usually that happens when you have a stored procedure call which is nested and contains a temp table by the same name.
It's good habit to get into dropping your tables when you're done with them but unless something unexpected happens, they should be de-scoped anyway once the proc finishes.
In MS SQL Server, I'm using a global temp table to store session related information passed by the client and then I use that information inside triggers.
Since the same global temp table can be used in different sessions and it may or may not exist when I want to write into it (depending on whether all the previous sessions which used it before are closed), I'm doing a check for the global temp table existence based on which I create before I write into it.
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
session_id smallint,
login_time datetime,
HstryUserName VDT_USERNAME,
HstryTaskName VDT_TASKNAME,
)
MERGE ##VTT_CONTEXT_INFO_USER_TASK As target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) as source (session_id, HstryUserName, HstryTaskName)
ON (target.session_id = source.session_id)
WHEN MATCHED THEN
UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
The problem is that between my check for the table existence and the MERGE statement, SQL Server may drop the temp table if all the sessions which were using it before happen to close at that exact instant (this actually happened in my tests).
Is there a best practice for avoiding this kind of concurrency issue, so that a table is not dropped between the check for its existence and its subsequent use?
The notion of "global temporary table" and "trigger" just do not click. Tables are permanent data stores, as are their attributes -- including triggers. Temporary tables are dropped when the server is re-started. Why would anyone design a system where a permanent block of code (trigger) depends on a temporary shared storage mechanism? It seems like a recipe for failure.
Instead of a global temporary table, use a real table. If you like, put a helpful prefix such as temp_ in front of the name. If the table is being shared by databases, then put it in a database where all code has access.
Create the table once and leave it there (deleting the rows is fine) so the trigger code can access it.
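As a rough sketch of that suggestion, the following creates a permanent table once and then upserts the current session's row. The table name dbo.temp_ContextInfoUserTask is hypothetical, nvarchar(128) stands in for the user-defined types VDT_USERNAME and VDT_TASKNAME (whose definitions are not shown in the question), and the variable values are placeholders for what the client would pass.

```sql
-- One-time setup (normally shipped with the schema, not run per session).
IF OBJECT_ID(N'dbo.temp_ContextInfoUserTask', N'U') IS NULL
    CREATE TABLE dbo.temp_ContextInfoUserTask (
        session_id    smallint      NOT NULL PRIMARY KEY,
        login_time    datetime      NOT NULL,
        HstryUserName nvarchar(128) NOT NULL,  -- assumed type
        HstryTaskName nvarchar(128) NOT NULL   -- assumed type
    );

-- Per-session upsert; placeholder values for illustration only.
DECLARE @HstryUserName nvarchar(128) = N'some_user',
        @HstryTaskName nvarchar(128) = N'some_task',
        @LoginTime     datetime;

-- login_time of the current session, so triggers can tell a reused
-- session_id apart from the session that originally wrote the row.
SELECT @LoginTime = login_time
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;

MERGE dbo.temp_ContextInfoUserTask AS target
USING (SELECT @@SPID, @LoginTime, @HstryUserName, @HstryTaskName)
      AS source (session_id, login_time, HstryUserName, HstryTaskName)
ON target.session_id = source.session_id
WHEN MATCHED THEN
    UPDATE SET login_time    = source.login_time,
               HstryUserName = source.HstryUserName,
               HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
    INSERT (session_id, login_time, HstryUserName, HstryTaskName)
    VALUES (source.session_id, source.login_time,
            source.HstryUserName, source.HstryTaskName);
```

Because the table is permanent, there is no window in which it can disappear between the existence check and the MERGE; stale rows from closed sessions would need their own cleanup.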
I'll start by saying that, on the long term, I will follow Gordon's advice, i.e. I will take the necessary steps to introduce a normal table in the database to store client application information which needs to be accessible in the triggers.
But since this was not really possible now because of time constraints (it takes weeks to get the necessary formal approvals for a new normal table), I came up with a solution for preventing SQL Server from dropping the global temp table between the check for its existence and the MERGE statement.
There is some information out there about when a global temp table is dropped by SQL Server; my personal tests showed that SQL Server drops a global temp table the moment the session which created it is closed and any other transactions started in other sessions which changed data in that table are finished.
My solution was to fake data changes on the global temp table even before I check for its existence. If the table exists at that moment, SQL Server will then know that it needs to keep it until the current transaction finishes, and it cannot be dropped anymore after the check for its existence. The code looks now like this (properly commented, since it is kind of a hack):
-- Faking a delete on the table ensures that SQL Server will keep the table until the end of the transaction
-- Since ##VTT_CONTEXT_INFO_USER_TASK may actually not exist, we need to fake the delete inside TRY .. CATCH
-- FUTURE 2016, Feb 03: A cleaner solution would use a real table instead of a global temp table.
BEGIN TRY
-- Schema errors are detected at compile time and cannot be caught by TRY in the same batch, so the query is wrapped in sp_executesql
DECLARE @QueryText NVARCHAR(100) = 'DELETE ##VTT_CONTEXT_INFO_USER_TASK WHERE 0 = 1'
EXEC sp_executesql @QueryText
END TRY
BEGIN CATCH
-- nothing to do here (see comment above)
END CATCH
IF OBJECT_ID('tempdb..##VTT_CONTEXT_INFO_USER_TASK') IS NULL
CREATE TABLE ##VTT_CONTEXT_INFO_USER_TASK (
session_id smallint,
login_time datetime,
HstryUserName VDT_USERNAME,
HstryTaskName VDT_TASKNAME,
)
MERGE ##VTT_CONTEXT_INFO_USER_TASK As target
USING (SELECT @@SPID, @HstryUserName, @HstryTaskName) as source (session_id, HstryUserName, HstryTaskName)
ON (target.session_id = source.session_id)
WHEN MATCHED THEN
UPDATE SET HstryUserName = source.HstryUserName, HstryTaskName = source.HstryTaskName
WHEN NOT MATCHED THEN
INSERT VALUES (@@SPID, @LoginTime, source.HstryUserName, source.HstryTaskName);
Although I would call it a "use it at your own risk" solution, it does prevent the use of the global temp table in other sessions from affecting its use in the current one, which was the concern that made me start this thread.
Thanks all for your time! (from text formatting edits to replies)
A new column has been added to a table, but not at the end of the table definition (as the rightmost column); it was added in the middle of the table.
When I try to commit this in Redgate SQL Source Control, I get the warning "These changes may result in data loss"
Will data loss really occur?
Is there a way to preview the change script to confirm that no data will be lost?
Can I copy the script and easily turn it into a Migrations V2 script?
Will I just have to
Edit the table in SSMS and move the new column to the end
or write a migration script?
If so, are there any handy tools to do the repetitive stuff?
Up front disclosure that I work for Red Gate on SQL Source Control.
That change will need to re-create the table. By default SSMS won't let you save such a change, so that option must have been disabled in SSMS. It's under Tools->Options->Designers->Table and Database Designers->Prevent saving changes that require table re-creation.
Given that the option is disabled, SQL Source Control has picked the change up as a potential data loss situation and prompted you to see if you want to add a migration script.
If other developers within your team pull this change in through a get latest, then SQL Source Control will let them know about any potential data loss, with more details, depending on the current state of their local database. If the only change is adding columns to an existing table then this will not drop the data in columns that are unchanged.
If you are deploying to another DB (e.g. staging/UAT/prod) and you have SQL Compare you can use that to see exactly what will be applied to a DB if you try and run this against another non-local database. Choose the create deployment script option and you can sanity check the SQL before running.
As you say adding the column to the end of the table will avoid the need for the rebuild, so is probably the simplest way to avoid this if you don't need to worry about where the column is.
Alternatively you can add a migration script to:
Create a new table with the new structure using a temp name
Copy the existing data to the temp table
Drop the existing table
Rename the new temp table to the original name
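The four steps above might look roughly like this in T-SQL. The table dbo.MyTable, its columns Id and Name, and the new column NewColumn are all hypothetical; a real script would also need to carry over indexes, constraints, permissions and any IDENTITY handling that the original table has.

```sql
SET XACT_ABORT ON;   -- any error rolls back the whole transaction
BEGIN TRANSACTION;

-- 1. Create a new table with the new structure under a temporary name.
CREATE TABLE dbo.Tmp_MyTable (
    Id        int           NOT NULL PRIMARY KEY,
    NewColumn nvarchar(100) NULL,      -- the column added in the middle
    Name      nvarchar(50)  NOT NULL
);

-- 2. Copy the existing data; NewColumn is left NULL for existing rows.
INSERT INTO dbo.Tmp_MyTable (Id, Name)
SELECT Id, Name
FROM dbo.MyTable;

-- 3. Drop the existing table.
DROP TABLE dbo.MyTable;

-- 4. Rename the new table to the original name
--    (the second argument is the new name without the schema).
EXEC sys.sp_rename N'dbo.Tmp_MyTable', N'MyTable';

COMMIT TRANSACTION;
```

Running the whole thing in one transaction means that a failure part-way through leaves the original table and its data untouched.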
You mention Migrations v2, the beta feature that changes how migrations work in order to better support branching and merging and DVCS systems. See http://www.red-gate.com/migrations
Version 1 migration scripts will need some modifications in order to be converted to a v2 migration script. It's a fairly trivial change. We're working on documenting this at the moment, and please reach out to us on the Google Group if you'd like more information on this change. https://groups.google.com/forum/#!forum/red-gate-migrations
I moved the column to the end of the table using SSMS to negate the need for a migration script.
In a similar scenario, where it was not convenient to move the column, this is what I did to convert an SSMS script to a Migrations V2 script.
Undo the change in SSMS (deleted the column)
Redo the change in SSMS, but instead of saving the change direct to the database, I saved the change script
Modified the change script
Trimmed the SSMS transaction & environment wrapper
Added a guard clause: IF COL_LENGTH('MyTable','MyColumn') IS NULL
Wrapped the script in BEGIN TRAN - ROLLBACK TRAN to test the script without dirtying the database
Replaced GO with END BEGIN
Tested within rolled-back transaction
Removed BEGIN TRAN - ROLLBACK TRAN development wrapper
Here is a simple SQL query that will help to insert a column into a database table without data loss.
Let's say CCDetails is the table in which we want to insert column GlobaleNote just before column Sys_CreatedBy:
declare @str1 nvarchar(1000)
declare @tableName nvarchar(1000)
set @tableName='CCDetails'
set @str1 = ''
SELECT @str1 = @str1 + ', ' + COLUMN_NAME
FROM Information_Schema.Columns
WHERE Table_Name = @tableName
ORDER BY Ordinal_Position
set @str1 = right(@str1, len(@str1) - 2)
set @str1 = 'select ' + @str1 +' into '+@tableName+'Temp from '+@tableName+' ; Drop Table '+ @tableName + ' ; EXEC sp_rename '+@tableName+'Temp, '+@tableName
set @str1 = REPLACE(@str1,'Sys_CreatedBy','CAST('''' as nvarchar(max)) As GlobaleNote , Sys_CreatedBy' )
exec sp_executesql @str1
Me and another developer are discussing which type of table would be more appropriate for our task. It's basically going to be a cache that we're going to truncate at the end of the day. Personally, I don't see any reason to use anything other than a normal table for this, but he wants to use a global temp table.
Are there any advantages to one or the other?
Use a normal table in tempdb if this is just transient data that you can afford to lose on service restart or a user database if the data is not that transient.
tempdb is slightly more efficient in terms of logging requirements.
Global temp tables get dropped once all referencing connections are closed (according to the documentation) or, as my test below suggests, once the connection that created the table is closed.
Edit: Following @cyberkiwi's edit. BOL does definitely explicitly say
Global temporary tables are visible to
any user and any connection after they
are created, and are deleted when all
users that are referencing the table
disconnect from the instance of SQL
Server.
In my test I wasn't able to get this behaviour though either.
Connection 1
CREATE TABLE ##T (i int)
INSERT INTO ##T values (1)
SET CONTEXT_INFO 0x01
Connection 2
INSERT INTO ##T VALUES(4)
WAITFOR DELAY '00:01'
INSERT INTO ##T VALUES(5)
Connection 3
SELECT OBJECT_ID('tempdb..##T')
declare @killspid varchar(10) = (select 'kill ' + cast(spid as varchar(5)) from sysprocesses where context_info=0x01)
exec (@killspid)
SELECT OBJECT_ID('tempdb..##T') /*NULL - But 2 is still
running let alone disconnected!*/
Global temp table
-ve: As soon as the connection that created the table goes out of scope, it takes
the table with it. This is damaging if you use connection pooling which can swap connections constantly and possibly reset it
-ve: You need to keep checking to see if the table already exists (after restart) and create it if not
+ve: Simple logging in tempdb reduces I/O and CPU activity
Normal table
+ve: Normal logging keeps your cache with your main db. If your "cache" is maintained but is still mission critical, this keeps it consistent together with the db
-ve: Following from the above, more logging
+ve: The table is always around, and for all connections
If the cache is something like a quick lookup summary for business/critical data, even if it is reset/truncated at the end of the day, I would prefer to keep it a normal table in the db proper.
How can I perform the equivalent of this query, in whatever way:
delete from sys.tables where is_ms_shipped = 0
What happened is, I executed a very large query and I forgot to put USE directive on top of it, now I got a zillion tables on my master db, and don't want to delete them one by one.
UPDATE: It's a brand new database, so I don't have to care about any previous data, the final result I want to achieve is to reset the master db to factory shipping.
If this is a one-time issue, use SQL Server Management Studio to delete the tables.
If you must run a script, then very, very carefully use this:
EXEC sp_msforeachtable 'DROP TABLE ?'
One method I've used in the past which is pretty simple and relatively foolproof is to query the system tables / info schema (depending on exact requirements) and have it output the list of commands I want to execute as the results set. Review that, copy & paste, run - quick & easy for a one-time job and because you're still manually hitting the button on the destructive bit, it's (IMHO) harder to trash stuff by mistake.
For example:
select 'drop table ' + name + ';', * from sys.tables where is_ms_shipped = 0
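A slightly more defensive variant of the same generate-and-review idea is sketched below. It qualifies each table with its schema and wraps names in QUOTENAME so that tables outside dbo, or with spaces or reserved words in their names, still produce valid statements. The column alias drop_statement is just an illustrative label.

```sql
-- Generates one DROP TABLE statement per non-system table in the
-- current database. Nothing is dropped here: review the output,
-- then copy and run the statements you actually want.
SELECT N'DROP TABLE '
       + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.'
       + QUOTENAME(t.name) + N';' AS drop_statement
FROM sys.tables AS t
WHERE t.is_ms_shipped = 0
ORDER BY t.name;
```

Tables referenced by foreign keys will still refuse to drop until the referencing tables are gone, so the generated list may need reordering or a second pass.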
No backups? :-)
One approach may be to create a Database Project in Visual Studio with an initial Database Import. Then delete the tables and synchronize the project back to the database. You can do the deletes en masse with this approach while being "buffered" with a commit phase and UI.
I am fairly certain the above approach can be used to take care of the table relationships as well (although I have not tried in the "master" space). I would also recommend using a VS DB project (or other database management tool that allows schema comparing and synchronization) to make life easier in the future as well as allowing version-able (e.g. with SCM) schema change-tracking.
Oh, and whatever is done, please create a backup first. If nothing else, it is good training :-)
The simplest and shortest way I found was this:
How to Rebuild System Databases in SQL Server 2008
The problem with all the other answers here is that they don't work, since there are related tables and the statements refuse to execute.
Not only does this one work, it is actually what I was looking for: "Reset to factory defaults", as stated in the question.
Also this one will delete everything, not only tables.
This code could be better, but I was trying to be cautious as I wrote it. I think it is easy to follow and easy to tweak for testing before you commit to deleting your tables.
DECLARE
    @Prefix VARCHAR(50),
    @TableName NVARCHAR(255),
    @SQLToFire NVARCHAR(350)
SET @Prefix = 'upgrade_%'
WHILE EXISTS(
    SELECT
        name
    FROM
        sys.tables
    WHERE
        name like @Prefix
)
BEGIN
    SELECT
        TOP 1 --This query only iterates if you are dropping tables
        @TableName = name
    FROM
        sys.tables
    WHERE
        name like @Prefix
    SET @SQLToFire = 'DROP TABLE ' + @TableName
    EXEC sp_executesql @SQLToFire;
END
I did something really similar, and what I wound up doing was using the Tasks--> script database to only script drops for all the database objects of the originally intended database. Meaning the database I was supposed to run the giant script on, which I did run it on. Be sure to include IF Exists in the advanced options, then run that script against the master and BAM, deletes everything that exists in the original target database that also exists in the master, leaving the differences, which should be the original master items.
Not very elegant, but this is a one-time task.
WHILE EXISTS(SELECT * FROM sys.tables where is_ms_shipped = 0)
EXEC sp_MSforeachtable 'DROP TABLE ?'
Works fine on this simple test (clearing a on the second loop after failing on the first attempt and proceeding onwards to delete b)
create table a
(
a int primary key
)
go
create table b
(
a int references a (a)
)
insert into a values (1)
insert into b values (1)