I am trying to write some integration tests for some SQL server stored procedures and functions. I'd like to have a database that has a known set of test data in it, and then wrap each test in a transaction, rolling it back when complete so that tests are effectively independent.
The stored procedures/functions do anything from fairly simple join queries, to complex filtering with many layers of joins, to inserting data into multiple tables.
There are a couple of stored procedures that actually use transactions - and so these are harder to test. I'll show an example of the overall code that is executed, but keep in mind this would normally be in two different spots (test setup/teardown, and the actual stored procedure). For this sample, I'm also using a very simple temp table:
CREATE TABLE #test (
val nvarchar(500)
)
Example:
-- for this example, just ensuring that the table is empty
delete from #test
go
-- begin of test setup code --
begin transaction
go
-- end of test setup code --
-- begin of code under test --
insert into #test values('aaaa')
begin transaction
go
insert into #test values('bbbbb')
rollback transaction
go
insert into #test values('ccccc')
-- Example select #1:
select * from #test
-- end of code under test --
-- begin of test teardown --
rollback transaction
go
-- end of test teardown
-- checking that #test is still empty, like it was before test
-- Example select #2:
select * from #test
The problem here is that at "Example select #1", I would expect 'aaaa' and 'ccccc' to be in the table, but actually only 'ccccc' is in the table, because SQL Server actually rolls back ALL open transactions (see http://abdulaleemkhan.blogspot.com/2006/07/nested-t-sql-transactions.html). In addition, the second rollback causes an error, and although this can be avoided with:
-- begin of test teardown --
if @@trancount > 0
begin
rollback transaction
end
go
-- end of test teardown
it doesn't solve the real problem: at "Example select #2", we still get 'ccccc' in the table -- it no longer gets rolled back because there is no transaction active.
Is there a way around this? Is there a better strategy for this type of testing?
Note: I'm not sure if the codebase ever does anything after a rollback or not (the insert 'ccccc' part) -- but if it ever does, either intentionally or accidentally, it would be possible for tests to break in strange ways, as unexpected data can be left over from another test.
Somewhat similar to Nested stored procedures containing TRY CATCH ROLLBACK pattern? but there is no real solution to the problem posed here.
The rollbacks in your code don't nest. They roll everything back to the first BEGIN TRANSACTION.
For every BEGIN TRANSACTION, @@TRANCOUNT gets incremented by one; however, any ROLLBACK sets @@TRANCOUNT back to zero.
If you want to roll back only a portion of a transaction, you need to use transaction savepoints. You can look them up in BOL for more info than I can enter here.
http://msdn.microsoft.com/en-us/library/ms188378.aspx
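As a rough illustration (a minimal sketch reusing the #test temp table from the question), a savepoint lets the inner rollback undo only its own work while the outer, test-wrapping transaction stays open:
begin transaction                 -- outer transaction (test setup), @@trancount becomes 1
insert into #test values('aaaa')
save transaction inner_work       -- savepoint; does not change @@trancount
insert into #test values('bbbbb')
rollback transaction inner_work   -- undoes only 'bbbbb'; the outer transaction stays open
insert into #test values('ccccc')
select * from #test               -- 'aaaa' and 'ccccc'
rollback transaction              -- test teardown: undoes everything
select * from #test               -- empty again
The catch is that the code under test would have to issue save transaction / rollback to the savepoint instead of its own begin/rollback for this to help.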
I'd like to have a database that has a known set of test data in it, and then wrap each test in a transaction, rolling it back when complete so that tests are effectively independent.
Don't. For one, you won't actually test the functionality, because in the real world the procedures will commit. Second, and this is far more important, you will get a gazillion false failures and need to implement workarounds to read dirty data because you don't actually commit, and you cannot do any proper verification.
Instead, have a database backup with the well-known set of test data, then quickly restore it before the test. Group tests into suites that can all run on a fresh database restore without affecting each other, so you reduce the number of restores needed.
You can also use database snapshots: take a snapshot at suite startup, then restore the database from the snapshot before each test, see How to: Revert a Database to a Database Snapshot (Transact-SQL).
Or combine the two methods: suite setup (i.e. the unit test class-level setup method) restores the database from the .bak file and creates a snapshot, then each test restores the database from the snapshot.
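A hedged sketch of that combined setup (all database, logical file, and path names below are hypothetical):
-- Suite setup: restore the known baseline, then snapshot it.
RESTORE DATABASE TestDb FROM DISK = N'C:\Backups\TestDb_baseline.bak' WITH REPLACE;
CREATE DATABASE TestDb_snapshot ON
    ( NAME = TestDb_data, FILENAME = N'C:\Data\TestDb_snapshot.ss' )
AS SNAPSHOT OF TestDb;
-- Before each test: revert to the snapshot (any other snapshots of TestDb must be dropped first).
USE master;
RESTORE DATABASE TestDb FROM DATABASE_SNAPSHOT = 'TestDb_snapshot';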
I had similar issues with that kind of setup, and my take on it was to create a "SetupTest" script and a "ClearTest" script to run before and after test execution. Unless you're talking about a huge volume of data here (which would make test execution too slow), this should work well and make the tests repeatable, because you know that every time you run that test suite, the correct data will be waiting for it.
Related
For so long, I've omitted using SQL Transactions, mostly out of ignorance.
But let's say I have a procedure like this:
CREATE PROCEDURE CreatePerson
AS
BEGIN
declare @NewPerson INT
INSERT INTO PersonTable ( Columns... ) VALUES ( @Parameters... )
SET @NewPerson = SCOPE_IDENTITY()
INSERT INTO AnotherTable ( PersonID, CreatedOn ) VALUES ( @NewPerson, getdate() )
END
GO
In the above example, the second insert depends on the first, as in it will fail if the first one fails.
Secondly, and for whatever reason, transactions are confusing me as far as proper implementation. I see one example here, another there, and I just opened up adventureworks to find another example with try, catch, rollback, etc.
I'm not logging errors. Should I use a transaction here? Is it worth it?
If so, how should it be properly implemented? Based on the examples I've seen:
CREATE PROCEDURE CreatePerson
AS
BEGIN TRANSACTION
....
COMMIT TRANSACTION
GO
Or:
CREATE PROCEDURE CreatePerson
AS
BEGIN
BEGIN TRANSACTION
COMMIT TRANSACTION
END
GO
Or:
CREATE PROCEDURE CreatePerson
AS
BEGIN
BEGIN TRY
BEGIN TRANSACTION
...
COMMIT TRANSACTION
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
BEGIN
ROLLBACK TRANSACTION
END
END CATCH
END
Lastly, in my real code, I have more like 5 separate inserts all based on the newly generated ID for person. If you were me, what would you do? This question is perhaps redundant or a duplicate, but for whatever reason I can't seem to reconcile in my mind the best way to handle this.
Another area of confusion is the rollback. If a transaction must be committed as a single unit of operation, what happens if you don't use the rollback? Or is the rollback needed only in a Try/Catch similar to vb.net/c# error handling?
You are probably missing the point of this: transactions are supposed to make a set of separate actions into one, so if one fails, you can roll back and your database will stay as if nothing happened.
This is easier to see if, let's say, you are saving the details of a purchase in a store. You save the customer's data (like Name or Address), but somewhere in between, the purchase details are lost (say the server crashes). Now you know that John Doe bought something, but you don't know what. Your data integrity is at stake.
Your third sample code is correct if you want to handle transactions in the SP. To return an error, you can try:
RETURN @@ERROR
after the ROLLBACK. Also, please read about:
set xact_abort on
as in: SQL Server - transactions roll back on error?
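A hedged sketch of how those suggestions fit together, keeping the placeholder column/parameter lists from the question; note that the error number is captured before the rollback, because later statements reset @@ERROR:
CREATE PROCEDURE CreatePerson
AS
BEGIN
    SET XACT_ABORT ON;   -- a run-time error makes the transaction uncommittable instead of half-committed
    DECLARE @NewPerson INT;
    BEGIN TRY
        BEGIN TRANSACTION;
        INSERT INTO PersonTable ( Columns... ) VALUES ( @Parameters... );
        SET @NewPerson = SCOPE_IDENTITY();
        INSERT INTO AnotherTable ( PersonID, CreatedOn ) VALUES ( @NewPerson, GETDATE() );
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        DECLARE @ErrNum int = ERROR_NUMBER();   -- capture before ROLLBACK resets @@ERROR
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        RETURN @ErrNum;
    END CATCH
END
GO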
If the first insert succeeds and the second fails, you will have a database in a bad state, because SQL Server cannot read your mind. It will leave the first insert (change) in the database even though you probably wanted it all to succeed or all to fail.
To ensure this, you should wrap all the statements in a transaction, as you illustrated in the last example. It's important to have a CATCH block so any half-completed transaction is explicitly rolled back and the resources (used by the transaction) released as soon as possible.
I've got a simple SQL command that is supposed to read all of the records in from a table and then delete them all. Because there's a chance someone else could be writing to this table at the exact moment, I want to lock the table so that I'm sure that everything I delete is also everything I read.
BEGIN TRAN T1;
SELECT LotID FROM fsScannerIOInvalidCachedLots WITH (TABLOCK, HOLDLOCK);
DELETE FROM fsInvalidCachedLots;
COMMIT TRAN T1;
The really strange thing is, this USED to work. It worked for a while through testing, but now I guess something has changed, because it's reading everything in but not deleting any of the records. Consequently, SQL Server CPU usage spikes, because the next time this runs it takes significantly longer to execute, which I assume has something to do with the lock.
Any idea what could be going on here? I've tried both TABLOCK and TABLOCKX.
Update: Oh yeah, something I forgot to mention: I can't query that table until after the next read the program does. What I mean is, after that statement is executed in code (and the command and connection are disposed of), if I try to query that table from within Management Studio it just hangs, which I assume means it's still locked. But then if I step through the calling program until I hit the next database connection, the moment after the first read the table is no longer locked.
Your SELECT retrieves data from a table named fsScannerIOInvalidCachedLots, but the delete is from a different table named fsInvalidCachedLots.
If you run this query with set xact_abort off, the transaction will not be aborted by the error from the invalid table name. In fact, select @@trancount will show you that there is an active transaction, and select xact_state() will return 1, meaning that it is active and no error has occurred.
On the other hand, with set xact_abort on, the transaction is aborted. select @@trancount will return 0, and select xact_state() will return 0 (no active transaction).
See @@trancount and xact_state() on MSDN for more information about them.
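A minimal way to see this for yourself, assuming dbo.NoSuchTable does not exist (run each batch separately):
SET XACT_ABORT OFF;
BEGIN TRAN;
DELETE FROM dbo.NoSuchTable;   -- error 208: the statement fails, but the transaction is not aborted
GO
SELECT @@TRANCOUNT AS open_trans, XACT_STATE() AS state;   -- 1, 1: still active
ROLLBACK;
GO
SET XACT_ABORT ON;
BEGIN TRAN;
DELETE FROM dbo.NoSuchTable;   -- same error, but with XACT_ABORT ON the transaction is rolled back
GO
SELECT @@TRANCOUNT AS open_trans, XACT_STATE() AS state;   -- 0, 0: no transaction
GO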
In a script used for interactive analysis of subsets of data, it is often useful to store the results of queries into temporary tables for further analysis.
Many of my analysis scripts contain this structure:
CREATE TABLE #Results (
a INT NOT NULL,
b INT NOT NULL,
c INT NOT NULL
);
INSERT INTO #Results (a, b, c)
SELECT a, b, c
FROM ...
SELECT *
FROM #Results;
In SQL Server, temporary tables are connection-scoped, so the query results persist after the initial query execution. When the subset of data I want to analyze is expensive to calculate, I use this method instead of using a table variable because the subset persists across different batches of queries.
The setup part of the script is run once, and following queries (SELECT * FROM #Results is a placeholder here) are run as often as necessary.
Occasionally, I want to refresh the subset of data in the temporary table, so I run the entire script again. One way to do this would be to create a new connection by copying the script to a new query window in Management Studio, but I find this difficult to manage.
Instead, my usual workaround is to precede the create statement with a conditional drop statement like this:
IF OBJECT_ID(N'tempdb.dbo.#Results', 'U') IS NOT NULL
BEGIN
DROP TABLE #Results;
END;
This statement correctly handles two situations:
On the first run when the table does not exist: do nothing.
On subsequent runs when the table does exist: drop the table.
Production scripts written by me would always use this method because it raises no errors in the two expected situations.
Some equivalent scripts written by my fellow developers sometimes handle these two situations using exception handling:
BEGIN TRY DROP TABLE #Results END TRY BEGIN CATCH END CATCH
I believe in the database world it is better always to ask permission than seek forgiveness, so this method makes me uneasy.
The second method swallows an error while taking no action to handle non-exceptional behavior (table does not exist). Also, it is possible that an error would be raised for a reason other than that the table does not exist.
The Wise Owl warns about the same thing:
Of the two methods, the [OBJECT_ID method] is more difficult to understand but probably better: with the [BEGIN TRY method], you run the risk of trapping the wrong error!
But it does not explain what the practical risks are.
In practice, the BEGIN TRY method has never caused problems in systems I maintain, so I'm happy for it to stay there.
What possible dangers are there in managing temporary table existence using BEGIN TRY method? What unexpected errors are likely to be concealed by the empty catch block?
What possible dangers? What unexpected errors are likely to be concealed?
If the TRY...CATCH block is inside a transaction, the swallowed error will still cause a failure.
BEGIN
BEGIN TRANSACTION t1;
SELECT 1
BEGIN TRY DROP TABLE #Results END TRY BEGIN CATCH END CATCH
COMMIT TRANSACTION t1;
END
This batch will fail with an error like this:
Msg 3930, Level 16, State 1, Line 7
The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
Msg 3998, Level 16, State 1, Line 1
Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
Books Online documents this behavior:
Uncommittable Transactions and XACT_STATE
If an error generated in a TRY block causes the state of the current transaction to be invalidated, the transaction is classified as an uncommittable transaction. An error that ordinarily ends a transaction outside a TRY block causes a transaction to enter an uncommittable state when the error occurs inside a TRY block. An uncommittable transaction can only perform read operations or a ROLLBACK TRANSACTION. The transaction cannot execute any Transact-SQL statements that would generate a write operation or a COMMIT TRANSACTION.
Now replace the TRY/CATCH with the OBJECT_ID test method:
BEGIN
BEGIN TRANSACTION t1;
SELECT 1
IF OBJECT_ID(N'tempdb.dbo.#Results', 'U') IS NOT NULL
BEGIN
DROP TABLE #Results;
END;
COMMIT TRANSACTION t1;
END
and run it again. The transaction will commit without any error.
A better solution may be to use a table variable rather than a temporary table, i.e.:
declare @results table (
a INT NOT NULL,
b INT NOT NULL,
c INT NOT NULL
);
I also think that a TRY block is dangerous because it can hide an unexpected problem. Some programming languages can catch only selected errors and let unexpected ones through; if your programming language has this feature, then use it (T-SQL can't catch a specific error).
For your scenario, I can say that I code it exactly like you, with this TRY/CATCH block.
The desirable behavior would be:
begin try
drop table #my_temp_table
end try
begin catch __table_dont_exists_error__
end catch
But this doesn't exist! So you can write something like:
begin try
drop table #my_temp_table
end try
begin catch
declare @err_n int, @err_d varchar(MAX)
SELECT
@err_n = ERROR_NUMBER() ,
@err_d = ERROR_MESSAGE() ;
IF @err_n <> 3701
raiserror( @err_d, 16, 1 )
end catch
This will raise an error when the failure to drop the table is something other than 'table does not exist' (error 3701).
Notice that for your issue all this code is not worth it, but it can be useful for other approaches. For your problem, the elegant solution is to drop the table only if it exists, or to use a table variable.
Not in your question, but possibly overlooked, are the resources used by the temp table. I always drop the table at the end of the script so it does not tie up resources. What if you put a million rows in the table? Then I also test for the table at the start of the script, to handle the case where there was an error in the last run and the table was not dropped. If you want to reuse the temp table, then at least clear out the rows.
A table variable is another option. It is lighter weight and has limitations. Avoid a table variable if you are going to use it in a query join, as the query optimizer does not handle a table variable as well as it does a temp table.
SQL documentation:
If more than one temporary table is created inside a single stored procedure or batch, they must have different names.
If a local temporary table is created in a stored procedure or application that can be executed at the same time by several users, the Database Engine must be able to distinguish the tables created by the different users. The Database Engine does this by internally appending a numeric suffix to each local temporary table name. The full name of a temporary table as stored in the sysobjects table in tempdb is made up of the table name specified in the CREATE TABLE statement and the system-generated numeric suffix. To allow for the suffix, table_name specified for a local temporary name cannot exceed 116 characters.
Temporary tables are automatically dropped when they go out of scope, unless explicitly dropped by using DROP TABLE:
A local temporary table created in a stored procedure is dropped automatically when the stored procedure is finished. The table can be referenced by any nested stored procedures executed by the stored procedure that created the table. The table cannot be referenced by the process that called the stored procedure that created the table.
All other local temporary tables are dropped automatically at the end of the current session.
Global temporary tables are automatically dropped when the session that created the table ends and all other tasks have stopped referencing them. The association between a task and a table is maintained only for the life of a single Transact-SQL statement. This means that a global temporary table is dropped at the completion of the last Transact-SQL statement that was actively referencing the table when the creating session ended.
I am debugging some SPs at work and I have discovered that whoever wrote the code used a transaction on a single update statement, like this:
begin transaction
*single update statement:* update table whatever with whatever
commit transaction
I understand that this is wrong because a transaction is used when you want to update multiple tables.
I want to understand, from a theoretical point of view: what are the implications of using the code as above?
Is there any difference in updating the whatever table with and without the transaction? Are there any extra locks or something?
Perhaps the transaction was included due to prior or possible future code which may involve other data. Perhaps that developer simply makes a habit of wrapping code in transactions, to be 'safe'?
But if the statement literally involves only a single update to a single row, there really is no benefit to that code being there in this case. A transaction does not necessarily 'lock' anything, though the actions performed inside it may, of course. It just makes sure that all the actions contained therein are performed all-or-nothing.
Note that a transaction is not about multiple tables, it's about multiple updates. It assures multiple updates happen all-or-none.
So if you were updating the same table twice, there would be a difference with or without the transaction. But your example shows only a single update statement, presumably updating only a single record.
In fact, it's probably pretty common that transactions encapsulate multiple updates to the same table. Imagine the following:
INSERT INTO Transactions (AccountNum, Amount) VALUES (1, 200)
INSERT INTO Transactions (AccountNum, Amount) values (2, -200)
That should be wrapped into a transaction, to assure that the money is transferred correctly. If one fails, so must the other.
I understand that this is wrong because transaction is used when you want to update multiple tables.
Not necessarily. This involves one table only - and just 2 rows:
--- transaction begin
BEGIN TRANSACTION ;
UPDATE tableX
SET Balance = Balance + 100
WHERE id = 42 ;
UPDATE tableX
SET Balance = Balance - 100
WHERE id = 73 ;
COMMIT TRANSACTION ;
--- transaction end
Hopefully your colleague's code looks more like this, otherwise SQL will issue a syntax error.
As per Ypercube's comment, there is no real purpose in placing one statement inside a transaction, but possibly this is a coding standard or similar.
begin transaction -- increments @@TRANCOUNT to 1
update table whatever with whatever
commit transaction -- decrements @@TRANCOUNT to 0
Often, when issuing ad hoc statements directly against SQL, it is a good idea to wrap your statements in a transaction, just in case something goes wrong and you need to roll back, i.e.:
begin transaction -- Just in case my query goofs up
update table whatever with whatever
select ... from table ... -- check that the correct updates / deletes / inserts happened
-- commit transaction -- Only commit if the above check succeeds.
I'm trying to write to a log file inside a transaction so that the log survives even if the transaction is rolled back.
--start code
begin tran
insert [something] into dbo.logtable
[[main code here]]
rollback
commit
-- end code
You could say to just do the log before the transaction starts, but that is not as easy, because the transaction starts before this S-Proc is run (i.e. the code is part of a bigger transaction).
So, in short, is there a way to write a special statement inside a transaction that is not part of the transaction? I hope my question makes sense.
Use a table variable (@temp) to hold the log info. Table variables survive a transaction rollback.
See this article.
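A minimal sketch of the idea, assuming dbo.logtable has a single nvarchar column named msg (the real column names aren't shown in the question):
DECLARE @log table (msg nvarchar(4000));
BEGIN TRAN;
INSERT INTO @log (msg) VALUES (N'starting the main work');
-- [[main code here]]
ROLLBACK;   -- the main work is undone, but @log keeps its rows
INSERT INTO dbo.logtable (msg)   -- persist the log after the rollback
SELECT msg FROM @log;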
I do this one of two ways, depending on my needs at the time. Both involve using variables, which retain their values following a rollback.
1) Create a DECLARE @Log varchar(max) variable and use this: SET @Log = ISNULL(@Log + '; ', '') + 'Your new log info here'. Keep appending to it as you go through the transaction. I'll insert this into the log after the commit or the rollback as necessary. I'll usually only insert the @Log value into the real log table when there is an error (in the CATCH block) or if I'm trying to debug a problem.
2) Create a DECLARE @LogTable table (RowID int identity(1,1) primary key, RowValue varchar(5000)) variable and insert into it as you progress through your transaction. I like using the OUTPUT clause to insert the actual IDs (and other columns with messages, like 'DELETE item 1234') of rows used in the transaction into this table. I will insert this table into the actual log table after the commit or the rollback as necessary.
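For method 2, a rough sketch of the OUTPUT clause feeding the table variable; dbo.Items, ItemID, and dbo.RealLogTable are hypothetical names:
DECLARE @LogTable table (RowID int identity(1,1) primary key, RowValue varchar(5000));
BEGIN TRAN;
DELETE FROM dbo.Items
OUTPUT 'DELETE item ' + CAST(deleted.ItemID AS varchar(20)) INTO @LogTable (RowValue)
WHERE ItemID = 1234;
ROLLBACK;   -- the delete is undone, but @LogTable still holds the messages
INSERT INTO dbo.RealLogTable (RowValue)   -- persist after the rollback
SELECT RowValue FROM @LogTable;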
If the parent transaction rolls back, the logging data will roll back as well - SQL Server does not support proper nested transactions. One possibility is to use a CLR stored procedure to do the logging. This can open its own connection to the database outside the transaction and enter and commit the log data.
Log output to a table, use a time delay, and use WITH(NOLOCK) to see it.
It looks like @arvid wanted to debug the operation of the stored procedure, and is able to alter the stored proc.
The c# code starts a transaction, then calls a s-proc, and at the end it commits or rolls back the transaction. I only have easy access to the s-proc
I had a similar situation. So I modified the stored procedure to log my desired output to a table. Then I put a time delay at the end of the stored procedure
WAITFOR DELAY '00:00:12'; -- 12 second delay, adjust as desired
and in another SSMS window, quickly read the table at the READ UNCOMMITTED isolation level (the WITH(NOLOCK) hint below):
SELECT * FROM dbo.NicksLogTable WITH(NOLOCK);
It's not the solution you want if you need a permanent record of the logs (edit: including where transactions get rolled back), but it suits my purpose to be able to debug the code in a temporary fashion, especially when linked servers, xp_cmdshell, and creating file tables are all disabled :-(
Apologies for bumping a 12-year-old thread, but Microsoft deserves an equal caning for not implementing nested transactions or autonomous transactions in that time period.
If you want to emulate nested transaction behaviour, you can use named transactions and savepoints:
begin transaction a
create table #a (i int)
select * from #a
save transaction b
create table #b (i int)
select * from #a
select * from #b
rollback transaction b
select * from #a
rollback transaction a
In SQL Server, if you want a ‘sub-transaction’ you should use save transaction xxxx, which works like an Oracle savepoint.