Use transactions for select statements?

Use transactions for select statements? - sql-server

I don't use Stored procedures very often and was wondering if it made sense to wrap my select queries in a transaction.
My procedure has three simple select queries, two of which use the returned value of the first.

In a highly concurrent application it could (theoretically) happen that data you've read in the first select is modified before the other selects are executed.
If that is a situation that could occur in your application you should use a transaction to wrap your selects. Make sure you pick the correct isolation level though, not all transaction types guarantee consistent reads.
Update :
You may also find this article on concurrent update/insert solutions (aka upsert) interesting. It puts several common methods of upsert to the test to see what method actually guarantees data is not modified between a select and the next statement. The results are, well, shocking I'd say.

Transactions are usually used when you have CREATE, UPDATE or DELETE statements and you want to have the atomic behavior, that is, Either commit everything or commit nothing.
However, you could use a transaction for READ select statements to:
Make sure nobody else could update the table of interest while the bunch of your select query is executing.
Have a look at this msdn post.

Most databases run every single query in a transaction even if not specified it is implicitly wrapped. This includes select statements.
PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it. A group of statements surrounded by BEGIN and COMMIT is sometimes called a transaction block.
https://www.postgresql.org/docs/current/tutorial-transactions.html

Related

Conditional SQL block evaluated even when it won't be executed

I'm working on writing a migration script for a database, and am hoping to make it idempotent, so we can safely run it any number of times without fear of it altering the database (/ migrating data) beyond the first attempt.
Part of this migration involves removing columns from a table, but inserting that data into another table first. To do so, I have something along these lines.
IF EXISTS
(SELECT * FROM sys.columns
WHERE object_id = OBJECT_ID('TableToBeModified')
AND name = 'ColumnToBeDropped')
BEGIN
CREATE TABLE MigrationTable (
Id int,
ColumnToBeDropped varchar
);
INSERT INTO MigrationTable
(Id, ColumnToBeDropped)
SELECT Id, ColumnToBeDropped
FROM TableToBeModified;
END
The first time through, this works fine, since it still exists. However, on subsequent attempts, it fails because the column no longer exists. I understand that the entire script is evaluated, and I could instead put the inner contents into an EXEC statement, but is that really the best solution to this problem, or is there another, still potentially "validity enforced" option?

I understand that the entire script is evaluated, and I could instead put the inner contents into an EXEC statement, but is that really the best solution to this problem
Yes. There are several scenarios in which you would want to push off the parsing validation due to dependencies elsewhere in the script. I will even sometimes put things into an EXEC, even if there are no current problems, to ensure that there won't be as either the rest of the script changes or the environment due to addition changes made after the current rollout script was developed. Minorly, it helps break things up visually.
While there can be permissions issues related to breaking ownership changing due to using Dynamic SQL, that is rarely a concern for a rollout script, and not a problem I have ever run into.

If we are not sure that the script will work or not specially migrating database.
However, For query to updated data related change, i will execute script with BEGIN TRAN and check result is expected then we need to perform COMMIT TRAN otherwise ROLLBACK transaction, so it will discard transaction.

Can we replace a DML trigger with a stored procedure

Not sure if this has been asked before cause while typing the title text the possible duplicate given suggestion's doesn't match.
One of my colleague asked if a DML trigger functioning can be replaced totally with a stored procedure(SP). Well sounds bit weird at first but it's possible cause trigger is also a special type of SP but not explicitly callable.
I mean say for example: a AFTER INSERT Trigger named trg_insert1 defined on tbl1 which does update some data in in tbl2 like below (taken a SQL Server Example but question is not specific to any DB)
create trigger trg_insert1
after insert on tbl1
foreach row
begin
update tbl2 set somedata = inserted.tbl1somedata
where id = inserted.tbl1id;
end
Now this trigger can be replaced with a SP like below (using transaction block);
create procedure usp_insertupdate (#name varchar(10), #data varchar(200))
as
begin
begin try
begin trans
insert into tbl1(name, data) values(#name, #data);
update tbl2 set somedata = #data
where id = scope_identity();
commit trans
end try
begin catch
if ##TRANCOUNT > 0
rollback trans
end catch
end
Which will work perfectly in almost all cases of DML trigger like after/before -> insert/delete/update. BUT I really couldn't answer/explain
what the difference then?
Is it a good practice to do so?
Is it not possible in all cases?
Am I being thinking it over complex.
Please let me know what you think.
[NOTE: Not a specific RDBMS related question though]

I'll try to answer in a very general sense (you specified this is not targeted to a specific implementation).
First of all, a trigger is written in the same data manipulation language that you would use for a stored procedure. So in terms of capabilities Trigger and Stored Procedure are the same.
But...
a trigger is guaranteed to be invoked every time you alter the data, no matter if you do that through a stored procedure, another trigger, or by manually executing a SQL statement.
In fact you can expect a trigger to always execute (for its triggering statement) unless you explicitly disable it.
A stored procedure, on the other hand it is guaranteed never to run by itself unless you explicitly run it.
This has an important consequence: triggers are better at ensuring consistency. If someone in a hurry removes a record in your live instance by typing:
Delete from tablex where uid="QWTY10311"
any bookkeeping action implemented as a trigger will be executed, while if the user forgets (or maliciously avoid) following this with
Execute SP_TABLEX_LOG("DELETE","QWTY10311")
your DB will just lose the data silently.
Triggers have two other important characteristics that can be duplicated with stored procedures only through extra (sometimes significantly more expensive) effort.
First of all they are executed record-by-record. So if you are deleting 1 million records the logging will be performed for each operation. Good luck calling the appropriate stored procedure with a 1 million rows cursor as a parameter, ESPECIALLY if you want to do that after a manual operation as in my example above.
Second advantage: Triggers have a special scope where they can reference pre- and post- change values for each field.
So if you are incrementing a table of prices by 10% and want to log what the previous value was, and which user performed the action at what time, you will have "old-value", "new-value", "user-id" and "timestamp" in scope for any kind of operation you may want to do.
Again, doing this by invoking a stored procedure means you have to save the values to pass them to the stored procedure when it runs.
So why bother with SP anyway? (this will answer, hopefully, your question about "best use case").
Stored Procedure are better when you need to create complex business logic which will be invoked by an application layer. So if you want to know, for example, how many hotel rooms are available between two given dates and with the extra requirement that pets are allowed, a trigger would not be a good idea.
Especially because a trigger will not return any result to an invoking process...
So anytime you need to get some result to the caller, be it a query, a calculation, or anything else that has OUTPUT parameters, a trigger is useless.
Triggers should be used to enforce consistency. If a header record should not be deleted unless it has no children in other tables, enforce this with a trigger, maybe. If you need to log whoever changes a value in a field, no matter how, use a trigger.
In all other cases, use a stored procedure (keep also in mind that triggers will impact the responsiveness of any data update, just like indexes).

Yes stored procedures can be used to replace DML triggers in this way, and whether it is a good practice or not depends on your needs.
The main difference is that a trigger will run its code every time it is fired. In your example, if a user does an ad-hoc INSERT to tbl1, a trigger will fire and tbl2 will get updated.
A stored procedure can only be used to enforce this rule if ad-hoc INSERTs are not allowed.

Should I avoid using sp_getAppLock?

I have a stored procedure, and I want to ensure it cannot be executed concurrently.
My (multi-threaded) application does all necessary work on the underlying table via this stored procedure.
IMO, locking the table itself is an unnecessarily drastic action to take, and so when I found out about sp_GetAppLock, which essentially enforces a critical section, this sounded ideal.
My plan was to encase the stored procedure in a transaction and to set up spGetAppLock with transaction scope. The code was written and tested successfully.
The code has now been put forward for review and I have been told that I should not call this function. However when asking the obvious question "why not?", the only reasons I am getting are highly subjective, to do with any form of locking being complicated.
I don't necessarily buy this, but I was wondering whether anyone had any objective reasons why I should avoid this construct. Like I say, given my circumstances a critical section sounds an ideal approach to me.
Further info: An application sits on top of this with 2 threads T1 and T2. Each thread is waiting for a different message M1 and M2. The business logic involved says that processing can only happen once both M1 and M2 have arrived. The stored procedure logs that Mx has arrived (insert) and then checks whether My is present (select). The built-in locking is fine to make sure the inserts happen serially. But the selects need to happen serially too and I think I need to do something over and above the built-in functionality here.
Just for clarity, I want the "processing" to happen exactly once. So I can't afford for the stored procedure to return either false positives or false negatives. I'm worried that if the stored proc runs twice in very quick succession, then both "selects" might return data which indicates that it is appropriate to perform processing.

What is the procedure doing that you cannot rely on SQL Servers built-in concurrency control mechanisms? Often queries can be rewritten to allow real concurrency.
But if this procedure indeed has to be executed "alone", locking the table itself on first access is most likely going to be a lot faster than using the call to sp_GetAppLock. It sounds like this procedure is going to be called often. If that is the case you should look for a way to achieve the goal with minimal impact.
If the table contains no other rows besides of M1 and M2 a table lock is still your best bet.
If you have multiple threads sending multiple messages you can get more fine-grained by using "serializable" as transaction level and check if the other message is there before you do the insert but within the same transaction. To prevent deadlocks in this case make sure you check for both messages for example like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
SELECT
#hasM1 = MAX(CASE WHEN msg_type='M1' THEN 1 ELSE 0 END),
#hasM2 = MAX(CASE WHEN msg_type='M2' THEN 1 ELSE 0 END)
FROM messages WITH(UPDLOCK)
WHERE msg_type IN ('M1','M2')
INSERT ...
IF(??) EXEC do_other_stuff_and_delete_messages;
COMMIT
In the IF statement before(!) the COMMIT you can use the information collected before the insert together with the information that you inserted to decide if additional processing is necessary.
In that processing step make sure to either mark those messages as processed or to delete them all still within the same transaction. That will make sure that you will not process those messages twice.
SERIALIZABLE is the only transaction isolation level that allows to lock rows that do not exist yet, so the first select statement with the WITH(UPDLOCK) effectively prevents the other row being inserted while the first execution is still running.
Finally, these are a lot of things to be aware of that could go wrong. You might want to have a look at service broker instead. you could use three queues with that. one for type M1 and one for type M2. Every time a message arrives within those queues a procedure can automatically be called to insert a token into the third queue. The third queue then could activate a process to check if both messages exist and do work. That would make the entire process asynchronous but for that it would be easy to restrict the queue 3 response to always only do one check at a time.
Service broker on msdn, also look at "activation" for the automatic message processing.

sp_GetAppLock is just like many other tools and as such it can be misused, overused, or correctly used. It is an exact match for the type of problem described by the original poster.
This is a good MSSQL Tips post on the usage
Prevent multiple users from running the same SQL Server stored procedure at the same time
http://www.mssqltips.com/sqlservertip/3202/prevent-multiple-users-from-running-the-same-sql-server-stored-procedure-at-the-same-time/

We use sp_getapplock all the time, due to the fact that we support some legacy applications that have been re-worked to use a SQL back-end, and the SQL Server locking model is not an exact match for our application logic.
We tend to go for a 'pessimistic' locking model, where we lock an entity before allowing a user to edit it, and use the (NOLOCK) hint extensively when reading data to bypass any blocking from the native locks on the actual tables. sp_getapplock is a good match for this. We also use it to enforce critical paths in large multi-user systems. You have to be systematic about what you call the locks you place.
We've found no performance problems with large numbers of user/locks via this route, so I see no reason why it wouldn't work well for you. Just be aware that you can get blocking and deadlocks if you have processes that place the same named locks, but not necessarily in the same order.

You can create a table with a flag for each set of messages, so if one of the threads is first to start processing it will mark the flag as processing.
To make sure that record blocked properly once one of threads reaches it use:
SELECT ... FROM WITH(XLOCK,ROWLOCK,READCOMMITTED) ... WHERE ...
This peace of code will put Exclusive lock on the record meaning who first got to it owns the row.
Then you do your changes and update flag, other thread will get updated value because it will be blocked by Exclusive lock until first thread commmits or rollbacks transaction.
For this to work you always need to select records from table with XLOCK this way it will work as expected.
Hope this helps.
Exclusive lock prove:
USE master
GO
IF OBJECT_ID('dbo.tblTest') IS NOT NULL
DROP TABLE dbo.tblTest
CREATE TABLE tblTest ( id int PRIMARY KEY )
;WITH cteNumbers AS (
SELECT 1 N
UNION ALL
SELECT N + 1 FROM cteNumbers WHERE N<1000
)
INSERT INTO
tblTest
SELECT
N
FROM
cteNumbers
OPTION (MAXRECURSION 0)
BEGIN TRANSACTION
SELECT * FROM dbo.tblTest WITH(XLOCK,ROWLOCK,READCOMMITTED) WHERE id = 1
SELECT * FROM sys.dm_tran_locks WHERE resource_database_id = DB_ID('master')
ROLLBACK TRANSACTION

Is it correct to say that data reading operations need not run inside transactions?

Say that a method only reads data from a database and does not write to it. Is it always the case that such methods don't need to run within a transaction?

In many databases a request for reading from the database which is not in an explicit transaction implicitly creates a transaction to run the request.
In a SQL database you may want to use a transaction if you are running multiple SELECT statements and you don't want changes from other transactions to show up in one SELECT but not an earlier one. A transaction running at the SERIALIZABLE transaction isolation level will present a consistent view of the data across multiple statements.

No. If you don't read at a specific isolation level you might not get enough guarantees. For example rows might disappear or new rows might appear.
This is true even for a single statement:
select * from Tab
except select * from Tab
This query can actually return rows in case of concurrent modifications because it scans the table twice.
SQL Server: There is an easy way to get fast, nonblocking, nonlocking, consistent reads: Enable snapshot isolation and read in a snapshot transaction. AFAIK Oracle has this capability as well. Postgres too.

the purpose of transaction is to rollback or commit the operations done to a database, if u are just selecting values and making no change in the data there is no need of transaction.

SQL - update, delete, insert - Whatif scenerio

I was reading an article the other day the showed how to run SQL Update, Insert, or Deletes as a whatif type scenario. I don't remember the parameter that they talked about and now I can't find the article. Not sure if I was dreaming.
Anyway, does anyone know if there is a parameter in SQL2008 that lets you try an insert, update, or delete without actually committing it? It will actually log or show you what it would have updated. You remove the parameter and run it if it behaves as you would expect.

I don't know of a SQL2008 specific feature with any SQL service that supports transactions you can do this:
Start a transaction ("BEGIN TRANSACTION" in TSQL)
The rest of your INSERT/UPDATE/DELETE/what-ever code
(optional) Some extra SELECT statements and such if needed to output the result of the above actions, if the default output from step 2 (things like "X rows affected") is not enough
Rollback the transaction ("ROLLBACK TRANSACTION" in TSQL)
(optional) Repeat the testing code to show how things are without the code in step 2 having run
For example:
BEGIN TRANSACTION
-- make changes
DELETE people WHERE name LIKE 'X%'
DELETE people WHERE name LIKE 'D%'
EXEC some_proc_that_does_more_work
-- check the DB state after the changes
SELECT COUNT(*) FROM people
-- undo
ROLLBACK TRANSACTION
-- confirm the DB state without the changes
SELECT COUNT(*) FROM people
(you might prefer to do the optional "confirm" step before starting the transaction rather than after rolling it back, but I've always done it this way around as it keeps the two likely-to-be-identical sections of code together for easier editing)
If you use something like this rather then something SQL2008 specific the technique should be transferable to other RDBS too (just update the syntax if needed).

OK, finally figured it out. I've confused this with another project I was working on with PowerShell. PowerShell has a "whatif" parameter that can be used to show you what files would be removed before they are removed.
My apologies to those who have spent time trying to find an answer to this port and my thanks to those of you who have responsed.

I believe you're talking about BEGIN TRANSACTION
BEGIN TRANSACTION starts a local transaction for the connection issuing the statement. Depending on the current transaction isolation level settings, many resources acquired to support the Transact-SQL statements issued by the connection are locked by the transaction until it is completed with either a COMMIT TRANSACTION or ROLLBACK TRANSACTION statement. Transactions left outstanding for long periods of time can prevent other users from accessing these locked resources, and also can prevent log truncation.

Do you perhaps mean SET NOEXEC ON ?
When SET NOEXEC is ON, SQL Server
compiles each batch of Transact-SQL
statements but does not execute them.
When SET NOEXEC is OFF, all batches
are executed after compilation.
Note that this won't warn/indicate things like key violations.

Toad for SQL Server has a "Validate SQL" feature that checks queries against wrong table/column names etc. . Maybe you are talking about some new feature in SSMS 2008 similar to that...

I'm more than seven years late to this particular party but I suspect the feature in question may also have been the OUTPUT clause. Certainly, it can be used to implement whatif functionality similar to Powershell's in a t-sql stored procedure.
https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql
Use this in each insert/update/delete/merge query to let the SP output a meaningful resultset of the changes it makes e.g. outputting the table name and action performed as the first two columns then all the altered columns.
Then simply rollback the changes if a #whatif parameter is set to 1 or commit them if #whatif is set to 0.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight