I'm looking to reduce blocking, and I see the following a lot when running the blocking script from Pinal Dave.
Example with dummy table names:
BLOCKING_TREE
HEAD - 62 drop table if exists live.dbo.connection select * into live.dbo.connection from [dbo].connection_temp
| |------ 137 SELECT tr.name AS [Name], tr.object_id AS [ID] FROM sys.triggers AS tr WHERE (tr.parent_class = 0) ORDER BY [Name] ASC
Could this be due to locks from the trigger checks done for the DROP statement on SPID 62 still being held during the INSERT, thus blocking SPID 137, which is looking to return trigger info?
If it is, would adding a semicolon between the DROP and INSERT free up that lock?
As this happens for many scripts, and it can be ages before I see the exact same SP at the head of a blocking tree, I am looking for advice on whether my train of thought is correct, or whether I am potentially wasting time peppering semicolons all over hundreds of legacy/inherited SPs trying to reduce blocking.
Short answer - no.
Longer answer - Semicolons are currently (for the most part) optional in T-SQL. If their presence or absence affected blocking behavior, it would mean that the locks taken (or their scope) differed based on punctuation alone. That is in the realm of "extraordinary claims require extraordinary evidence".
That said, if you're dropping/creating a live table (I see drop table if exists..., select * into...) in the context of a trigger, I'd strongly expect that to have concurrency issues. I'd recommend either switching to temporary tables or stopping the drop/create cycle and instead inserting the data from what appears to be a staging table into your live table, as sketched below. But without seeing the trigger definition, it's hard to know for sure.
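A minimal sketch of that approach, using the dummy names from the blocking tree (the TRUNCATE-then-INSERT "replace everything" semantics are an assumption about what the drop/select-into was achieving):

BEGIN TRAN;
-- Keep the table and its metadata in place; only swap the contents
TRUNCATE TABLE live.dbo.connection;
INSERT INTO live.dbo.connection
SELECT * FROM dbo.connection_temp;
COMMIT TRAN;

This avoids the DDL (drop/create) entirely, so queries against the table's metadata, like the sys.triggers query on SPID 137, are no longer caught behind a schema modification lock.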
I have 1.2 million rows in an Azure SQL database table. The following command:
DELETE FROM _PPL_DETAIL WHERE RunId <> 229
is painfully slow.
There is an index on RunId.
I am deleting most of the data.
Only a small number of records have RunId = 229.
It has been running for an hour now.
Should it take this long?
I am pretty sure it will finish.
Is there anything I can do to make operations like this faster?
The table does have a PK, although it is a dummy PK (not used). I had already identified adding one as an optimization for this problem (SQL Server treats a table without a PK differently -- much less efficiently), but it is still taking 1+ hour.
How about trying something like the below:
BEGIN TRAN
-- Save the rows we want to keep
SELECT * INTO #T FROM _PPL_DETAIL WHERE RunId = 229
-- TRUNCATE is minimally logged, unlike a row-by-row DELETE
TRUNCATE TABLE _PPL_DETAIL
-- Put the kept rows back
INSERT INTO _PPL_DETAIL
SELECT * FROM #T
COMMIT TRAN
Without knowing which database tier the database running that statement is using, it is not easy to help you. However, let us explain how the system works so that you can make this determination yourself with a bit more investigation.
Currently the log commit rate is limited by the tier of the database. Deletes are fundamentally limited by the ability to write out log records (and replicate them to multiple machines in case your main machine dies). When you select records, you don't have to go over the network to N machines, and you may not even need to go to the local disk if the records are cached in memory, so selects are generally expected to be faster than inserts/updates/deletes, which must harden the log. You can read about the specific limits for the different reservation sizes here: DTU Limits and vCore Limits.
One common problem is doing individual operations in a loop (like a cursor or driven from the client). This means each statement updates a single row and has to harden each log record serially, because the app has to wait for the statement to return before submitting the next one. You are not hitting that, since you are running a big delete as a single statement. It could still be slow for other reasons, such as:
Locking - if other users are doing operations on the table, they could block the progress of the delete statement. You can potentially see this by looking at sys.dm_exec_requests to check whether your statement is waiting on locks held by others.
Query plan choice - if you have to scan a lot of rows to delete a small fraction, you could be blocked on the IO needed to find them. Looking at the query plan shape will help here, as will SET STATISTICS TIME ON (we suggest changing the query to do TOP 100 or similar to get a sense of whether you are doing lots of logical read IOs vs. actual writes; see the sketch just below this list). This could imply that your on-disk layout is suboptimal for this problem. The general solutions are to either pick a better indexing strategy or use partitioning, so you can quickly drop groups of rows instead of deleting all the rows explicitly.
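For example, a sketch of that diagnostic run against the question's table (TOP (100) is just a sampling size):

SET STATISTICS TIME ON;
SET STATISTICS IO ON;
-- Sample the cost profile of a small delete to compare reads vs. writes
DELETE TOP (100) FROM _PPL_DETAIL WHERE RunId <> 229;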
An additional strategy for improving delete performance is batching, along these lines:
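A minimal batching sketch using the table and predicate from the question (the batch size of 10,000 is an assumption you would tune):

WHILE 1 = 1
BEGIN
    -- Each iteration commits separately, so the log hardened per transaction stays bounded
    DELETE TOP (10000) FROM _PPL_DETAIL WHERE RunId <> 229;
    IF @@ROWCOUNT = 0 BREAK;
END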
As far as I know, SQL Server had a change and the default DOP is now 1 on their servers, so running the query with OPTION (MAXDOP 0) could help.
Try this:
DELETE FROM _PPL_DETAIL
WHERE RunId <> 229
OPTION (MAXDOP 0);
I am currently experiencing very long sync times on a Zumero-synced database (well over a minute), and following some profiling, the culprit appears to be a particular query that is taking 20+ seconds (suitably anonymised):
WITH relevant_rvs AS
(
    SELECT rv.z_rv AS rv
    FROM zumero."mydb_089eb7ec0e2e4772ba0dde90170ee368_mysynceddb$z$rv$271340031" rv
    WHERE (rv.txid <= 913960)
    AND NOT EXISTS (SELECT 1 FROM zumero."mydb_089eb7ec0e2e4772ba0dde90170ee368_mysynceddb$z$dd$271340031" dd
                    WHERE dd.rv = rv.z_rv AND (dd.txid <= 913960))
)
INSERT INTO #final_included_271340031_e021cfbe1c97213dd5adbacd667c08439fb8c6 (z_rv)
SELECT z$this.z_rv
FROM zumero."mydb_089eb7ec0e2e4772ba0dde90170ee368_mysynceddb$z$271340031" z$this
WHERE (z$this.z_rv IN (SELECT rv FROM relevant_rvs))
AND (MyID = XXX AND MyOtherField = XXX)
UNION
SELECT z$this.z_rv
FROM zumero."mydb_089eb7ec0e2e4772ba0dde90170ee368_mysynceddb$z$old$271340031" z$this
WHERE (z$this.z_rv IN (SELECT rv FROM relevant_rvs))
AND (MyID = XXX AND MyOtherField = XXX)
I have taken the latter SELECT part of the query and run it in isolation, which reproduces the same poor performance. Interestingly, the execution plan recommends an index be applied, but I'm reluctant to go changing the schema of Zumero-generated tables. Is adding indexes to these tables something that can be attempted safely, and is it likely to help?
The source tables have 100,000ish records in them, and the filter results in each client syncing 100-1000ish records; not trivial data volumes, but not levels I would expect to cause major issues in terms of query performance.
Does anyone have any experience optimising Zumero sync performance server-side? Do indexes on source tables propagate to these tables? They don't appear to in this case.
Creating a custom index on the z$old table should be safe. I do hope it helps boost your query performance! (And it would be great to see a comment letting us know if it does or not.)
I believe the only issue such an index may cause would be that it could block certain schema changes on the host table. For example, if you tried to DROP the [MyOtherField] column from the host table, the Zumero triggers would attempt to drop the same column from the z$old table as well, and the transaction would fail with an error (which might be a bit surprising, since the index is not on the table being directly acted on).
Another thing to consider: It might also help to give this new index a name that will be recognized/helpful if it ever appears in an error message. Then (as always) feel free to contact support@zumero.com with any further questions or issues if they come up.
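A minimal sketch of such a custom index, reusing the anonymised names from the question (the index name and the column/INCLUDE choices are assumptions based on the WHERE clause shown):

-- A distinctive prefix like IX_custom_ makes the index easy to spot in error messages
CREATE NONCLUSTERED INDEX IX_custom_zumero_old_271340031
ON zumero."mydb_089eb7ec0e2e4772ba0dde90170ee368_mysynceddb$z$old$271340031" (MyID, MyOtherField)
INCLUDE (z_rv);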
I need to update an identity column in a very specific scenario (most of the time the identity will be left alone). When I do need to update it, I simply need to give it a new value and so I'm trying to use a DELETE + INSERT combo.
At present I have a working query that looks something like this:
DELETE Test_Id
OUTPUT DELETED.Data,
       DELETED.Moredata
INTO Test_Id
WHERE Id = 13
(This is only an example, the real query is slightly more complex.)
A colleague brought up an important point. She asked if this won't cause a deadlock, since we are writing to and reading from the same table. Although in the example it works fine (half a dozen rows), in a real-world scenario with tens of thousands of rows it might not.
Is this a real issue? If so, is there a way to prevent it?
I set up an SQL Fiddle example.
Thanks!
My first thought was: yes, it can. And maybe it is still possible; however, in this simplified version of the statement it would be very hard to hit a deadlock. You're targeting a single row, for which probably only row-level locks are acquired, plus the locks required for the delete and the insert are acquired in very quick succession.
I did some testing against a table holding a million rows, executing the statement 5 million times on 6 different connections in parallel. I did not hit a single deadlock.
But add the real-life query, a table with indexes and foreign keys, and you just might have a winner. I've had a similar statement which did cause deadlocks.
I have encountered deadlock errors with a similar statement.
UPDATE A
SET x=0
OUTPUT INSERTED.ID, 'a' INTO B
So for this statement to complete, mssql needs to take locks for the updates on table A, locks for the inserts on table B, and shared (read) locks on table A to validate the foreign key that table B has to table A.
And last but not least, mssql decided it would be wise to use parallelism on this particular query, causing the statement to deadlock on itself. To resolve this I simply added the OPTION (MAXDOP 1) query hint to the statement to prevent parallelism, as shown below.
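A sketch of the earlier statement with that hint applied:

UPDATE A
SET x = 0
OUTPUT INSERTED.ID, 'a' INTO B
OPTION (MAXDOP 1);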
There is, however, no definitive answer for preventing deadlocks. As they say with mssql ever so often: it depends. You could take an exclusive table lock using the TABLOCKX table hint. This will prevent a deadlock; however, it's probably not desirable for other reasons.
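For illustration, a sketch of the question's delete with that hint (a sketch only; the exclusive table lock serializes all access to the table, which is exactly why it is often undesirable):

DELETE Test_Id WITH (TABLOCKX)
OUTPUT DELETED.Data,
       DELETED.Moredata
INTO Test_Id
WHERE Id = 13;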
An internal application needs to dynamically create SQL tables based on some provided criteria. There are multiple consumers of this application.
IF (NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'SomeTableName'))
BEGIN
-- Create table in here.
END
To do this I have the above basic construct within the sproc. I am aware of possible race conditions, so my first solution was to add locking hints to the SELECT statement to ensure that any other transaction checking for the existence of the table would be blocked until the current transaction had finished. However, no matter which hints I used, this would not work.
My next solution was to wrap the table creation in a TRY..CATCH so that even if it did fail, I could just ignore the error. However, the failure of the CREATE TABLE statement dooms the transaction so I cannot carry on even if I do ignore the error.
My last solution, which works, was to use the TRY..CATCH construct and if an error is raised then GOTO the top of the sproc where a fresh transaction is created and everything goes through fine as the table exists second time round.
I am not happy with the solution, as it seems like a hack. Are there any SQL gurus out there who know a clean solution to this issue?
Just to clarify, the solution I discussed above does not have a large impact on performance, so I am really looking for a clean solution which doesn't have large performance implications.
Use semaphores (a.k.a. manual locking) with sp_getapplock (at the top of the code) and sp_releaseapplock (at the bottom of the code) to ensure only one process runs the block at a time.
A second process will fail, wait, or time out based on your sp_getapplock parameters, as in the sketch below.
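A minimal sketch of that pattern wrapped around the construct from the question (the resource name, timeout, and placeholder table definition are assumptions):

DECLARE @result int;
EXEC @result = sp_getapplock
    @Resource    = 'Create_SomeTableName',  -- arbitrary lock name, hypothetical
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Session',
    @LockTimeout = 10000;                   -- milliseconds; -1 waits forever

IF @result >= 0  -- 0 = granted immediately, 1 = granted after waiting
BEGIN
    IF (NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES
                    WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'SomeTableName'))
        EXEC('CREATE TABLE dbo.SomeTableName (Id int NOT NULL)');  -- placeholder definition

    EXEC sp_releaseapplock @Resource = 'Create_SomeTableName', @LockOwner = 'Session';
END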
I have seen SQL statements using nolock and with(nolock), e.g.:
select * from table1 nolock where column1 > 10
AND
select * from table1 with(nolock) where column1 > 10
Which of the above statements is correct and why?
The first statement doesn't actually apply the NOLOCK hint, whereas the second one does. When I tested this out just now on SQL Server 2005, in
select * from table1 nolock where column1 > 10 --INCORRECT
"nolock" became the alias, within that query, of table1.
select * from table1 with(nolock) where column1 > 10
performs the desired nolock functionality. Skeptical? In a separate window, run
BEGIN TRANSACTION
UPDATE table1
set SomeColumn = 'x' + SomeColumn
to lock the table, and then try each locking statement in its own window. The first will hang, waiting for the lock to be released, and the second will run immediately (and show the "dirty data"). Don't forget to issue
ROLLBACK
when you're done.
The list of deprecated features is at Deprecated Database Engine Features in SQL Server 2008:
Specifying NOLOCK or READUNCOMMITTED in the FROM clause of an UPDATE or DELETE statement.
Specifying table hints without using the WITH keyword.
HOLDLOCK table hint without parentheses.
Use of a space as a separator between table hints.
The indirect application of table hints to an invocation of a multi-statement table-valued function (TVF) through a view.
They are all in the list of features that will be removed sometime after the next release of SQL Server, meaning they'll likely be supported in the next release only under a lower database compatibility level.
That being said, my 2c on the issue is as follows:
Both from table nolock and from table with(nolock) are wrong. If you need dirty reads, you should use the appropriate transaction isolation level: set transaction isolation level read uncommitted. This way the isolation level used is explicitly stated and controlled from one 'knob', as opposed to being spread throughout the source and subject to all the quirks of table hints (indirect application through views and TVFs, etc.).
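A sketch of that, using the query from the question:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM table1 WHERE column1 > 10;
-- restore the default once the dirty read is done
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;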
Dirty reads are an abomination. What is needed, in 99.99% of cases, is a reduction of contention, not reading uncommitted data. Contention is reduced by writing proper queries against a well-designed schema and, if necessary, by deploying snapshot isolation. The best solution, which works almost always save for a few extreme cases, is to enable read committed snapshot in the database and let the engine work its magic:
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON
Then remove ALL hints from the selects.
They are both technically correct; however, not using the WITH keyword has been deprecated as of SQL 2005, so get used to using the WITH keyword. Short answer: use the WITH keyword.
Use "WITH (NOLOCK)".
Both are syntactically correct.
NOLOCK will become the alias for table1.
WITH (NOLOCK) is often exploited as a magic way to speed up database reads, but I try to avoid using it wherever possible.
The result set can contain rows that have not yet been committed, that are often later rolled back.
The query can raise an error, or the result set can be empty, be missing rows, or display the same row multiple times.
This is because other transactions are moving data at the same time you're reading it.
READ COMMITTED adds an additional issue where data is corrupted within a single column when multiple users change the same cell simultaneously.
There are other side-effects too, which result in sacrificing the speed increase you were hoping to gain in the first place.
Now you know, never use it again.