Say I have one INSERT statement whose values come from a SELECT on another table, so many records are inserted at once. I have another process that just needs to insert a single record. How can I get SQL Server to let the single insert execute in a more timely manner? In my observations, the single insert gets blocked for quite some time while the multi-row insert runs. It would be good if the single insert could "slip in". I tried adding WITH (ROWLOCK) on the inserts.
It is possible that the bulk insertion is escalating to table locks. You can potentially reduce table locks by changing the table DDL to LOCK_ESCALATION=DISABLE, although this could degrade the bulk insert performance.
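As a minimal sketch of that setting, assuming a hypothetical target table named dbo.TargetTable (the name is not from the question):

```sql
-- dbo.TargetTable is a placeholder name for the table receiving the inserts.
-- Stops SQL Server from escalating row/page locks on it to a table lock
-- in most cases.
ALTER TABLE dbo.TargetTable SET (LOCK_ESCALATION = DISABLE);

-- To return to the default behavior later:
ALTER TABLE dbo.TargetTable SET (LOCK_ESCALATION = TABLE);
```

With escalation disabled, a large insert holds many fine-grained locks instead, which costs lock memory; that is one way the bulk insert can become slower.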
An alternative is to rewrite the bulk INSERT ... SELECT to insert in batches, so that it never holds more than about 5000 locks at a time (roughly the point at which SQL Server attempts lock escalation). This would, however, change the scope of each unit of work, as you will now commit after each smaller batch, which may not be desirable.
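A rough sketch of such a batched insert follows. The tables dbo.SourceTable and dbo.TargetTable, the key column Id, and the column Payload are all assumptions made for illustration; the loop relies on Id being unique so that already-copied rows can be skipped.

```sql
-- Hypothetical schema: dbo.SourceTable(Id, Payload), dbo.TargetTable(Id, Payload).
DECLARE @BatchSize int = 4000;  -- kept below the ~5000-lock escalation threshold
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- Each INSERT runs as its own autocommit transaction, so its locks
    -- are released before the next batch starts.
    INSERT INTO dbo.TargetTable (Id, Payload)
    SELECT TOP (@BatchSize) s.Id, s.Payload
    FROM dbo.SourceTable AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.TargetTable AS t WHERE t.Id = s.Id)
    ORDER BY s.Id;

    SET @Rows = @@ROWCOUNT;  -- 0 once nothing is left to copy
END;
```

Between batches, a waiting single-row insert gets a chance to acquire its locks. If the copy fails midway, the batches already committed stay in place.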
Related
I have a long-running stored procedure with a lot of statements. After analyzing it, I identified a few statements that take most of the time. Those statements are all update statements.
Looking at the execution plan, the query scans the source table in parallel in a few seconds and then passes the rows to a Gather Streams operator, which then passes to
This is somewhat similar to the article below, and we see the same slowness with index creation statements too.
https://brentozar.com/archive/2019/01/why-do-some-indexes-create-faster-than-others/
The table has 60 million records and is a heap, as we do a lot of data loads, updates and deletes.
Reading the source is not a problem, as it completes in a few seconds, but the actual update, which happens serially, takes most of the time.
A few suggestions to try:
If you have indexes on the target table, dropping them before the insert and recreating them afterwards should improve insert performance.
Add a WITH (TABLOCK) hint to the table you are inserting into, as in INSERT INTO [Table] WITH (TABLOCK). This lets SQL Server lock the table exclusively and allows the insert itself to run in parallel.
Alternatively, if that doesn't yield an improvement, try adding a MAXDOP 1 hint to the query.
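The two hints could look roughly like this. The table and column names (dbo.TargetTable, dbo.SourceTable, Id, Payload) are placeholders, and whether the insert actually runs in parallel also depends on the SQL Server version and database compatibility level.

```sql
-- Variant 1: table lock on the target, which also makes a parallel insert possible.
INSERT INTO dbo.TargetTable WITH (TABLOCK) (Id, Payload)
SELECT s.Id, s.Payload
FROM dbo.SourceTable AS s;

-- Variant 2: force a serial plan for the whole statement.
INSERT INTO dbo.TargetTable (Id, Payload)
SELECT s.Id, s.Payload
FROM dbo.SourceTable AS s
OPTION (MAXDOP 1);
```

Note that TABLOCK blocks every other writer for the duration of the statement, so it only suits loads that do not need to share the table.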
How often do you UPDATE the rows in this heap?
Unlike clustered indexes, heaps use a RID to locate specific rows. When an updated row no longer fits in its original location, it is moved elsewhere, and the original location keeps a pointer to the new one (a forwarded record). Unless you explicitly rebuild the heap, those pointers remain, so every access to such a row needs an extra lookup.
I don't really think that is what is affecting you here, but could you possibly add a clustered index to the table and see how the update times are affected?
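To see how many forwarded records the heap has accumulated, something along these lines could be used. dbo.MyHeap is a placeholder name; scanning a 60-million-row table in DETAILED mode is itself expensive, so this is a sketch for an off-peak check.

```sql
-- index_id 0 addresses the heap itself.
-- forwarded_record_count is only populated in SAMPLED or DETAILED mode.
SELECT index_type_desc,
       page_count,
       record_count,
       forwarded_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.MyHeap'), 0, NULL, 'DETAILED');

-- Rebuilding the heap removes the forwarding pointers.
ALTER TABLE dbo.MyHeap REBUILD;
```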
Also, I assume you don't have some heavy trigger on the table doing a bunch of extra work, right?
Additionally, since you are referring to an article by Brent Ozar: he advocates breaking updates into batches of no more than 4000 rows at a time, as that has been shown to be the fastest and stays below the roughly 5000 locks at which SQL Server attempts lock escalation during updates.
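A batched update in that spirit might look like the sketch below. The table dbo.MyHeap and the column Processed are hypothetical; the pattern needs a WHERE clause that stops matching rows once they have been updated, otherwise the loop never ends.

```sql
DECLARE @BatchSize int = 4000;
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- Updates at most @BatchSize rows per statement; each statement
    -- commits on its own, releasing its locks before the next one.
    UPDATE TOP (@BatchSize) dbo.MyHeap
    SET Processed = 1
    WHERE Processed = 0;

    SET @Rows = @@ROWCOUNT;
END;
```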
Suppose I have a SQL Server table that has millions of rows and receives over 2000 inserts per minute. A separate process needs to do a bulk update on this table, let's say with a where clause that will update 1000 rows. But it doesn't care about performance and could optionally run 1000 single-row updates using the primary key.
If the bulk update runs too long, it will block the incoming insertions, right? Whereas updating rows individually will allow insertions to squeak through the cracks and not block? So from the standpoint of optimizing performance for the insertions, am I better off running the updates one row at a time?
Updates will not block the inserts, but you might get unexpected behavior if the WHERE condition of the update is not applied to the newly inserted rows. So it's better to review the application logic to make sure that the newly inserted rows do not need to be included in the update.
But in general the bulk update is much better than single updates.
I need to update an identity column in a very specific scenario (most of the time the identity will be left alone). When I do need to update it, I simply need to give it a new value and so I'm trying to use a DELETE + INSERT combo.
At present I have a working query that looks something like this:
DELETE Test_Id
OUTPUT DELETED.Data,
       DELETED.Moredata
INTO Test_Id
WHERE Id = 13;
(This is only an example, the real query is slightly more complex.)
A colleague brought up an important point. She asked whether this won't cause a deadlock, since we are writing to and reading from the same table. Although it works fine in the example (half a dozen rows), in a real-world scenario with tens of thousands of rows it might not.
Is this a real issue? If so, is there a way to prevent it?
I set up an SQL Fiddle example.
Thanks!
My first thought was: yes, it can. And maybe it is still possible, but in this simplified version of the statement it would be very hard to hit a deadlock. You're selecting a single row, for which row-level locks are probably acquired, and the locks required for the delete and the insert are acquired in very quick succession.
I did some testing against a table holding a million rows, executing the statement 5 million times on 6 different connections in parallel. I did not hit a single deadlock.
But add the real-life query and a table with indexes and foreign keys, and you just might have a winner. I've had a similar statement which did cause deadlocks.
I have encountered deadlock errors with a similar statement.
UPDATE A
SET x=0
OUTPUT INSERTED.ID, 'a' INTO B
So for this statement to complete, SQL Server needs to take locks for the updates on table A, locks for the inserts on table B, and shared (read) locks on table A to validate the foreign key that table B has to table A.
And last but not least, SQL Server decided it would be wise to use parallelism on this particular query, causing the statement to deadlock on itself. To resolve this, I simply set the MAXDOP 1 query hint on the statement to prevent parallelism.
There is, however, no definitive way to prevent deadlocks. As they say with SQL Server ever so often, it depends. You could take an exclusive table lock using the TABLOCKX table hint. This will prevent a deadlock, but it's probably not desirable for other reasons.
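Applied to the statement above, the two options would look roughly as follows. This is only a sketch of where the hints go; it reuses the tables A and B from the example and makes no claim about which option suits a given workload.

```sql
-- Option 1: keep the statement serial so it cannot deadlock with its own
-- parallel threads.
UPDATE A
SET x = 0
OUTPUT INSERTED.ID, 'a' INTO B
OPTION (MAXDOP 1);

-- Option 2: take an exclusive lock on A for the duration of the statement.
-- Every other reader and writer of A waits until it finishes.
UPDATE A WITH (TABLOCKX)
SET x = 0
OUTPUT INSERTED.ID, 'a' INTO B;
```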
I have a running system where data is periodically inserted into an MS SQL database, and a web application is used to display this data to users.
During the data insert, users should be able to continue using the database; unfortunately, I can't redesign the whole system right now. Every 2 hours, 40k-80k records are inserted.
Right now the process looks like this:
Temp table is created
Data is inserted into it using plain INSERT statements (parameterized queries or stored procedures should improve the speed).
Data is pumped from the temp table to the destination table using INSERT INTO MyTable(...) SELECT ... FROM #TempTable
I think this approach is very inefficient. I can see that the insert phase can be improved (bulk insert?), but what about transferring data from the temp table to the destination?
This is what we did a few times. Rename your table to TableName_A. Create a view that selects from that table. Create a second table exactly like the first one (TableName_B) and populate it with the data from the first one. Now set up your import process to populate whichever table is not currently referenced by the view, then change the view to select from that table instead. Total downtime for users: a few seconds. Then repopulate the first table. It is actually easier if you can truncate and repopulate the table, because then you don't need that last step, but that may not be possible if your input data is not a complete refresh.
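The swap described above could be sketched like this. The schema dbo, the view name dbo.TableName, and the columns Id and Payload are assumptions; GO is the batch separator used by tools such as SSMS and sqlcmd, needed here because CREATE VIEW and ALTER VIEW must be the first statement in their batch.

```sql
-- One-time setup: rename the real table and put a view in its place.
EXEC sp_rename 'dbo.TableName', 'TableName_A';
GO
CREATE VIEW dbo.TableName AS
SELECT Id, Payload FROM dbo.TableName_A;
GO
-- Copy structure and data into the second table.
-- SELECT ... INTO does not copy indexes or constraints; add those separately.
SELECT * INTO dbo.TableName_B FROM dbo.TableName_A;
GO

-- Each import cycle: load the new rows into the table the view is NOT using
-- (here TableName_B), then repoint the view.
ALTER VIEW dbo.TableName AS
SELECT Id, Payload FROM dbo.TableName_B;
GO
```

The web application keeps querying dbo.TableName throughout and only waits for the brief ALTER VIEW.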
You cannot avoid locking when inserting into the table. Even with BULK INSERT this is not possible.
But clients that want to read this table during the concurrent INSERT operations can do so by changing the transaction isolation level to READ UNCOMMITTED or by executing the SELECT command with the WITH (NOLOCK) table hint.
The INSERT command will still lock the table/rows but the SELECT command will then ignore these locks and also read uncommitted entries.
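Both forms are shown below against MyTable; the column names Id and Payload are placeholders. Either way the reader can see rows that are later rolled back, so this only fits data where such dirty reads are acceptable.

```sql
-- Per-session setting: every following SELECT in this session reads
-- without taking shared locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Id, Payload FROM MyTable;

-- Per-table hint: affects only this reference to MyTable.
SELECT Id, Payload FROM MyTable WITH (NOLOCK);
```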
In Microsoft SQL Server:
I've added an insert trigger to my table ACCOUNTS that does an insert into table BLAH based upon the inserted values.
The inserts come from only one place, and they happen one at a time. (By that I mean that there are never two inserts in a transaction; two web users could, theoretically, click submit and have their inserts done near-simultaneously.)
Do I need to adapt the trigger to handle more than one row being in inserted, the special table created for triggers - or does each individual insert transaction launch the trigger separately?
Each insert calls the trigger. However, if a single insert adds more than one row the trigger is only called once, so your trigger has to be able to handle multiple records.
The granularity is at the INSERT statement level not at the transaction level.
So no, if you have two transactions inserting into the same table they will each call the trigger ATOMICALLY.
BOb
In your situation, each insert happens in its own transaction and fires the trigger individually, so you should be fine. If there were ever a circumstance where you had two inserts within the same transaction, you would have to modify the trigger to do either a set-based insert from the 'inserted' table or use some kind of cursor if additional processing is necessary.
If you do only one insert in a transaction, I don't see any reason for more rows to be in inserted, except if there was a possibility of recursive trigger calls.
Still, it could cause you trouble if you change the behavior of your application in the future and forget to change the triggers. So, just to be sure, I would rather implement the trigger as if inserted could contain multiple rows.
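A trigger written that way treats inserted as a set. In the sketch below the trigger name and the columns AccountId and CreatedAt are made up for illustration; only the table names ACCOUNTS and BLAH come from the question.

```sql
CREATE TRIGGER trg_ACCOUNTS_AfterInsert   -- hypothetical trigger name
ON ACCOUNTS
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- One set-based insert covers 1 row or 1000 rows alike, because the
    -- trigger fires once per INSERT statement, not once per row.
    INSERT INTO BLAH (AccountId, CreatedAt)   -- hypothetical columns
    SELECT i.AccountId, SYSUTCDATETIME()
    FROM inserted AS i;
END;
```

A version that reads a single value into a variable (SELECT @id = AccountId FROM inserted) would silently process only one row of a multi-row insert, which is the failure mode the answers above warn about.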