Consider this statement:
update TABLE1
set FormatCode = case T.FormatCode when null then TABLE1.FormatCode else T.FormatCode end,
CountryCode = case T.CountryCode when null then TABLE1.CountryCode else T.CountryCode end
<SNIP ... LOTS of similar fields being updated>
FROM TABLE2 AS T
WHERE TABLE1.KEYFIELD = T.KEYFIELD
TABLE1 is used by other applications and so locking on it should be minimal
TABLE2 is not used by anybody else so I do not care about it.
TABLE1 and TABLE2 contain 600K rows each.
Would the above statement cause a table lock on TABLE1?
How can I modify it to cause the minimal lock on it ?
Maybe use a cursor to read the rows of TABLE2 one by one and then for each row update the respective row of TABLE1?
Sql will use row locks first. If enough rows in a index page is locked SQL will issue a page lock. If enough pages are locked SQL will issue a table lock.
So it really depends on how many locks is issued. You could user the locking hint ROWLOCK in your update statement. The down side is that you will probably have thousand of row lock instead of hundreds of page locks or one table lock. Locks use resources so while ROWLOCK hints will probably not issue a table lock it might even be worse as it could starve your server of resources and slowing it down in anycase.
You could batch the update say 1000 at a time. Cursors is really going to news things up even more. Experiment monitor analyse the results and make a choice based on the data you have gathered.
As marc_s has suggested introducing a more restrictive WHERE clause to reduce the number of rows should help here.
Since your update occurs nightly it seems you'll be looking to only update the records that have updated since the previous update occurred (ie a days worth of updates). But this will only benefit you if a subset of the records have changed rather than all of then.
I'd probably try to SELECT out the Id's for the rows that have changed into a temp table and then joining the temp table as part of the update. To determine the list of Id's a couple of options come to mind on how you can do this such as making use of a last changed column on TABLE2 (if TABLE2 has one); alternatively you could compare each field between TABLE1 and TABLE2 to see if they differ (watch for nulls), although this would be a lot of extra sql to include and probably a maintenance pain. 3rd option that I can think of would be to have an UPDATE trigger against TABLE2 to insert the KEYFIELD of the rows as they are updated during the day into our temp table, the temp table could be cleared down following your nightly update.
Related
We have a stored procedure which loads order details about an order. We always want the latest information about an order, so order details for the order are regenerated every time, when the stored procedure is called. We are using SQL Server 2016.
Pseudo code:
DELETE by clustered index based on order identifier
INSERT into the table, based on a huge query containing information about order
When multiple end-users are executing the stored procedure concurrently, there is a blocking created on orderdetails table. Once the first caller is done, second caller is queued, followed by third caller. So, the time for the generation of the orderdetails increases as time goes by. This is happening especially in the cases of big orders containing details rows in > 100k or 1 or 2 million, as there is table level lock is happening.
The approach we took
We partitioned the table based on the last digit of the order identifier for concurrent orderdetails loading. This improves the performance in the case of first time orderdetails loading, as there are no deletes. But, second time onwards, INSERT in first session is causing blocking for other sessions DELETE. The other sessions are blocked till first session is done with INSERT.
We are considering creation of separate orderdetails table for every order to avoid this concurrency issues.
Question
Can you please suggest some approach, which will support concurrent DELETE & INSERT scenario ?
We solved the contention issue by going for temporary table for orderdetails. We found that huge queries are taking longer SELECT time and this longer time was contributing to longer table level locks on the orderdetails table.
So, we first loaded data into temporary table #orderdetail and then went for DELETE and INSERT in the orderdetail table.
As the orderdetail table is already partitioned, DELETE were faster and INSERT were happening in parallel. INSERT was also very fast here, as it is simple table scan from #orderdetail table.
You can give a look to the Hekaton Engine. It is available even in SQL Server Standard Edition if you are using SP1.
If this is too complicated for implementation due to hardware or software limitations, you can try to play with the Isolation Levels of the database. Sometimes, queries that are reading huge amount of data are blocked or even deadlock victims of queries which are modifying parts of these data. You can ask yourself do you need to guarantee that the data read by the user is valid or you can afford for example some dirty reads?
I want to place DB2 Triggers for Insert, Update and Delete on DB2 Tables heavily used in parallel online Transactions. The tables are shared by several members on a Sysplex, DB2 Version 10.
In each of the DB2 Triggers I want to insert a row into a central table and have one background process calling a Stored Procedure to read this table every second to process the newly inserted rows, ordered by sequence of the insert (sequence number or timestamp).
I'm very concerned about DB2 Index locking contention and want to make sure that I do not introduce Deadlocks/Timeouts to the applications with these Triggers.
Obviously I would take advantage of DB2 Features to reduce locking like rowlevel locking, but still see no real good approach how to avoid index contention.
I see three different options to select the newly inserted rows.
Put a sequence number in the table and the store the last processed sequence number in the background process. I would do the following select Statement:
SELECT COLUMN_1, .... Column_n
FROM CENTRAL_TABLE
WHERE SEQ_NO > 'last-seq-number'
ORDER BY SEQ_NO;
Locking Level must be CS to avoid selecting uncommited rows, which will be later rolled back.
I think I need one Index on the table with SEQ_NO ASC
Pro: Background process only reads rows and makes no updates/deletes (only shared locks)
Neg: Index contention because of ascending key used.
I can clean-up processed records later (e.g. by rolling partions).
Put a Status field in the table (processed and unprocessed) and change the Select as follows:
SELECT COLUMN_1, .... Column_n
FROM CENTRAL_TABLE
WHERE STATUS = 'unprocessed'
ORDER BY TIMESTAMP;
Later I would update the STATUS on the selected rows to "processed"
I think I need an Index on STATUS
Pro: No ascending sequence number in the index and no direct deletes
Cons: Concurrent updates by online transactions and the background process
Clean-up would happen in off-hours
DELETE the processed records instead of the status field update.
SELECT COLUMN_1, .... Column_n
FROM CENTRAL_TABLE
ORDER BY TIMESTAMP;
Since the table contains very few records, no index is required which could create a hot spot.
Also I think I could SELECT with Isolation Level UR, because I would detect potential uncommitted data on the later delete of this row.
For a Primary Key index I could use GENERATE_UNIQUE,which is random an not ascending.
Pro: No Index hot spot and the Inserts can be spread across the tablespace by random UNIQUE_ID
Con: Tablespace scan and sort on every call of the Stored Procedure and deleting records in parallel to the online inserts.
Looking forward what the community thinks about this problem. This must be a pretty common problem e.g. SAP should have a similar issue on their Batch Input tables.
I tend to favour Option 3, because it avoids index contention.
May be there is still another solution in your minds out there.
I think you are going to have numerous performance problems with your various solutions.
(I know premature optimazation is a sin, but experience tells us that some things are just not going to work in a busy system).
You should be able to use DB2s autoincrement feature to get your sequence number, with little or know performance implications.
For the rest perhaps you should look at a Queue based solution.
Have your trigger drop the operation (INSERT/UPDATE/DELETE) and the keys of the row into a MQ queue,
Then have a long running backgound task (in CICS?) do your post processing as its processing one update at a time you should not trip over yourself. Having a single loaded and active task with the ability to batch up units of work should give you a throughput in the order of 3 to 5 hundred updates a second.
We are trying to determine more efficient ways to perform some database operations.
One of the issues that we have is with an ancient primary key system where the primary key for a new record is selected by finding the MAX value in the table and then adding 1 (we cannot change this implementation, please don't suggest this as an answer).
There are some different approaches that we can take to resolve this issue (table-valued parameters, temp tables, etc), but we can never assume that another process won't insert another record during this process and business rules will not allow us to lock the table.
So, the crux of my question is, if we get the current MAX value in a sub-query using an UPDLOCK hint, will the lock hint last for the life of the containing query?
For example:
INSERT
INTO table1
( PKColumn,
DataColumn1,
DataColumn2 )
VALUES SELECT MAX(ISNULL(PKColumn, 0) + 1) FROM table1 WITH (UPDLOCK)) + RowNumber ,
DataColumn1 ,
DataColumn2
FROM #Table1Temp
If we use this to insert 100,000 records, for example, will the UPDLOCK hint hold on the table until all records are inserted or is it released as soon as the initial value is retrieved?
Specifies that update locks are to be taken and held until the
transaction completes. UPDLOCK takes update locks for read operations
only at the row-level or page-level. If UPDLOCK is combined with
TABLOCK, or a table-level lock is taken for some other reason, an
exclusive (X) lock will be taken instead.
So yes. The transaction will last at least as long as that statement. (Possibly longer if you aren't using auto commit transactions and have multiple statements in a transaction)
I need to update an identity column in a very specific scenario (most of the time the identity will be left alone). When I do need to update it, I simply need to give it a new value and so I'm trying to use a DELETE + INSERT combo.
At present I have a working query that looks something like this:
DELETE Test_Id
OUTPUT DELETED.Data,
DELETED.Moredata
INTO Test_id
WHERE Id = 13
(This is only an example, the real query is slightly more complex.)
A colleague brought up an important point. She asked if this wont cause a deadlock since we are writing and reading from the same table. Although in the example it works fine (half a dozen rows), in a real world scenario with tens of thousands of rows this might not work.
Is this a real issue? If so, is there a way to prevent it?
I set up an SQL Fiddle example.
Thanks!
My first thought was, yes it can. And maybe it is still possible, however in this simplified version of the statement it would be very hard to hit an deadlock. You're selecting a single row for which probably row level locks are acquired plus the fact that the locks required for the delete and the insert are acquired very fast after each other.
I've did some testing against a table holding a million rows execution the statement 5 million times on 6 different connections in parallel. Did not hit a single deadlock.
But add the reallive query, an table with indexes and foreign keys and you just might have a winner. I've had a similar statement which did cause deadlocks.
I have encountered deadlock errors with a similar statement.
UPDATE A
SET x=0
OUTPUT INSERTED.ID, 'a' INTO B
So for this statement to complete mssql needs to take locks for the updates on table A, locks for the inserts on table B and shared (read) locks on table A to validate the foreign key table B has to table A.
And last but not least, mssql decided it would be wise to use parallelism on this particular query causing the statement to deadlock on itself. To resolve this I've simply set "MAXDOP 1" query hint on the statement to prevent parallelism.
There is however no definite answer to prevent deadlocks. As they say with mssql ever so ofter, it depends. You could take an exclusive using the TABLOCKX table hint. This will prevent a deadlock, however it's probably not desirable for other reasons.
I have a SQL Server 2012 table that will contain 2.5 million rows at any one time. Items are always being written into the table, but the oldest rows in the table get truncated at the end of each day during a maintenance window.
I have .NET-based reporting dashboards that usually report against summary tables though on the odd occasion it does need to fetch a few rows from this table - making use of the indexes set.
When it does report against this table, it can prevent new rows being written to this table for up to 1 minute, which is very bad for the product.
As it is a reporting platform and the rows in this table never get updated (only inserted - think Twitter streaming but for a different kind of data) it isn't always necessary to wait for a gap in the transactions that cause rows to get inserted into this table.
When it comes to selecting data for reporting, would it be wise to use a SNAPSHOT isolation level within a transaction to select the data, or NOLOCK/READ UNCOMITTED? Would creating a SQLTransaction around the select statement cause the insert to block still? At the moment I am not wrapping my SQLCommand instance in a transaction, though I realise this will still cause locking regardless.
Ideally I'd like an outcome where the writes are never blocked, and the dashboards are as responsive as possible. What is my best play?
Post your query
In theory a select should not be blocking inserts.
By default a select only takes a shared lock.
Shared locks are acquired during read operations automatically and prevent the user from modifying data.
This should not block inserts to otherTable or joinTable
select otherTable.*, joinTable.*
from otherTable
join joinTable
on otherTable.jionID = joinTable.ID
But it does have the overhead of acquiring a read lock (it does not know you don't update).
But if it is only fetching a few rows from joinTable then it should only be taking a few shared locks.
Post your query, query plan, and table definitions.
I suspect you have some weird stuff going on where it is taking a lot more locks than it needs.
It may be taking lock on each row or it may be escalating to page lock or table lock.
And look at the inserts. Is it taking some crazy locks it does not need to.