I am having a problem and would like some opinions on the best approach to accomplish this task.
The process I am implementing is a purge solution for large databases. The tables can have anywhere from a few thousand rows to many millions. The process will delete only the data for a specific year. Partitioning is not an option here.
I have some initial key tables chosen to start the process, and from these tables I recursively delete data using stored procedures built for this purpose (to maintain data integrity and delete only the necessary data).
I am using SSIS to start the data deletion in multiple tables at the same time, but I am having problems with locks.
The deletion process has the following steps:
Drop all indexes on table
Delete data in batches
Rebuild the primary key indexes (this is needed for performance, because some tables have non-clustered primary keys)
I have tried:
Using transactions and named transactions (still have locks)
Using sp_getapplock (still have locks)
In SSIS I have serializable as the default isolation for the transactions.
My main question: because I have to do DML and DDL in the same transaction, and this will take some time to execute, I want any task that needs a given lock to wait until the lock is released rather than be chosen as a deadlock victim (much like a mutex).
The database will have no activity during this operation, only this process will be executed.
This is the code I use to delete the data, and here is where the deadlocks are coming from.
select @query = N'
BEGIN TRANSACTION;
DECLARE @result int;
EXEC @result = sp_getapplock @Resource = ''[dbo].'+@dep_tbl_nm+''', @LockMode = ''Exclusive'', @LockOwner = ''Session'', @LockTimeout = -1;
EXEC CreateIndexes ''dbo.'+@dep_tbl_nm+'''
EXEC CreateCoveringIndexes '''+@tbl_nm+''','''+@dep_tbl_nm+''','''+@dep_tmp_tbl_nm+'''
WHILE(1=1)
BEGIN
DELETE TOP(50000) FROM a OUTPUT deleted.* into '+@dep_tmp_tbl_nm+' FROM dbo.'+ @dep_tbl_nm + ' AS A INNER JOIN '+@tmp_tbl_nm+' AS B ON ' + @on_list +'
if(@@ROWCOUNT = 0)
break;
END
exec (''ALTER INDEX ALL ON '+@dep_tbl_nm+' REBUILD WITH (FILLFACTOR = 80)'')
COMMIT TRANSACTION;'
print(@query)
exec(@query)
Comments are welcome
Regards
Related
I was working mostly on PostgreSQL, but recently I was assigned to a project with SQL Server and encountered very strange behavior of this engine. I am using a transaction in my code and connect to the server via the System.Data.SqlClient library. The code in the transaction is approximately 1000 lines long, so I would rather not copy it here, but the transaction is handled via the code below:
using (var transaction = connection.BeginTransaction(IsolationLevel.ReadCommitted))
{
//here code goes
//1. inserting new table metadata via inserts and updates
//2. creating new tables according to users project
//3. post execute actions via inserts and updates
//here is intended transaction freeze
await Task.Delay(1000 * 60 * 2);
}
During this execution I cannot perform any operation on the database (query execution in SSMS or code execution in the application - it doesn't matter). Simple selects, e.g. SELECT * FROM "TableA", hang; retrieving database properties in SSMS hangs, etc. Any independent query waits for this one transaction to complete.
I found several articles and answers here on SO, and based on those I tried following solutions:
Use WITH (NOLOCK) or WITH (READPAST) in SELECT statement
Changing the database property Is Read Committed Snapshot On to true
Changing transaction isolation level in code (all possible levels were tested)
None of the above solutions works.
I tested on 3 different computers: desktop, two laptops - the same behavior (SqlServer and SSMS was installed with default options).
In this thread: Understanding locking behavior in SQL Server there is some explanation of transaction isolation levels and locks, but the problem is that WITH (NOLOCK) doesn't work for me, as mentioned in 1).
This is a very big problem for me, because my asynchronous application effectively works synchronously because of these locks.
Oracle and PostgreSQL databases work perfectly fine; the problem concerns SQL Server only.
I don't use EntityFramework - I handle connections myself.
Windows 10 Pro
Microsoft SQL Server Developer (64-bit) version 15.0.2000.5
System.Data.SqlClient version 4.8.3
.NET 6.0
Any clues?
Update 1:
As pointed out in the comments, I do indeed have schema changes in my transaction - CREATE TABLE and ALTER TABLE statements mixed with standard UPDATEs and SELECTs. My app allows the user to create their own tables (with limited functionality), and when such a table is registered in a metadata table via INSERT, there are some CREATE statements to adjust the table structure.
Update 2:
I can perform SELECT * FROM sys.dm_tran_locks
I executed DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
The problem remains.
The cause of the locking issue is DDL (CREATE TABLE, etc.) within a transaction. This will acquire and hold restrictive locks on system table meta-data and block other database activity that needs access to object meta-data until the transaction is committed.
This is an app design problem, as one should not routinely execute DDL in application code. If that design cannot be easily remediated, perform the DDL operations separately in a short transaction (or with autocommit statements without an explicit transaction) and handle DDL rollback in code.
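A minimal sketch of that separation, assuming a hypothetical user-defined table name: the DDL commits in its own short transaction (releasing the metadata locks), and any later failure of the data work is compensated in code rather than rolled back together with the DDL.
BEGIN TRY
    BEGIN TRANSACTION;
    CREATE TABLE dbo.UserDefinedTable_123   -- hypothetical name
    (
        Id      int           NOT NULL PRIMARY KEY,
        Payload nvarchar(200) NULL
    );
    COMMIT TRANSACTION;   -- schema locks on object metadata are released here
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;

-- If the subsequent (long) data transaction fails, compensate in application code:
-- DROP TABLE IF EXISTS dbo.UserDefinedTable_123;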
You can use this useful stored procedure, which I picked up somewhere along my travels. It recently helped me see the locking on a table and showed that after setting READ UNCOMMITTED it was no longer taking row/page/table locks, but still held the schema lock. I believe you may have schema locks if you are modifying schemas! Also, as commented, don't keep a transaction open for long; in and out is key.
The snippet below runs the stored procedure once per second, 20 times, so you get a snapshot of locking. A really useful stored procedure to remember.
EXEC [dbo].CheckLocks #Database = 'PriceBook'
WAITFOR DELAY '00:00:01'
GO 20
The stored proc is as follows
/*
This script can be run to find locks at the current time
We can run it as follows:
EXEC [dbo].CheckLocks #Database = 'PriceBook'
WAITFOR DELAY '00:00:01'
GO 10
*/
CREATE OR ALTER PROCEDURE [dbo].[CheckLocks]
#Database NVARCHAR(256)
AS
BEGIN
-- Get the sp_who details
IF object_id('tempdb..#WhoDetails') IS NOT NULL
BEGIN
DROP TABLE #WhoDetails
END
CREATE TABLE #WhoDetails (
[spid] INT,
[ecid] INT,
[status] VARCHAR(30),
[loginame] VARCHAR(128),
[hostname] VARCHAR(128),
[blk] VARCHAR(5),
[dbname] VARCHAR(128),
[cmd] VARCHAR(128),
[request_id] INT
)
INSERT INTO #WhoDetails EXEC sp_who
-- Get the sp_lock details
IF object_id('tempdb..#CheckLocks') IS NOT NULL
BEGIN
DROP TABLE #CheckLocks
END
CREATE TABLE #CheckLocks (
[spid] int,
[dbid] int,
[ObjId] int,
[IndId] int,
[Type] char(4),
[Resource] nchar(32),
[Mode] char(8),
[Status] char(6)
)
INSERT INTO #CheckLocks EXEC sp_lock
SELECT DISTINCT
W.[loginame],
L.[spid],
L.[dbid],
db_name(L.dbid) AS [Database],
L.[ObjId],
object_name(objID) AS [ObjectName],
L.[IndId],
L.[Type],
L.[Resource],
L.[Mode],
L.[Status]--,
--ST.text,
--IB.event_info
FROM #CheckLocks AS L
INNER JOIN #WhoDetails AS W ON W.spid = L.spid
INNER JOIN sys.dm_exec_connections AS EC ON EC.session_id = L.spid
--CROSS APPLY sys.dm_exec_sql_text(EC.most_recent_sql_handle) AS ST
--CROSS APPLY sys.dm_exec_input_buffer(EC.session_id, NULL) AS IB -- get the code that the session of interest last submitted
WHERE L.[dbid] != db_id('tempdb')
AND L.[Type] IN ('PAG', 'EXT', 'TAB')
AND L.[dbid] = db_id(#Database)
/*
https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-lock-transact-sql?view=sql-server-ver15
Lock modes are as follows
------------------------------
S = Shared
U = Update
X = Exclusive
IS = Intent Shared
IU = Intent Update
IX = Intent Exclusive
Sch-S = Schema stability lock, so we can't remove tables or indexes that are in use
Lock Type are as follows:
------------------------------
RID = Single row lock
KEY = Lock within an index that protects a range of keys
PAG = Page level lock
EXT = Extent lock
TAB = Table Lock
DB = Database lock
*/
END
This is what you might see if you can catch the locking; it was a before-and-after example, left and right.
Below is an outline of a SQL execution framework design using Service Broker that I have been playing with. I've outlined the process, asked some questions throughout (highlighted using block quotes), and would be interested in hearing any advice on the design.
Overview
I have an ETL operation that needs to take data out of 5 databases and move it into 150 using select/insert statements or stored procedures. The result is about 2,000 individual queries, each taking between 1 second and 1 hour.
Each SQL query inserts data only. There is no need for data to be returned.
The operation can be broken up into 3 steps:
Pre-ETL
ETL
Post-ETL
The queries in each step can be executed in any order, but the steps have to stay in order.
Method
I am using Service Broker for asynchronous/parallel execution.
Any advice on how to tune Service Broker (e.g. specific options to look at, or guidance for setting the number of queue readers)?
Service Broker Design
Initiator
The initiator sends an XML message containing the SQL query to the Unprocessed queue, with an activation stored procedure called ProcessUnprocessedQueue. This process is wrapped in a try/catch in a transaction, rolling back the transaction when there is an exception.
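Roughly, the initiator step could look like the sketch below; the service, contract, and message type names are placeholders, not the actual broker objects in this design.
DECLARE @h   uniqueidentifier,
        @msg xml = N'<query step="ETL"><sql>INSERT INTO dbo.Target SELECT * FROM dbo.Source;</sql></query>';

BEGIN TRY
    BEGIN TRANSACTION;

    BEGIN DIALOG CONVERSATION @h
        FROM SERVICE [InitiatorService]      -- placeholder names
        TO SERVICE 'UnprocessedService'
        ON CONTRACT [EtlContract]
        WITH ENCRYPTION = OFF;

    SEND ON CONVERSATION @h
        MESSAGE TYPE [EtlQuery] (@msg);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;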
ProcessUnprocessedQueue
ProcessUnprocessedQueue passes the XML to procedure ExecSql
ExecSql - SQL Execution and Logging
ExecSql then handles the SQL execution and logging:
The XML is parsed, along with any other data about the execution that is going to be logged
Before the execution, a logging entry is inserted
If the transaction is started in the initiator, can I ensure the log entry insert is always committed, even if the outer transaction in the initiator is rolled back?
Something like SAVE TRANSACTION is not valid here, correct?
Should I not manipulate the transaction here, execute the query in a try/catch and, if it goes to the catch, insert a log entry for the exception and throw the exception since it is in the middle of the transaction?
The query is executed
Alternative Logging Solution?
I need to log:
The SQL query executed
Metadata about the operation
The time it takes for each process to finish
This is why I insert one row at the start and one at the end of the process
Any exceptions, if they exist
Would it be better to have an In-Memory OLTP table that contains the query information? So I would INSERT a row before the start of an operation and then do an UPDATE or INSERT to log exceptions and execution times. After the batch is done, I would archive the data into a disk-based table to prevent the In-Memory table from getting too big.
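For reference, a hypothetical disk-based shape of that log table (column names are illustrative only, not part of the design above):
CREATE TABLE dbo.EtlExecutionLog
(
    LogId               bigint IDENTITY(1,1) PRIMARY KEY,
    ConversationGroupId uniqueidentifier NULL,
    QueryText           nvarchar(max)    NOT NULL,
    EventType           varchar(10)      NOT NULL,   -- 'Start', 'End' or 'Error'
    ErrorMessage        nvarchar(4000)   NULL,
    LoggedAt            datetime2(3)     NOT NULL DEFAULT SYSUTCDATETIME()
);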
ProcessUnprocessedQueue - Manually process the results
After the execution, ProcessUnprocessedQueue gets back an updated version of the XML (to determine if the execution was successful, or other data about the transaction, for post-processing) and then sends that message to the ProcessedQueue, which does not have an activation procedure, so it can be manually processed (I need to know when a batch of queries has finished executing).
Processing the Queries
Since the ETL can be broken out into 3 steps, I create 3 XML variables where I will add all of the queries that are needed in the ETL operation, so I will have something like this:
#preEtlQueue xml
200 queries
#etlQueue xml
1500 queries
#postEtlQueue xml
300 queries
Why XML?
The XML queue variable is passed between different stored procedures as an OUTPUT parameter that updates its values and/or adds SQL queries to it. This variable needs to be written and read, so an alternative could be something like a global temp table or a persistent table.
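For illustration, one possible shape of such an XML variable (element and attribute names here are just an example):
DECLARE @etlQueue xml = N'
<queries step="ETL">
  <query messageType="EtlQuery" toService="UnprocessedService">
    <sql>INSERT INTO dbo.Target SELECT * FROM dbo.Source WHERE LoadYear = 2020;</sql>
  </query>
</queries>';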
I then process the XML variables:
Use a cursor to loop through the queries and send them to the service broker service.
Each group of queries contained in the XML variable is sent under the same conversation_group_id.
Values such as the to/from service, message type, etc. are all stored in the XML variable.
After the messages are sent to Service Broker, use a while loop to continuously check the ProcessedQueue until all the messages have been processed.
This implements a timeout to avoid an infinite loop (a sketch of the wait loop follows below)
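Roughly, the wait loop looks like this, assuming the caller tracks how many messages it sent and that the queue is named ProcessedQueue:
DECLARE @remaining int = 2000,            -- assumption: number of messages sent in this batch
        @handle    uniqueidentifier,
        @body      xml;

WHILE @remaining > 0
BEGIN
    SET @handle = NULL;

    WAITFOR (
        RECEIVE TOP (1)
            @handle = conversation_handle,
            @body   = CAST(message_body AS xml)
        FROM dbo.ProcessedQueue
    ), TIMEOUT 30000;                     -- stop waiting after 30 seconds

    IF @handle IS NULL
        BREAK;                            -- timeout hit: avoid an infinite loop

    SET @remaining = @remaining - 1;
    -- inspect @body here to record per-query success/failure
END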
I'm thinking of redesigning this. Should I add an activation procedure on ProcessedQueue and then have that procedure insert the processed results into a physical table? If I do it this way, I wouldn't be able to use RECEIVE instead of a WHILE loop to check for processed items. Does that have any disadvantages?
I haven't built anything as massive as what you are doing now, but I will give you what worked for me, and some general opinions...
My preference is to avoid In-Memory OLTP and write everything to durable tables and keep the message queue as clean as possible
Use fastest possible hard drives in the server, write speed equivalent of NVMe or faster with RAID 10 etc.
I grab every message off the queue as soon as it hits and write it to a table I have named "mqMessagesReceived" (see code below, my all-purpose MQ handler named mqAsyncQueueMessageOnCreate)
I use a trigger on the "mqMessagesReceived" table that does a lookup to find which stored procedure to execute to process each unique message (see code below)
Each message has an identifier (in my case, I'm using the originating table name that wrote the message to the queue), and this identifier is used as a key for a lookup query run inside the trigger of the mqMessagesReceived table, to figure out which subsequent stored procedure needs to be run to process each received message correctly.
Before sending a message on the MQ,
you can build a generic variable from the calling side (e.g. if a trigger is putting messages onto the MQ):
SELECT @tThisTableName = OBJECT_NAME(parent_object_id) FROM sys.objects
WHERE sys.objects.name = OBJECT_NAME(@@PROCID)
AND SCHEMA_NAME(sys.objects.schema_id) = OBJECT_SCHEMA_NAME(@@PROCID);
A configuration table holds the lookup data for matching a table name with the stored procedure that needs to be run to process the MQ data that arrived and was written to the mqMessagesReceived table.
Here is the definition of that lookup table
CREATE TABLE [dbo].[mqMessagesConfig](
[ID] [int] IDENTITY(1,1) NOT NULL,
[tSourceTableReceived] [nvarchar](128) NOT NULL,
[tTriggeredStoredProcedure] [nvarchar](128) NOT NULL,
CONSTRAINT [PK_mqMessagesConfig] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Here is the activation stored procedure that gets run as a message hits the queue
CREATE PROCEDURE [dbo].[mqAsyncQueueMessageOnCreate]
AS
BEGIN
SET NOCOUNT ON
DECLARE
@h UNIQUEIDENTIFIER,
@t sysname,
@b varbinary(200),
@hand VARCHAR(36),
@body VARCHAR(2000),
@sqlcleanup nvarchar(MAX)
-- Get all of the messages on the queue
-- the WHILE loop is infinite, until BREAK is received when we get a null handle
WHILE 1=1
BEGIN
SET @h = NULL
--Note the semicolon..!
;RECEIVE TOP(1)
@h = conversation_handle,
@t = message_type_name,
@b = message_body
FROM mqAsyncQueue
--No message found (handle is now null)
IF @h IS NULL
BEGIN
-- all messages are now processed, but we still have the @hand variable saved from processing the last message
SET @sqlcleanup = 'EXEC [mqConversationsClearOne] @handle = N' + char(39) + @hand + char(39) + ';';
EXECUTE(@sqlcleanup);
BREAK
END
--mqAsyncMessage message type received
ELSE IF @t = 'mqAsyncMessage'
BEGIN
SET @hand = CONVERT(varchar(36),@h);
SET @body = CONVERT(varchar(2000),@b);
INSERT mqMessagesReceived (tMessageType, tMessageBody, tMessageBinary, tConversationHandle)
VALUES (@t, @body, @b, @hand);
END
--unknown message type was received that we dont understand
ELSE
BEGIN
INSERT mqMessagesReceived (tMessageBody, tMessageBinary)
VALUES ('Unknown message type received', CONVERT(varbinary(MAX), 'Unknown message type received'))
END
END
END
CREATE PROCEDURE [dbo].[mqConversationsClearOne]
@handle varchar(36)
AS
-- Note: you can check the queue by running this query
-- SELECT * FROM sys.conversation_endpoints
-- SELECT * FROM sys.conversation_endpoints WHERE NOT([State] = 'CO')
-- CO = conversing [State]
DECLARE @getid CURSOR
,@sql NVARCHAR(MAX)
,@conv_id NVARCHAR(100)
,@conv_handle NVARCHAR(100)
-- want to create a chain of statements like this, one per conversation
-- END CONVERSATION 'FE851F37-218C-EA11-B698-4CCC6AD00AE9' WITH CLEANUP;
-- END CONVERSATION 'A4B4F603-208C-EA11-B698-4CCC6AD00AE9' WITH CLEANUP;
SET @getid = CURSOR FOR
SELECT [conversation_id], [conversation_handle]
FROM sys.conversation_endpoints
WHERE conversation_handle = @handle;
OPEN @getid
FETCH NEXT
FROM @getid INTO @conv_id, @conv_handle
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = 'END CONVERSATION ' + char(39) + @conv_handle + char(39) + ' WITH CLEANUP;'
EXEC sys.sp_executesql @stmt = @sql;
FETCH NEXT
FROM @getid INTO @conv_id, @conv_handle --, @conv_service
END
CLOSE @getid
DEALLOCATE @getid
and the table named "mqMessagesReceived" has this trigger
CREATE TRIGGER [dbo].[mqMessagesReceived_TriggerUpdate]
ON [dbo].[mqMessagesReceived]
AFTER INSERT
AS
BEGIN
DECLARE
@strMessageBody nvarchar(4000),
@strSourceTable nvarchar(128),
@strSourceKey nvarchar(128),
@strConfigStoredProcedure nvarchar(4000),
@sqlRunStoredProcedure nvarchar(4000),
@strErr nvarchar(4000)
SELECT @strMessageBody = ins.tMessageBody FROM INSERTED ins;
SELECT @strSourceTable = (select txt_Value from dbo.fn_ParseText2Table(@strMessageBody,'|') WHERE Position=2);
SELECT @strSourceKey = (select txt_Value from dbo.fn_ParseText2Table(@strMessageBody,'|') WHERE Position=3);
-- look in mqMessagesConfig to find the name of the final stored procedure
-- to run against the SourceTable
-- e.g. @strConfigStoredProcedure = mqProcess-tblLabDaySchedEventsMQ
SELECT @strConfigStoredProcedure =
(select tTriggeredStoredProcedure from dbo.mqMessagesConfig WHERE tSourceTableReceived = @strSourceTable);
SET @sqlRunStoredProcedure = 'EXEC [' + @strConfigStoredProcedure + '] @iKey = ' + @strSourceKey + ';';
EXECUTE(@sqlRunStoredProcedure);
INSERT INTO [mqMessagesProcessed]
(
[tMessageBody],
[tSourceTable],
[tSourceKey],
[tTriggerStoredProcedure]
)
VALUES
(
@strMessageBody,
@strSourceTable,
@strSourceKey,
@sqlRunStoredProcedure
);
END
Also, here is some general SQL Server tuning advice that I found I also had to apply (for dealing with a busy database).
By default there is just one single TempDB file per SQL Server, and TempDB has an initial size of 8MB.
However, TempDB gets reset back to the initial 8MB size every time the server reboots, and this company was rebooting the server every weekend via cron/taskscheduler.
The problem we saw was a slow database and lots of record locks, but only first thing Monday morning when everyone was hammering the database at once as they began their work week.
When TempDB gets automatically re-sized, it is "locked" and therefore nobody at all can use that single TempDB (which is why the SQL Server was regularly becoming non-responsive).
By Friday the TempDB had grown to over 300MB.
So, following the standard best-practice recommendation, I created one TempDB file per vCPU (8 TempDB files), distributed them across the two available hard drives on that server and, most importantly, set their initial size to more than we need (200MB each is what I chose).
This fixed the problem with the SQL Server slowdown and record locking that was experienced every Monday morning.
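For reference, the change was along these lines; the logical file names, paths, and sizes are examples to adjust per server:
USE master;
GO
-- pre-size the existing primary tempdb data file
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev,  SIZE = 200MB, FILEGROWTH = 64MB);
-- add extra data files, one per vCPU, spread across the available drives
ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'D:\TempDB\tempdev2.ndf', SIZE = 200MB, FILEGROWTH = 64MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = 'E:\TempDB\tempdev3.ndf', SIZE = 200MB, FILEGROWTH = 64MB);
-- ...repeat up to the number of files decided on (8 in this case)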
I have a database table with thousands of entries. I have multiple worker threads which pick up one row at a time, does some work (takes roughly one second each). While picking up the row, each thread updates a flag on the database row (like a timestamp) so that the other threads do not pick it up. But the problem is that I end up in a scenario where multiple threads are picking up the same row.
My general question is: what design approach should I follow here to ensure that each thread picks up unique rows and does its task independently?
Note: multiple threads run in parallel to speed up processing of the database rows, so I would like the critical section or exclusive lock to be as small as possible.
Just to give some context, below is the stored proc which picks up the rows from the table after it has updated the flag on the row. Please note that the stored proc is not compilable as I have removed unnecessary portions from it. But generally that's the structure of it.
The problem happens when multiple threads execute the stored proc in parallel. The change made by the UPDATE statement (note that the update is done after taking a lock) in one thread is not visible to another thread unless the transaction is committed. And as there is a SELECT statement (which takes around 50ms) between the UPDATE and the COMMIT TRANSACTION, in about 20% of cases the UPDATE statement in a thread picks up a row which has already been processed.
I hope I am clear enough here.
USE [mydatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[GetRequest]
AS
BEGIN
-- some variable declaration here
BEGIN TRANSACTION
-- check if there are blocking rows in the request table
-- FM: Remove records that don't qualify for operation.
-- delete operation on the table to remove rows we don't want to process
delete FROM request where somecondition = 1
-- Identify the requests to process
DECLARE @TmpTableVar table(TmpRequestId int NULL);
UPDATE TOP(1) request
WITH (ROWLOCK)
SET Lock = DateAdd(mi, 5, GETDATE())
OUTPUT INSERTED.ID INTO @TmpTableVar
FROM request tur
WHERE (Lock IS NULL OR GETDATE() > Lock) -- not locked or lock expired
AND GETDATE() > NextRetry -- next in the queue
IF(@@ROWCOUNT = 0)
BEGIN
ROLLBACK TRANSACTION
RETURN
END
select @RequestID = TmpRequestId from @TmpTableVar
-- Get details about the request that has been just updated
SELECT somerows
FROM request
WHERE somecondition = 1
COMMIT TRANSACTION
END
The analog of a critical section in SQL Server is sp_getapplock, which is simple to use. Alternatively you can SELECT the row to update with (UPDLOCK,READPAST,ROWLOCK) table hints. Both of these require a multi-statement transaction to control the duration of the exclusive locking.
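A sketch of the hinted-SELECT variant against the question's request table (column names are taken from the question; the detail query stays as a comment):
BEGIN TRANSACTION;

DECLARE @RequestID int;

-- UPDLOCK reserves the row until COMMIT, READPAST skips rows other threads
-- have already reserved, ROWLOCK keeps the lock granularity small.
SELECT TOP (1) @RequestID = ID
FROM dbo.request WITH (UPDLOCK, READPAST, ROWLOCK)
WHERE (Lock IS NULL OR GETDATE() > Lock)
  AND GETDATE() > NextRetry;

IF @RequestID IS NOT NULL
BEGIN
    UPDATE dbo.request
    SET Lock = DATEADD(MINUTE, 5, GETDATE())
    WHERE ID = @RequestID;

    -- the ~50ms detail SELECT from the question goes here, filtered to @RequestID only
END

COMMIT TRANSACTION;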
You need to set a transaction isolation level in SQL to isolate your row, but this can impact your performance.
Look at the sample:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
GO
SELECT ID, NAME, FLAG FROM SAMPLE_TABLE WHERE FLAG=0
GO
UPDATE SAMPLE_TABLE SET FLAG=1 WHERE ID=1
GO
COMMIT TRANSACTION
To finish: there is no single best isolation level. You need to analyze the positive and negative points of each isolation level and test your system's performance.
More information:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
http://www.besttechtools.com/articles/article/sql-server-isolation-levels-by-example
https://en.wikipedia.org/wiki/Isolation_(database_systems)
EDITED: I have a table with a composite key which is being used by multiple Windows services deployed on multiple servers.
Columns:
UserId (int) [CompositeKey],
CheckinTimestamp (bigint) [CompositeKey],
Status (tinyint)
There will be continuous insertion into this table. I want my Windows service to select the top 10000 rows and do some processing while locking only those 10000 rows. I am using ROWLOCK for this in the stored procedure below:
ALTER PROCEDURE LockMonitoringSession
AS
BEGIN
BEGIN TRANSACTION
SELECT TOP 10000 * INTO #TempMonitoringSession FROM dbo.MonitoringSession WITH (ROWLOCK) WHERE [Status] = 0 ORDER BY UserId
DECLARE @UserId INT
DECLARE @CheckinTimestamp BIGINT
DECLARE SessionCursor CURSOR FOR SELECT UserId, CheckinTimestamp FROM #TempMonitoringSession
OPEN SessionCursor
FETCH NEXT FROM SessionCursor INTO @UserId, @CheckinTimestamp
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE dbo.MonitoringSession SET [Status] = 1 WHERE UserId = @UserId AND CheckinTimestamp = @CheckinTimestamp
FETCH NEXT FROM SessionCursor INTO @UserId, @CheckinTimestamp
END
CLOSE SessionCursor
DEALLOCATE SessionCursor
SELECT * FROM #TempMonitoringSession
DROP TABLE #TempMonitoringSession
COMMIT TRANSACTION
END
But by doing so, dbo.MonitoringSession is locked until the stored procedure ends. I am not sure what I am doing wrong here.
The only purpose of this stored procedure is to select and update 10000 recent rows, without any primary key, while ensuring that the whole table is not locked, because multiple Windows services are accessing this table.
Thanks in advance for any help.
(not an answer but too long for comment)
The purpose description should explain why/what for you are updating the whole table. Your SP updates all rows with Status = 0 to set Status = 1. So when one of your services decides to run this SP, all those rows become non-relevant. I mean, logically the event that causes the status change has already occurred; you just need some time to physically change it in the database. So why do you want other services to read non-relevant rows? OK, probably you need to read the rows that are still available (not yet changed), but again it's not clear, because you are updating the whole table.
You may use the READPAST hint to skip locked rows, and you need row locks for that.
OK, but even when processing only the top N rows, updating those N rows with one statement would be much faster than looping through that number of rows; you are doing the same job, just manually (see the sketch after the link below).
Check out example of combining UPDLOCK + READPAST to process same queue with parallel processes: https://www.mssqltips.com/sqlservertip/1257/processing-data-queues-in-sql-server-with-readpast-and-updlock/
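A sketch of that single-statement approach against the question's MonitoringSession table; the hint combination follows the link above, and note that UPDATE TOP does not guarantee an order:
DECLARE @Picked TABLE (UserId int, CheckinTimestamp bigint);

UPDATE TOP (10000) ms
SET [Status] = 1
OUTPUT inserted.UserId, inserted.CheckinTimestamp INTO @Picked
FROM dbo.MonitoringSession AS ms WITH (ROWLOCK, UPDLOCK, READPAST)
WHERE [Status] = 0;

-- rows claimed by this service; other services skip them thanks to READPAST
SELECT UserId, CheckinTimestamp FROM @Picked;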
Small hint: a STATIC, READ_ONLY, FORWARD_ONLY cursor would do the same thing as storing to a temp table. Review the STATIC option:
https://msdn.microsoft.com/en-us/library/ms180169.aspx
Another suggestion is to think about RCSI (read committed snapshot isolation). This will certainly avoid other services blocking, but it is a database-level option, so you'll have to test all your functionality. Most of it will work the same as before, but some scenarios need testing (concurrent transactions won't be blocked in situations where they were blocked before).
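Enabling it is a one-line, database-level change (the database name is a placeholder; the statement needs a moment of exclusive access):
ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;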
Not clear to me:
what is the percentage of 10000 out of the total number of rows?
is there a clustered index or this is a heap?
what is actual execution plan for select and update?
what are concurrent transactions: inserts or selects?
By the way, I discovered a similar question:
why the entire table is locked while "with (rowlock)" is used in an update statement
We're currently working on the following process whose goal is to move data between 2 sets of database servers while maintaining FK's and handling the fact that the destination tables already have rows with overlapping identity column values:
Extract a set of rows from a "root" table and all of its children tables' FK associated data n-levels deep along with related rows that may reside in other databases on the same instance from the source database server.
Place that extracted data set into a set of staging tables on the destination database server.
Rekey the data in the staging tables by reserving a block of identities for the destination tables and updating all related child staging tables (each of these staging tables will have the same schema as the source/destination table, with the addition of a "lNewIdentityID" column).
Insert the data with its new identity into the destination tables in correct order (option SET IDENTITY_INSERT 'desttable' ON will be used obviously).
I'm struggling with the block reservation portion of this process (#3). Our system is pretty much a 24-hour system except for a short weekly maintenance window. Management needs this process to NOT have to wait each week for the maintenance window to migrate data between servers. That being said, I may have 100 insert transactions competing with our migration process while it is on #3. Below is my attempt at reserving the block of identities, but I'm worried that between "SET @newIdent..." and "DBCC CHECKIDENT..." an insert transaction will complete and the migration process won't have a "clean" block of identities in a known range that it can use to rekey the staging data.
I essentially need to lock the table, get the current identity, increase the identity, and then unlock the table. I don't know how to do that in T-SQL and am looking for ideas. Thank you.
IF EXISTS (SELECT TOP 1 1 FROM sys.procedures WHERE [name]='DataMigration_ReserveBlock')
DROP PROC DataMigration_ReserveBlock
GO
CREATE PROC DataMigration_ReserveBlock (
@tableName varchar(100),
@blockSize int
)
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
DECLARE @newIdent bigint;
SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
SELECT @newIdent AS NewIdentity;
END
GO
DataMigration_ReserveBlock 'tblAddress', 1234
You could wrap it in a transaction
BEGIN TRANSACTION
...
COMMIT
It should be fast enough not to cause problems with your other insert processes, though it would be a good idea to include try/catch logic so you can roll back if problems do occur; a sketch follows below.
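A sketch of what that could look like. The TABLOCKX select is my own assumption (it blocks concurrent inserts on the target table until the COMMIT, so the reserved range stays clean), and the dynamic SQL trusts @tableName just like the original procedure does:
CREATE OR ALTER PROC DataMigration_ReserveBlock (
    @tableName varchar(100),
    @blockSize int
)
AS
BEGIN
    SET XACT_ABORT ON;
    DECLARE @newIdent bigint;
    BEGIN TRY
        BEGIN TRANSACTION;
        -- exclusively lock the table so no insert can consume identities mid-reservation
        EXEC (N'SELECT TOP (0) 1 FROM ' + @tableName + N' WITH (TABLOCKX);');
        SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
        DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
        COMMIT TRANSACTION;   -- lock released here; keep this window short
        SELECT @newIdent AS NewIdentity;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        THROW;
    END CATCH
END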