I will explain my scenario with an example.
I have multiple rows in my table and I am picking them one by one for processing. I need to lock each row while it is being processed.
Sample code looks like,
select top 1 * into #open_order from orders with (xlock)
where status = 'open' order by order_time;
EDIT: Added an ORDER BY clause to the query.
My requirement is to run this from parallel connections. My problem is that I cannot run this code on multiple connections in parallel; the second one waits until the first one commits its transaction.
Is there any way to exclude already locked rows from this select query?
I have come across WITH (READPAST), but I don't know whether it can be combined with XLOCK or not.
EDIT: Sample data and expectation.
Orders table data:
id, order_time, status, remark
1, 2019-01-01 00:00:01, 'open', 'Sample 1'
2, 2019-01-02 00:00:01, 'open', 'Sample 2'
3, 2019-01-03 00:00:01, 'open', 'Sample 1'
If the first row is locked, I expect the query to return the second row.
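For what it's worth, READPAST does combine with other locking hints. A minimal sketch of the usual queue-table pattern, assuming the orders table above and that each worker keeps its transaction open while it processes the claimed row (UPDLOCK is the more common choice here, though XLOCK should also work):

begin transaction;

-- READPAST skips rows that other sessions have locked,
-- so each connection claims a different open order;
-- UPDLOCK + ROWLOCK hold the claimed row until commit.
select top 1 *
into #open_order
from orders with (updlock, rowlock, readpast)
where status = 'open'
order by order_time;

-- ... process the claimed order, then:
commit;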
Related
I have a cursor in an Oracle database which fetches thousands of rows in sorted order, but I actually need only the first row (i.e. the oldest one). The loop is designed so that it processes the first row and exits; the cursor is then opened again to fetch the remaining rows. My question: if I use FETCH FIRST 1 ROWS ONLY in the cursor, could it really help improve performance?
Basically I want to know which of the two queries below is more efficient in terms of performance:
Query 1:
select a.col1,a.col2,a.col3,a.rowid rid,a.col4
from table1 a, table2 b
where a.status = 'N'
and b.col1 = 1
and b.col2 = a.col5
order by insert_time;
Query 2:
select a.col1,a.col2,a.col3,a.rowid rid,a.col4
from table1 a, table2 b
where a.status = 'N'
and b.col1 = 1
and b.col2 = a.col5
order by insert_time
fetch first 1 rows only;
Letting the database know your "intentions" (e.g. "I only want the first x rows") can be critical to performance. For example, normal sorting operations store the entire result set in memory or on disk in temporary tablespace. But with the FETCH clause Oracle knows it only has to track the top N rows and can use significantly less memory for sorting.
Here's a complete video walkthrough of why, including demos and the impact on response time, memory, and performance:
https://www.youtube.com/watch?v=rhOVF82KY7E
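As a sketch of how that could look in the cursor itself (assuming Oracle 12c or later, the tables from the question, and whatever processing your existing loop does):

declare
  cursor c_oldest is
    select a.col1, a.col2, a.col3, a.rowid rid, a.col4
    from table1 a, table2 b
    where a.status = 'N'
    and b.col1 = 1
    and b.col2 = a.col5
    order by insert_time
    fetch first 1 rows only;
begin
  for r in c_oldest loop
    null;  -- process the single oldest row here
  end loop;
end;
/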
Cursor processing is slow; if you can process the data in set-based SQL instead of using a cursor, then do it in SQL.
What happens if you have more than one row to process? Will you still process each row by going through the cursor one more time?
Thanks
I was trying to understand ROWLOCK in SQL Server, to update a record after locking it. Here is my observation, and I would like confirmation of whether ROWLOCK behaves like a table or page lock, or whether I have not tried it correctly. ROWLOCK should lock only the row, not the table or page.
Here is what I tried:
I created a simple table, row_lock_temp_test, with two columns, ID and Name, and no PK or index. Then I opened two different SQL Server clients with the same credentials and executed a set of queries as follows:
Client 1:
1: BEGIN TRANSACTION;
2: update row_lock_temp_test set name = 'CC' where id = 2
3: COMMIT
Client 2:
1: BEGIN TRANSACTION;
2: update row_lock_temp_test set name= 'CC' where id = 2
3: COMMIT
I executed queries 1 and 2 on C-1, went to C-2 and executed the same queries; both clients executed the queries, and then I committed both transactions. All good.
Then I added ROWLOCK to the update query:
C-1
1: BEGIN TRANSACTION;
2: update row_lock_temp_test WITH(rowlock) set name = 'CC' where id = 2
3: COMMIT
C-2
1: BEGIN TRANSACTION;
2: update row_lock_temp_test WITH(rowlock) set name = 'CC' where id = 2
3: COMMIT
Now I executed queries 1 and 2 on C-1, then went to C-2 and tried to execute the same two queries, but the query got stuck, as expected, because the row is locked by C-1 and should queue until C-1's transaction commits. As soon as I committed the transaction on C-1, the query on C-2 executed, and then I committed the transaction on C-2 as well. All good.
Here I tried another scenario: executing the same set of queries, but with row id = 3 on C-2:
C-2
1: BEGIN TRANSACTION;
2: update row_lock_temp_test WITH(rowlock) set name = 'CC' where id = 3
3: COMMIT
I executed the first two queries in C-1 and then went and executed the first two queries of C-2. The row id is different in the two clients, but the query in C-2 still got stuck. This means that while updating the row with id = 2, it has locked the page or table; I was expecting a row lock, but it seems to be a page or table lock.
I also tried XLOCK, HOLDLOCK, and UPDLOCK in different combinations, but it always locks the table. Is there any way to lock only the row?
SELECT and INSERT work as expected.
Thanks in advance.
Lock hints are only hints. You can't "force" SQL to take a particular kind of lock.
You can see the locks being taken with the following query:
select tl.request_session_id,
tl.resource_type,
tl.request_mode,
tl.resource_description,
tl.request_status
from sys.dm_tran_locks tl
join sys.partitions pt on pt.hobt_id = tl.resource_associated_entity_id
join sys.objects ob on ob.object_id = pt.object_id
where tl.resource_database_id = db_id()
order by tl.request_session_id
OK, let's run some code in an SSMS query window:
create table t(i int, j int);
insert t values (1, 1), (2, 2);
begin tran;
update t with(rowlock) set j = 2 where i = 1;
Open a second SSMS window, and run this:
begin tran;
update t with(rowlock) set j = 2 where i = 2;
The second execution will be blocked. Why?
Run the locking query in a third window, and note that there are two rows with a resource_type of RID, one with a status of "grant", the other with a status of "wait". We'll get to the RID bit in a second. Also, look at the resource_description column for those rows. It's the same value.
OK, so what's a resource_description? It depends on the resource_type. But for our RID it represents the file id, then the page id, then the row id (also known as the slot); for example, 1:120:0 (hypothetical values) would mean file 1, page 120, slot 0. But why are both executions taking a lock on row slot 0? Shouldn't they be trying to lock different rows? After all, we are updating different rows.
David Browne has given the answer: in order to find the correct row to update, SQL has to scan the entire table, because there is no index telling it which rows match i = 1. It will take an update lock on each row as it scans through. Why does it take an update lock on each row? Well, it's not to "do" the update, so to speak; it will take an exclusive lock for that. Update locks are pretty much always taken to prevent deadlocks.
So, the first query has scanned through the rows, taking a U lock on each row. Of course, it found the row it wanted to update right away, in slot 0, and took an X lock. And it still has that X lock, because we haven't committed.
Then we started the second query, which also has to scan all of the rows to find the one it wants. It started off by trying to take the U lock on the first row, and was blocked. The X lock of our first query is blocking it.
So, you see, even with row locking, your second query is still blocked.
OK, let's roll back the queries and see what happens if we have the first query update the second row, and the second query update the first row. Does that work? Nope! Because SQL still has no way of knowing which rows match the predicate. So the first query takes its update lock on slot 0, sees that it doesn't have to update it, takes its update lock on slot 1, sees the correct value for i, takes its exclusive lock, and waits for us to commit.
Then query 2 comes along, takes the update lock on slot 0, sees the value it wants, takes its exclusive lock, updates the value, and then tries to take an update lock on slot 1, because that row might also match its predicate. There it blocks: the first query still holds its X lock on slot 1.
You'll also see "intent locks" on the next "level" up, i.e., the page. The operation is letting the rest of the engine know that it might want to escalate the lock to the page level at some point in the future. But that's not a factor here. Page locking is not causing the issue.
Solution in this case? Add an index on column i. In this case, that's probably the primary key. You can then do the updates in either order. Asking for row locking in this case makes no difference, because SQL doesn't know how many rows match the predicate. But even if you try to force a row lock in some situation, and even with a primary key or appropriate index, SQL can still choose to escalate the lock type, because it can be way more efficient to lock a whole page, or a whole table, than to lock and unlock individual rows.
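As a sketch of that fix on the demo table (PK columns must be NOT NULL, hence the extra step; the constraint name is arbitrary):

-- with a key on i, each update can seek directly to its row
-- instead of scanning, and U-locking, every row
alter table t alter column i int not null;
alter table t add constraint pk_t primary key (i);

After that, re-running the two-window demo should let both updates proceed independently, since each seek touches only its own row (subject, as noted above, to the engine choosing to escalate, or to scan a very small table anyway).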
I am trying to delete millions of records from 4 databases, and running into an unexpected error. I made a temp table that holds a list of all the id's I wish to delete:
CREATE TABLE #CaseList (case_id int)
INSERT INTO #CaseList
SELECT DISTINCT id
FROM my_table
WHERE <my criteria for choosing cases>
I have deleted all the associated records (with foreign key on case_id)
DELETE FROM image WHERE case_id in (SELECT case_id from #CaseList)
Then I'm deleting records from my_table in batches (so as not to blow up the transaction log, which, despite my database being in simple recovery mode, still grows when making changes like deletions):
DELETE FROM my_table WHERE id in (SELECT case_id
FROM #CaseList
ORDER by case_id
OFFSET 0 ROWS
FETCH NEXT 10000 ROWS ONLY)
This will work fine for one or three or five rounds (so I've deleted 10k-50k records), then will fail with this error message:
Msg 512, Level 16, State 1, Procedure trgd_image, Line 188
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Which is really weird because, as I said, I already deleted all the associated records from the image table. Then it gets weirder: if I select smaller batches, the deletion works without error.
I generally cut the FETCH NEXT n in half (5k), then in half again (2500), then in half again (1200), etc., until it works:
DELETE FROM my_table WHERE id in (SELECT case_id
FROM #CaseList
ORDER by case_id
OFFSET 50000 ROWS
FETCH NEXT 1200 ROWS ONLY)
Then I repeat with that amount until I get past where it failed, then turn it back up to 10000 and it will work again for a batch or three...
DELETE FROM my_table WHERE id in (SELECT case_id
FROM #CaseList
ORDER by case_id
OFFSET 60000 ROWS
FETCH NEXT 10000 ROWS ONLY)
then it fails again with the same error... rinse, wash, and repeat.
What can cause that subquery error when there are NO related records in the image table? Why would selecting the cases in smaller batches work around it and then allow larger batches again?
I would really love a solution to this so I can make a WHILE loop and run this deletion through the millions of rows that way, instead of having to manage it manually, which is going to take me weeks with millions of rows needing to be deleted from 4 databases.
The query you're showing cannot produce the error you're seeing. If you're sure it does, you have a bug report. My guess is that at trgd_image, line 188 (or somewhere nearby) you'll find you're using a scalar comparison, = instead of IN.
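To illustrate the usual shape of that bug (a hypothetical fragment, not your actual trigger): inside a DELETE trigger, the deleted pseudo-table holds every row the statement removed, so a scalar assignment only works while each batch happens to touch one row:

-- hypothetical trigger body:
declare @case_id int;
-- fails with Msg 512 as soon as a batch deletes more than one row:
set @case_id = (select case_id from deleted);
-- a multi-row-safe rewrite operates on the whole set instead
-- (image_log is a made-up table name for illustration):
delete from image_log where case_id in (select case_id from deleted);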
I also have some advice for you, free for the asking. I wrote lots of queries like yours, and never used anything like OFFSET 60000 ROWS FETCH NEXT 10000 ROWS ONLY. You don't need to, either, and your SQL will be easier to write if you don't.
First, unless your machine is seriously undersized for 2018 for the scale of data you're using, I think you'll find 100,000-row transactions are just fine. If not, at least try to understand why not. A machine managing many millions of rows ought to be able to deal with 1% of them without breaking a sweat.
When you populate #CaseList, trap @@ROWCOUNT. Then you can print/record that, and compute the number of "chunks" in your work.
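A minimal sketch of that, reusing the INSERT from the question (the 10000 mirrors your FETCH NEXT size):

declare @total int;

insert into #CaseList
select distinct id
from my_table
where <my criteria for choosing cases>;

set @total = @@rowcount;  -- read immediately: the next statement resets it
print concat('case ids queued: ', @total,
             ', chunks of 10000: ', ceiling(@total / 10000.0));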
Ideally, though, there's no temporary table. Instead, those cases probably have some logical grouping you can operate on. They might have regions or owners or dates, whatever was used to select them in the first place. Iterate over that, e.g.
delete from T where id in (select id from S where [user] = 1)
Once you do that, you can write a loop:
declare @user int, @u int;

select @user = min([user]) from S where ...;
while @user is not null
begin
    print 'deleting cases for user ' + cast(@user as varchar(12));
    delete from T where id in (select id from S where [user] = @user);
    select @u = @user;
    select @user = min([user]) from S where ... and [user] > @u;
end
That way, if the process blows up partway through, for any reason, you have a logical grouping of deletions and a clean break: you know all the cases for users (or whatever) less than @user are deleted, and you can look into what's wrong with the "current" one. Quite often, you'll discover that the problem isn't unique, and by solving it you'll prevent future problems with others.
I am looking to generate results for my shop efficiencies. To do this I need data from 3 different tables: JOMAST, JODRTG and INRTGS. So I have come up with the following query:
select jomast.fpartno AS 'Part Number',
       jodrtg.foperno AS 'OP Number',
       jodrtg.fopermemo AS 'Description',
       jodrtg.fpro_id AS 'Work Center',
       jodrtg.fprod_tim AS 'Act. Production Time',
       inrtgs.fuprodtime AS 'Est. Production Time'
from jodrtg
left join jomast on jodrtg.fjobno = jomast.fjobno
left join inrtgs on jomast.fpartno = inrtgs.fpartno
Now what I need to do is average out the Act. Production Time, and get things down to the Part Number with all of the OP numbers for that part.
When I try to GROUP BY JOMAST.fpartno, I get an error that it cannot use an outer column. If I GROUP BY the operation number, then all operation 10s for every part are combined, which is not the desired result.
Can someone please point me in the direction I need to go to achieve my result?
It sounds like you want multiple rows all showing the same jodrtg.fpartno, each showing a different jodrtg.fprod_tim, but all showing the same average value of jodrtg.fprod_tim, yes?
If so, add:
AVG(jodrtg.fprod_tim) OVER(PARTITION BY jodrtg.fpartno) as avg_for_all_fprod_tim
as a column in your list of columns you're selecting.
For more info you can google PARTITION BY, but think of it as an instruction to perform a GROUP BY x, AVG(y) (or SUM, MAX, whatever) on the data, and then put that AVG into the row data for each row where x is present.
Essentially it's the same as writing a smaller subquery, (SELECT jodrtg.fpartno, avg(jodrtg.fprod_tim) as avg_for_all_fprod_tim FROM blah GROUP BY jodrtg.fpartno), and then joining it back to your main query.
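Put together with the query from the question, that would look something like this (untested sketch; I'm partitioning by jomast.fpartno, since that's the part-number column being selected):

select jomast.fpartno AS 'Part Number',
       jodrtg.foperno AS 'OP Number',
       jodrtg.fopermemo AS 'Description',
       jodrtg.fpro_id AS 'Work Center',
       jodrtg.fprod_tim AS 'Act. Production Time',
       inrtgs.fuprodtime AS 'Est. Production Time',
       AVG(jodrtg.fprod_tim) OVER (PARTITION BY jomast.fpartno) AS 'Avg Act. Production Time'
from jodrtg
left join jomast on jodrtg.fjobno = jomast.fjobno
left join inrtgs on jomast.fpartno = inrtgs.fpartno;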
I have a table with fields TransactionID, Amount and ParentTransactionID
Transactions can be cancelled, in which case a new entry is posted with the negated amount and with ParentTransactionID set to the cancelled TransactionID.
Let's say there is a transaction:
1 100 NULL
If I cancel the above entry, it will look like:
2 -100 1
If I cancel the above transaction again, it should look like:
3 100 2
When I fetch, I should get record 3, as IDs 1 and 2 have been cancelled.
The result should be:
3 100 2
If I cancel the 3rd entry, no records should be returned.
SELECT * FROM Transaction t
WHERE NOT EXISTS (SELECT TOP 1 NULL FROM Transaction pt
WHERE (pt.ParentTransactionID = t.TransactionID OR t.ParentTransactionID = pt.TransactionID)
AND ABS(t.Amount) = ABS(pt.Amount))
This works only if a single level of cancellation has been made.
If all transactions are cancelled by a new transaction setting ParentTransactionId to the transaction it cancels, it can be done using a simple LEFT JOIN;
SELECT t1.* FROM Transactions t1
LEFT JOIN Transactions t2
ON t1.TransactionId = t2.ParentTransactionId
WHERE t2.TransactionId IS NULL;
t1 being the transaction we're currently looking at and t2 being the possibly cancelling transaction. If there is no cancelling transaction (ie the TransactionId for t2 does not exist), return the row.
I'm not sure about your last statement though, "If I cancelled the 3rd entry no records should return." How would you cancel #3 without adding a new transaction to the table? You may have some other condition for a cancel that you're not telling us about...?
Simple SQLfiddle demo.
EDIT: Since you don't want cancelled transactions (or rather, transactions with an odd number of cancellations), you need a rather more complicated recursive query to figure out whether to show the last transaction or not;
WITH ChangeLog(TransactionID, Amount, ParentTransactionID,
IsCancel, OriginalTransactionID) AS
(
SELECT TransactionID, Amount, ParentTransactionID, 0, TransactionID
FROM Transactions WHERE ParentTransactionID IS NULL
UNION ALL
SELECT t.TransactionID, t.Amount, t.ParentTransactionID,
1-c.IsCancel, c.OriginalTransactionID
FROM Transactions t
JOIN ChangeLog c ON c.TransactionID = t.ParentTransactionID
)
SELECT c1.TransactionID, c1.Amount, c1.ParentTransactionID
FROM ChangeLog c1
LEFT JOIN ChangeLog c2
ON c1.TransactionID < c2.TransactionID
AND c1.OriginalTransactionID = c2.OriginalTransactionID
WHERE c2.TransactionID IS NULL AND c1.IsCancel=0
This will, in your example with 3 transactions, show the last row, but if the last row is cancelled, it won't return anything.
Since SQLfiddle is up again, here is a fiddle to test with.
A short explanation of the query may be in order, even if it's a bit hard to do simply: it defines a recursive "view", ChangeLog, that tracks cancellations and the original transaction id from the first to the last transaction in a series (a series being all transactions with the same OriginalTransactionID). After that, it joins ChangeLog with itself to find the last entry in each series (i.e. the transactions that have no later transaction in the same series). If the last entry found in a series is not a cancellation (IsCancel = 0), it will show up.
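For reference, here is a minimal setup matching the three-row example in the question, to try the recursive query against:

create table Transactions (
    TransactionID       int primary key,
    Amount              int,
    ParentTransactionID int null
);

insert into Transactions values
    (1,  100, null),  -- original transaction
    (2, -100, 1),     -- cancels 1
    (3,  100, 2);     -- cancels the cancellation

-- the recursive query above returns row 3;
-- adding (4, -100, 3) makes it return no rows.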