SQL solve Left join issue without Repeatable Read or TABLOCKX [duplicate] - sql-server

I have a problem where I insert a User and an Address in a transaction with a 10-second delay between the two inserts. If I run my select statement while the transaction is executing, it waits for the transaction to finish, but I get a null on the join. Why doesn't my select wait for both the User and the Address data to be committed?
If I run the select statement after the transaction has finished, I get the correct result. Why do I get this result, and what is the generic solution to make this work?
BEGIN TRANSACTION
insert into user(dummy) values('text')
WAITFOR DELAY '00:00:10';
insert into address(ID_FK) values((SELECT SCOPE_IDENTITY()))
COMMIT TRANSACTION
Running the select during the transaction results in a null in the join:
select * from user u left join address a on u.id = a.ID_FK order by id desc
| ID | dummy | ID_FK |
| 101 | 'text' | null |
Running the select after the transaction gives the correct result:
select * from user u left join address a on u.id = a.ID_FK order by id desc
| ID | dummy | ID_FK|
| 101 | 'text' | 101 |

This type of thing is entirely possible at the default read committed isolation level for on-premises SQL Server, as that uses read committed locking. What happens then depends on the execution plan.
An example is below
CREATE TABLE [user]
(
id INT IDENTITY PRIMARY KEY,
dummy VARCHAR(10)
);
CREATE TABLE [address]
(
ID_FK INT REFERENCES [user](id),
addr VARCHAR(30)
);
Connection One
BEGIN TRANSACTION
INSERT INTO [user]
(dummy)
VALUES ('text')
WAITFOR DELAY '00:00:20';
INSERT INTO address
(ID_FK,
addr)
VALUES (SCOPE_IDENTITY(),
'Address Line 1')
COMMIT TRANSACTION
Connection Two (run this whilst connection one is waiting the 20 seconds)
SELECT *
FROM [user] u
LEFT JOIN [address] a
ON u.id = a.ID_FK
ORDER BY id DESC
OPTION (MERGE JOIN)
Returns
| id | dummy | ID_FK | addr |
| 1 | text | NULL | NULL |
The execution plan is as follows
The scan on User is blocked by the open transaction in Connection 1 that has inserted the row there. This has to wait until that transaction commits and then eventually gets to read the newly inserted row.
Meanwhile the Sort operator has already requested the rows from address by this point as it consumes all its rows in its Open method (i.e. during operator initialisation). This is not blocked as no row has been inserted to address yet. It reads 0 rows from address which explains the final result.
If you switch to using read committed snapshot rather than read committed locking, you won't get this issue: the statement only reads the committed state as of the start of the statement, so it isn't possible to get this kind of anomaly.

Related

SQL join into a Recursive CTE with parameters

I am trying to use a SQL query to create some client-side reporting for my company. There are 3 tables that I would like to join together. One of the tables may require a CTE, as I need to recursively go through a table and return a row. Here is how the tables are structured (simplified).
I want an output table that, for each WorkOrder, displays the most recently completed task in DataCollection (including its time) and the next Op in the TaskListing. I figured a CTE may be the only way to recursively go through each row and determine what task is next (by checking if the completed Op exists in the PreOp column). If the completed cell doesn't exist as a PreOp, it should default to MAX(Op) (the last task).
CREATE TABLE [dbo].[WorkOrder](
[WorkOrderID][int] NOT NULL PRIMARY KEY,
[Column1] [nvarchar](20),
[Column2] [nvarchar](20)
)
INSERT INTO WorkOrder VALUES(1,'x','y');
INSERT INTO WorkOrder VALUES(2,'x','y');
INSERT INTO WorkOrder VALUES(3,'x2','y2');
CREATE TABLE [dbo].[DataCollection](
[DataCollection][int] NOT NULL PRIMARY KEY,
[WorkOrderID][int] NOT NULL FOREIGN KEY REFERENCES WorkOrder(WorkOrderID),
[CellTask] [nvarchar](20),
[TimeCompleted] [DateTime]
)
INSERT INTO DataCollection VALUES(1,1,'cella','2016-08-09 00:00:00');
INSERT INTO DataCollection VALUES(2,1,'cellb','2016-08-10 00:00:00');
INSERT INTO DataCollection VALUES(3,1,'cellc','2016-08-11 00:00:00');
INSERT INTO DataCollection VALUES(4,2,'cella','2016-08-09 00:00:00');
INSERT INTO DataCollection VALUES(5,2,'cellb','2016-08-10 00:00:00');
CREATE TABLE [dbo].[TaskListing](
[TaskListingID][int] NOT NULL PRIMARY KEY,
[WorkOrderID][int] NOT NULL FOREIGN KEY REFERENCES WorkOrder(WorkOrderID),
[Op][nvarchar](20) NOT NULL,
[preOP][nvarchar](20),
[CellTask][nvarchar](20) NOT NULL,
[Completed][bit] NOT NULL
)
INSERT INTO TaskListing VALUES(1,1,'10',NULL,'cella',0);
INSERT INTO TaskListing VALUES(2,1,'20','10','cellb',0);
INSERT INTO TaskListing VALUES(3,1,'30',NULL,'cellc',1);
INSERT INTO TaskListing VALUES(4,1,'40','10,30','celld',0);
INSERT INTO TaskListing VALUES(5,2,'10',NULL,'cella',1);
INSERT INTO TaskListing VALUES(6,2,'20','10','cellb',1);
INSERT INTO TaskListing VALUES(7,2,'30','20','cellc',0);
The output table will show, for each WorkOrder, the most recently completed cell (from the DataCollection table, TimeCompleted column) and the next cell in the workflow (found by looking at the rows in the TaskListing table for the given WorkOrderID and looking for a row that contains the completed task as a 'PreOp'). If it can't find the completed task as a PreOp for any other row, it should default to the last task.
The part of the Query I'm having most trouble with is filling in the NextTaskCell column. I need to write a query that looks at all the tasks for a given WorkOrderID (In the TaskListing Table) and based on the completed task, determine what is the next task. I'm finding it difficult to feed in both a WorkOrderID & CellTask then find an instance of itself in the PreOp column.
Output Table
+-------------+-------------------+---------------------+--------------+
| WorkOrderId | LastCompletedCell | CompletedOn | NextTaskCell |
|(WorkOrder) | (DataCollection) | (DataCollection) |(TaskListing) |
+-------------+-------------------+---------------------+--------------+
| 1 | cellc | 2016-08-11 00:00:00 | celld |
| 2 | cellb | 2016-08-10 00:00:00 | cellc |
+-------------+-------------------+---------------------+--------------+
I thank you in advance for your time. If there is any other questions please let me know and I'll try to answer them.
The following query gives you the expected output you have in your question. You should test this query against a larger dataset to make sure it is correct in all cases.
;WITH
mtc AS ( -- most recent completion date/time for a work order
SELECT
dc.WorkOrderID,
TimeCompleted=MAX(dc.TimeCompleted)
FROM
DataCollection AS dc
GROUP BY
dc.WorkOrderID
),
lop AS ( -- last operation for work order
SELECT
tl.WorkOrderID,
LastOp=MAX(CAST(tl.Op AS INT))
FROM
TaskListing AS tl
GROUP BY
tl.WorkOrderID
)
SELECT
mtc.WorkOrderID,
LastCompletedCell=dc.CellTask,
CompletedOn=dc.TimeCompleted,
NextTaskCell=ISNULL(tl_next.CellTask,tl_last.CellTask)
FROM
mtc
INNER JOIN DataCollection AS dc ON -- the last completed CellTask
dc.WorkOrderID=mtc.WorkOrderID AND
dc.TimeCompleted=mtc.TimeCompleted
INNER JOIN TaskListing AS tl ON -- Op for CellTask
tl.WorkOrderID=mtc.WorkOrderID AND
tl.CellTask=dc.CellTask
INNER JOIN lop ON
lop.WorkOrderID=mtc.WorkOrderID
INNER JOIN TaskListing AS tl_last ON -- CellTask for last Op
tl_last.WorkOrderID=mtc.WorkOrderID AND
tl_last.Op=lop.LastOp
LEFT JOIN TaskListing AS tl_next ON -- Look for next CellTask where Op is a PreOp of another CellTask
tl_next.WorkOrderID=mtc.WorkOrderID AND
','+tl_next.preOP+',' LIKE '%,'+tl.Op+',%'
ORDER BY
mtc.WorkOrderId;
Note: It is a bad idea to store PreOps as a comma-separated string. This is not how you should store data in relational databases. When you do, you will have to resort to more complex and less efficient queries. To wit, see the join condition in tl_next.
Instead you should have a table to store PreOps as separate rows, linking to the parent Op that depends on it.
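As a rough illustration of the normalized layout suggested above, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for SQL Server. The TaskPreOp table, its column names, the next_task_cell helper, and the rule of picking the lowest matching Op when several tasks depend on the completed one are all assumptions made for this example; they are not part of the original schema or answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TaskListing (
    WorkOrderID INTEGER NOT NULL,
    Op          INTEGER NOT NULL,
    CellTask    TEXT    NOT NULL,
    PRIMARY KEY (WorkOrderID, Op)
);
-- Hypothetical replacement for the comma-separated preOP column:
-- one row per (task, prerequisite) pair.
CREATE TABLE TaskPreOp (
    WorkOrderID INTEGER NOT NULL,
    Op          INTEGER NOT NULL,
    PreOp       INTEGER NOT NULL,
    PRIMARY KEY (WorkOrderID, Op, PreOp)
);
INSERT INTO TaskListing VALUES
    (1, 10, 'cella'), (1, 20, 'cellb'), (1, 30, 'cellc'), (1, 40, 'celld'),
    (2, 10, 'cella'), (2, 20, 'cellb'), (2, 30, 'cellc');
-- '10,30' for Op 40 becomes two rows.
INSERT INTO TaskPreOp VALUES
    (1, 20, 10), (1, 40, 10), (1, 40, 30),
    (2, 20, 10), (2, 30, 20);
""")


def next_task_cell(work_order_id, completed_cell):
    """Return the cell that follows completed_cell, or the last cell as a fallback."""
    row = conn.execute(
        """
        SELECT nxt.CellTask
        FROM TaskListing AS done
        JOIN TaskPreOp   AS p   ON p.WorkOrderID = done.WorkOrderID
                               AND p.PreOp       = done.Op
        JOIN TaskListing AS nxt ON nxt.WorkOrderID = p.WorkOrderID
                               AND nxt.Op          = p.Op
        WHERE done.WorkOrderID = ? AND done.CellTask = ?
        ORDER BY nxt.Op          -- assumption: lowest dependent Op wins
        LIMIT 1
        """,
        (work_order_id, completed_cell),
    ).fetchone()
    if row is None:
        # No task lists the completed Op as a prerequisite: default to the last Op.
        row = conn.execute(
            "SELECT CellTask FROM TaskListing WHERE WorkOrderID = ? "
            "ORDER BY Op DESC LIMIT 1",
            (work_order_id,),
        ).fetchone()
    return row[0] if row else None
```

With one row per prerequisite, the lookup becomes a plain equality join instead of a LIKE over a delimited string, which is also something an index can support.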

insert into 2 joined tables in a single efficient query with mssql

I am trying to insert some data into 2 different tables in my ms sql database (ms sql server 2012).
Consider the following 2 tables, that contains information about polls and their choices.
+----------------+
| Polls |
+----------------+
| pollID | The id of a poll (auto increment)
| memberID | The id of the member, owning the poll
| pollTitle | The title/question of the poll
| date | The date of the poll
+----------------+
+----------------+
| PollChoices |
+----------------+
| pollChoiceID | The id of a poll choice (auto increment)
| pollID | The id of the poll, that include this choice
| pollChoice | The name/title of the poll choice
+----------------+
How can I write a query that inserts the data in the most efficient way?
I can always run 2 queries, but I can't really figure out how to do it with a single one.
One of the main issues for me in this case is getting the id of the poll when inserting the poll choices. How can I get this newly inserted "pollID" (auto increment) and use it in the same query?
Also, is it necessary to make use of transactions or stored procedures (I've read this somewhere)?
Any help will be greatly appreciated.
I don't think what you are asking is possible.
I think the best way to perform the operation you describe is to wrap it inside a stored procedure. You can use SCOPE_IDENTITY to get the ID of the previously added record, and a TRANSACTION together with a TRY-CATCH block to ensure that either both inserts are executed or none at all.
CREATE PROCEDURE [dbo].[usp_InsertPoll] (
-- sproc declaration here, including the following parameters:
-- @memberID, @pollTitle, @date, @pollChoiceID, @pollChoice
)
AS BEGIN
    DECLARE @pollID INT
    BEGIN TRANSACTION;
    BEGIN TRY
        INSERT INTO Polls (memberID, pollTitle, date)
        VALUES (@memberID, @pollTitle, @date)
        -- Get the last identity value inserted into an identity column in the same scope
        SET @pollID = SCOPE_IDENTITY();
        INSERT INTO PollChoices(pollChoiceID, pollID, pollChoice )
        VALUES (@pollChoiceID, @pollID, @pollChoice)
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
    END CATCH;
    IF @@TRANCOUNT > 0
        COMMIT TRANSACTION;
END
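The same "insert the parent, read back its generated key, insert the children, all or nothing" flow can be sketched outside of T-SQL. The example below is only an illustration in Python with the standard-library sqlite3 module, not SQL Server: cursor.lastrowid plays the role that SCOPE_IDENTITY() plays above, and the insert_poll helper, the column types, and the sample values are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Polls (
    pollID    INTEGER PRIMARY KEY AUTOINCREMENT,
    memberID  INTEGER NOT NULL,
    pollTitle TEXT    NOT NULL,
    date      TEXT    NOT NULL
);
CREATE TABLE PollChoices (
    pollChoiceID INTEGER PRIMARY KEY AUTOINCREMENT,
    pollID       INTEGER NOT NULL REFERENCES Polls(pollID),
    pollChoice   TEXT    NOT NULL
);
""")


def insert_poll(conn, member_id, title, date, choices):
    """Insert a poll and its choices atomically; return the new pollID."""
    # 'with conn' commits if the block succeeds and rolls back if it raises.
    with conn:
        cur = conn.execute(
            "INSERT INTO Polls (memberID, pollTitle, date) VALUES (?, ?, ?)",
            (member_id, title, date),
        )
        poll_id = cur.lastrowid  # key generated by the insert just executed
        conn.executemany(
            "INSERT INTO PollChoices (pollID, pollChoice) VALUES (?, ?)",
            [(poll_id, choice) for choice in choices],
        )
    return poll_id


new_poll_id = insert_poll(conn, 7, "Favourite colour?", "2016-08-09", ["red", "blue"])
```

Note that pollChoiceID is left out of the child insert here because the question describes it as auto increment; the database generates it.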
Try this query
declare @npi int
INSERT INTO Polls (memberID, pollTitle, date)
VALUES (@memberID, @pollTitle, @date)
set @npi=
(select top 1 PollId
from Polls
order by PollId desc)
INSERT INTO PollChoices(pollChoiceID, pollID, pollChoice )
VALUES (@pollChoiceID, @npi, @pollChoice)

SQL Server: A severe error occurred on the current command. The results, if any, should be discarded

I have the following SQL Server query in a stored procedure, and I run it from a Windows application. I populate a table variable with 30 million records and then compare them with the previous day's records in tbl_ref_test_main, in order to add and delete the records that differ. There is a trigger on tbl_ref_test_main for insert and delete; the trigger writes the same record to another table. Because of the comparison of 30 million records, it takes ages to produce the result and then throws an error saying: A severe error occurred on the current command. The results, if any, should be discarded.
Any suggestions please.
Thanks in advance.
-- Declare table variable to store the records from CRM database
DECLARE @recordsToUpload TABLE(ClassId NVARCHAR(100), Test_OrdID NVARCHAR(100),Test_RefId NVARCHAR(100),RefCode NVARCHAR(100));
-- Populate the table variable
INSERT INTO @recordsToUpload
SELECT
class.classid AS ClassId,
class.Test_OrdID AS Test_OrdID ,
CAST(ref.test_RefId AS VARCHAR(100)) AS Test_RefId,
ref.ecr_RefCode AS RefCode
FROM Dev_MSCRM.dbo.Class AS class
LEFT JOIN Dev_MSCRM.dbo.test_ref_class refClass ON refClass.classid = class.classid
LEFT JOIN Dev_MSCRM.dbo.test_ref ref ON refClass.test_RefId = ref.test_RefId
WHERE class.StateCode = 0
AND (ref.ecr_RefCode IS NULL OR (ref.statecode = 0 AND LEN(ref.ecr_RefCode )<= 18 ))
AND LEN(class.Test_OrdID )= 12
AND ((ref.ecr_RefCode IS NULL AND ref.test_RefId IS NULL)
OR (ref.ecr_RefCode IS NOT NULL AND ref.test_RefId IS NOT NULL));
-- Insert new records to Main table
INSERT INTO dbo.tbl_ref_test_main
Select * from @recordsToUpload
EXCEPT
SELECT * FROM dbo.tbl_ref_test_main;
-- Delete records from main table where similar records does not exist in temp table
DELETE P FROM dbo.tbl_ref_test_main AS P
WHERE EXISTS
(SELECT P.*
EXCEPT
SELECT * FROM @recordsToUpload);
-- Select and return the records to upload
SELECT Test_OrdID,
CASE
WHEN RefCode IS NULL THEN 'NA'
ELSE RefCode
END,
Operation AS 'Operation'
FROM tbl_daily_upload_records
ORDER BY Test_OrdID, Operation, RefCode;
My suggestion would be that 30 million rows is too large for a table variable. Try creating a temporary table, populating it with the data, and then performing the analysis there.
If this isn't possible or suitable, then perhaps create a permanent table and truncate it between uses.

one to one parent child relationship sql server

I have a table with the fields TransactionID, Amount and ParentTransactionID.
Transactions can be cancelled, in which case a new entry is posted with the amount and with ParentTransactionID set to the cancelled TransactionID.
Let's say there is a transaction
1 100 NULL
If I cancel the above entry, it will look like
2 -100 1
If I cancel that transaction again, it should look like
3 100 2
When I fetch, I should get record 3, since IDs 1 and 2 were cancelled.
The result should be
3 100 2
If I cancel the 3rd entry, no records should be returned.
SELECT * FROM Transaction t
WHERE NOT EXISTS (SELECT TOP 1 NULL FROM Transaction pt
WHERE (pt.ParentTransactionID = t.TransactionID OR t.ParentTransactionID = pt.TransactionID)
AND ABS(t.Amount) = ABS(pt.Amount))
This works if only one level of cancel is made.
If all transactions are cancelled by a new transaction setting ParentTransactionId to the transaction it cancels, it can be done using a simple LEFT JOIN;
SELECT t1.* FROM Transactions t1
LEFT JOIN Transactions t2
ON t1.TransactionId = t2.ParentTransactionId
WHERE t2.TransactionId IS NULL;
t1 being the transaction we're currently looking at and t2 being the possibly cancelling transaction. If there is no cancelling transaction (ie the TransactionId for t2 does not exist), return the row.
I'm not sure about your last statement though: "If I cancelled the 3rd entry no records should return." How would you cancel #3 without adding a new transaction to the table? You may have some other condition for a cancel that you're not telling us about?
EDIT: Since you don't want cancelled transactions (or rather, transactions with an odd number of cancellations), you need a somewhat more complicated recursive query to figure out whether to show the last transaction or not:
WITH ChangeLog(TransactionID, Amount, ParentTransactionID,
IsCancel, OriginalTransactionID) AS
(
SELECT TransactionID, Amount, ParentTransactionID, 0, TransactionID
FROM Transactions WHERE ParentTransactionID IS NULL
UNION ALL
SELECT t.TransactionID, t.Amount, t.ParentTransactionID,
1-c.IsCancel, c.OriginalTransactionID
FROM Transactions t
JOIN ChangeLog c ON c.TransactionID = t.ParentTransactionID
)
SELECT c1.TransactionID, c1.Amount, c1.ParentTransactionID
FROM ChangeLog c1
LEFT JOIN ChangeLog c2
ON c1.TransactionID < c2.TransactionID
AND c1.OriginalTransactionID = c2.OriginalTransactionID
WHERE c2.TransactionID IS NULL AND c1.IsCancel=0
This will, in your example with 3 transactions, show the last row, but if the last row is cancelled, it won't return anything.
A short explanation of the query may be in order even if a bit hard to do in a simple way; it defines a recursive "view", ChangeLog that tracks cancels and the original transaction id from the original to the last transaction in a series (a series is all transactions with the same OriginalTransactionId). After that, it joins ChangeLog with itself to find the last entry (ie all transactions that don't have a cancelling transaction). If the last entry found in a series is not a cancel (IsCancel=0) it will show up.
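To make the behaviour described above easier to check, here is a small sketch that runs essentially the same recursive query against an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here (it needs the RECURSIVE keyword), and the active_transactions helper and the column types are assumptions made for the example.

```python
import sqlite3

ACTIVE_SQL = """
WITH RECURSIVE ChangeLog(TransactionID, Amount, ParentTransactionID,
                         IsCancel, OriginalTransactionID) AS
(
    -- Anchor: original transactions start each series, not cancelled.
    SELECT TransactionID, Amount, ParentTransactionID, 0, TransactionID
    FROM Transactions WHERE ParentTransactionID IS NULL
    UNION ALL
    -- Each follow-up flips the cancel flag and keeps the series id.
    SELECT t.TransactionID, t.Amount, t.ParentTransactionID,
           1 - c.IsCancel, c.OriginalTransactionID
    FROM Transactions t
    JOIN ChangeLog c ON c.TransactionID = t.ParentTransactionID
)
SELECT c1.TransactionID, c1.Amount, c1.ParentTransactionID
FROM ChangeLog c1
LEFT JOIN ChangeLog c2
       ON c1.TransactionID < c2.TransactionID
      AND c1.OriginalTransactionID = c2.OriginalTransactionID
WHERE c2.TransactionID IS NULL AND c1.IsCancel = 0
"""


def active_transactions(rows):
    """Load (TransactionID, Amount, ParentTransactionID) rows and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Transactions ("
        "TransactionID INTEGER PRIMARY KEY, "
        "Amount INTEGER NOT NULL, "
        "ParentTransactionID INTEGER)"
    )
    conn.executemany("INSERT INTO Transactions VALUES (?, ?, ?)", rows)
    result = conn.execute(ACTIVE_SQL).fetchall()
    conn.close()
    return result


three_rows = active_transactions([(1, 100, None), (2, -100, 1), (3, 100, 2)])
four_rows = active_transactions([(1, 100, None), (2, -100, 1), (3, 100, 2), (4, -100, 3)])
```

With the three rows from the question, only transaction 3 should come back; adding a fourth row that cancels it should leave the result empty.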

sql server deadlock case

I have a deadlock problem between 2 processes that insert data into the same table.
These 2 processes run exactly the same SQL statements on a table with a primary key (identity) and a unique index.
The sequence of SQL statements is the following, for each process, in an explicit transaction:
begin trans
select CUSTID from CUSTOMERS where CUSTNUMBER='unique value'
------- the row is never found in this case so... insert the data
insert into CUST(CUSTNUMBER) values('unique value')
------- then we must read the value generated for the pk
select CUSTID from CUSTOMERS where CUSTNUMBER='unique value'
commit
Each process works on a distinct data set, and the two have no common values for "CUSTNUMBER".
The deadlock occurs in this case:
spid 1 : select custid... for unique value 1
spid 2 : select custid... for unique value 2
spid 1 : insert unique value 1
spid 2 : insert unique value 2
spid 2 : select custid again for value 2 <--- Deadlock Victim !
spid 1 : select custid again for value 1
The deadlock graph shows that the problem occurs on the unique index on CUSTNUMBER.
The killed process had a lock with OwnerMode:X and RequestMode:S on the unique index for the same HoBt ID.
The winner process had OwnerMode:X and RequestMode:S for the same HoBt ID.
I have no idea how to explain that; maybe someone can help me?
Try using OUTPUT to get rid of the final SELECT:
begin tran
select CUSTID from CUSTOMERS where CUSTNUMBER='unique value'
------- the row is never found in this case so... insert the data
insert into CUST(CUSTNUMBER) OUTPUT INSERTED.CUSTID values('unique value')
--                           ^^^^^^^^^^^^^^^^^^^^^^ will return a result set of CUSTIDs
commit
OR
DECLARE @x table (CUSTID int)
begin tran
select CUSTID from CUSTOMERS where CUSTNUMBER='unique value'
------- the row is never found in this case so... insert the data
insert into CUST(CUSTNUMBER) OUTPUT INSERTED.CUSTID INTO @x values('unique value')
--                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ will store a set of CUSTIDs
--                           into the @x table variable
commit
I have no explanation for the deadlock, only another way of doing what you are doing, using merge and output. It requires SQL Server 2008 (or higher). Perhaps it will take care of your deadlock issue.
declare @dummy int;
merge CUSTOMERS as T
using (select 'unique value') as S(CUSTNUMBER)
on T.CUSTNUMBER = S.CUSTNUMBER
when not matched then
insert (CUSTNUMBER) values(S.CUSTNUMBER)
when matched then
update set @dummy = 1
output INSERTED.CUSTID;
This will return the newly created CUSTID if there was no match and the already existing CUSTID if there was a match for CUSTNUMBER.
It would be best if you posted the actual deadlock graph (the .xml file, not the picture!). Without that no one can be sure, but it is likely that you are seeing a case of the read-write deadlock that occurs due to the order of using vs. applying updates to the secondary indexes. I cannot recommend a solution without seeing the deadlock graph and the exact table schema (clustered index and all non-clustered indexes).
On a separate note, the SELECT -> if not exists -> INSERT pattern is always wrong under concurrency: there isn't anything to prevent two threads from trying to insert the same row. A much better pattern is to simply always insert and catch the duplicate key violation exception when it occurs (this is also more performant). As for your second SELECT, use the OUTPUT clause as others have already suggested. So basically this whole ordeal can be reduced to an insert in a try/catch block. MERGE will also work.
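The "always insert and catch the duplicate key violation" pattern can be sketched as follows. This is an illustration in Python with the standard-library sqlite3 module rather than T-SQL: sqlite3.IntegrityError stands in for SQL Server's duplicate key error, cursor.lastrowid stands in for the OUTPUT clause, and the get_or_create_custid helper and the column types are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMERS ("
    "CUSTID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "CUSTNUMBER TEXT NOT NULL UNIQUE)"
)


def get_or_create_custid(conn, custnumber):
    """Insert first; only read the existing row if the unique index rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            cur = conn.execute(
                "INSERT INTO CUSTOMERS (CUSTNUMBER) VALUES (?)", (custnumber,)
            )
            return cur.lastrowid  # generated key, no second SELECT needed
    except sqlite3.IntegrityError:
        # Duplicate CUSTNUMBER: the row already exists, so look it up.
        row = conn.execute(
            "SELECT CUSTID FROM CUSTOMERS WHERE CUSTNUMBER = ?", (custnumber,)
        ).fetchone()
        return row[0]


first_id = get_or_create_custid(conn, "A-1")
same_id = get_or_create_custid(conn, "A-1")   # hits the duplicate path
other_id = get_or_create_custid(conn, "B-2")
```

The unique index is what enforces correctness here; the code never checks for existence before inserting, so there is no window between a check and an insert for another writer to slip into.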
An alternative to using output is replacing the last select with a select scope_identity() if the CUSTID column is an identity column.

Resources