Duplicates not getting ignored in SQL Server

I have a temp table with two rows, whose IDs are 999359143 and 999365081.
I also have a table that has no primary key but does have a unique index on the ID and date.
The ID 999359143 already exists in that table, yet the query below still tries to insert the corresponding row from the temp table, and it errors. Here is the query:
INSERT INTO [XferTable] ([DataDate], [LoanNum])
SELECT DISTINCT t1.[DataDate], t1.[LoanNum]
FROM #AllXfers t1 WITH (HOLDLOCK)
WHERE NOT EXISTS (SELECT 1
                  FROM XferTable t2 WITH (HOLDLOCK)
                  WHERE t2.LoanNum = t1.LoanNum
                    AND t2.DataDate = t1.DataDate)
Is there a better way to do this?

You should use the MERGE statement, which acts atomically, so you shouldn't need to do your own locking (also, isolation query hints on temporary tables don't achieve anything).
MERGE XferTable AS TARGET
USING #AllXfers AS SOURCE
ON
TARGET.[DataDate] = SOURCE.[DataDate]
AND TARGET.[LoanNum] = SOURCE.[LoanNum]
WHEN NOT MATCHED BY TARGET -- record in SOURCE but not in TARGET
THEN INSERT
(
[DataDate]
,[LoanNum]
)
VALUES
(
SOURCE.[DataDate]
,SOURCE.[LoanNum]
);
Your primary key violation is probably because you are using (Date, Loan#) as the uniqueness criteria and your primary key is probably only on Loan#.
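For illustration, the NOT EXISTS guard from the question can be exercised end to end. This is a minimal sketch using SQLite through Python's sqlite3 module rather than SQL Server; the schema and dates below are invented to match the question's column names, and HOLDLOCK has no SQLite equivalent so it is omitted:

```python
# Sketch of the duplicate-skipping insert, with SQLite standing in
# for SQL Server. Table and column names mirror the question.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE XferTable (DataDate TEXT, LoanNum INTEGER);
CREATE UNIQUE INDEX ux_xfer ON XferTable (LoanNum, DataDate);
CREATE TABLE AllXfers (DataDate TEXT, LoanNum INTEGER);
INSERT INTO XferTable VALUES ('2023-01-01', 999359143);
INSERT INTO AllXfers  VALUES ('2023-01-01', 999359143),
                             ('2023-01-01', 999365081);
""")

# Same NOT EXISTS guard as the question's query: only rows missing
# from the target are inserted, so the existing LoanNum is skipped.
conn.execute("""
INSERT INTO XferTable (DataDate, LoanNum)
SELECT DISTINCT t1.DataDate, t1.LoanNum
FROM AllXfers t1
WHERE NOT EXISTS (SELECT 1 FROM XferTable t2
                  WHERE t2.LoanNum = t1.LoanNum
                    AND t2.DataDate = t1.DataDate)
""")

rows = conn.execute("SELECT LoanNum FROM XferTable ORDER BY LoanNum").fetchall()
print(rows)  # [(999359143,), (999365081,)]
```

The unique index still exists, but the guard means the insert never attempts the duplicate row, so no constraint error is raised.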

Related

What is the best way to assert that a set of columns could form a primary key in Snowflake?

Infamously, primary key constraints are not enforced in Snowflake SQL:
-- Generating a table with 4 rows that contain duplicates and NULLs:
CREATE OR REPLACE TEMP TABLE PRIMARY_KEY_TEST AS
SELECT
*
FROM (
SELECT 1 AS PK, 'TEST_TEXT' AS TEXT
UNION ALL SELECT 1 AS PK, 'TEST_TEXT' AS TEXT
UNION ALL SELECT NULL AS PK, NULL AS TEXT
UNION ALL SELECT NULL AS PK, NULL AS TEXT
)
;
SELECT *
FROM PRIMARY_KEY_TEST
;
PK    | TEXT
------+----------
1     | TEST_TEXT
1     | TEST_TEXT
NULL  | NULL
NULL  | NULL
-- These constraints will NOT throw any errors in Snowflake
ALTER TABLE PRIMARY_KEY_TEST ADD PRIMARY KEY (PK);
ALTER TABLE PRIMARY_KEY_TEST ADD UNIQUE (TEXT);
However, knowing that a set of columns has values that are unique for every row and never NULL is vital to check when updating a set of data.
So I'm looking for an easy-to-write and easy-to-read (ideally 1-2 lines) piece of code (probably based on some Snowflake function) that throws an error if a set of columns no longer forms a viable primary key in Snowflake SQL.
Any suggestions?
So I'm looking for an easy-to-write and easy-to-read (ideally 1-2 lines) piece of code (probably based on some Snowflake function) that throws an error if a set of columns no longer forms a viable primary key in Snowflake SQL
Such a test query is easy to write using QUALIFY and a windowed COUNT. The pattern is to place the primary key column list in the PARTITION BY clause and search for non-unique values; an additional check for NULLs can be added too. If the column list is a valid primary key candidate, the query returns no rows; if any rows violate the rules, they are returned:
-- checking if PK is applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK) > 1
OR PK IS NULL;
-- checking if TEXT column is applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY TEXT) > 1
OR TEXT IS NULL;
-- checking if PK,TEXT columns are applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK,TEXT) > 1
OR PK IS NULL
OR TEXT IS NULL;
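The same uniqueness-and-NULL check can be sketched in a portable form. This runs against SQLite through Python's sqlite3 module (SQLite has no QUALIFY, so the window filter is rewritten as GROUP BY ... HAVING); the column TEXT is renamed TEXT_COL here to avoid the type keyword:

```python
# Portable candidate-key check: list values of PK that are duplicated
# or NULL. An empty result means PK is a viable key for this data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRIMARY_KEY_TEST (PK INTEGER, TEXT_COL TEXT);
INSERT INTO PRIMARY_KEY_TEST VALUES
  (1, 'TEST_TEXT'), (1, 'TEST_TEXT'), (NULL, NULL), (NULL, NULL);
""")

violations = conn.execute("""
SELECT PK, COUNT(*) AS n
FROM PRIMARY_KEY_TEST
GROUP BY PK
HAVING COUNT(*) > 1 OR PK IS NULL
ORDER BY PK
""").fetchall()
print(violations)  # [(None, 2), (1, 2)]
```

As with the QUALIFY version, an empty result set is the "key is valid" signal; any returned row names an offending key value.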
I'd still prefer code that can throw an error though
It is possible using Snowflake Scripting and RAISE exception:
BEGIN
LET my_exception EXCEPTION (-20002, 'Columns cannot be used as PK.');
IF (EXISTS(SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK) > 1
OR PK IS NULL
)) THEN
RAISE my_exception;
END IF;
END;
-20002 (P0001): Uncaught exception of type 'MY_EXCEPTION' on line 8 at position 5 : Columns cannot be used as PK.
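The same "throw on violation" idea can also be driven from a client instead of Snowflake Scripting. A minimal sketch, with SQLite standing in for the warehouse and a hypothetical helper name:

```python
# Raise a Python exception if a column fails the candidate-key test,
# mirroring the RAISE-on-EXISTS pattern above.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRIMARY_KEY_TEST (PK INTEGER);
INSERT INTO PRIMARY_KEY_TEST VALUES (1), (1), (NULL);
""")

def assert_candidate_key(conn, table, column):
    """Raise ValueError if `column` has duplicates or NULLs in `table`."""
    bad = conn.execute(f"""
        SELECT 1 FROM {table}
        GROUP BY {column}
        HAVING COUNT(*) > 1 OR {column} IS NULL
        LIMIT 1""").fetchone()
    if bad is not None:
        raise ValueError(f"Columns cannot be used as PK: {table}.{column}")

try:
    assert_candidate_key(conn, "PRIMARY_KEY_TEST", "PK")
except ValueError as e:
    print(e)  # Columns cannot be used as PK: PRIMARY_KEY_TEST.PK
```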
You can enforce NOT NULL in Snowflake by adding a NOT NULL constraint on the columns that should not be nullable.
The primary key constraint is informational only; it is not enforced when you insert data into a table. To get primary key behaviour you have to either remove/delete the offending data, or check whether the data already exists before inserting and perform an update instead.
Depending on what you are doing, you may use the following:
MERGE (insert and update)
Use DISTINCT to check whether the row exists, then update, or delete the old row and insert the new one.
Use the ROW_NUMBER analytical function to identify the duplicates.
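The ROW_NUMBER approach from the last point can be sketched with SQLite (3.25+ supports window functions) through Python's sqlite3 module; the table here is invented. Rows numbered greater than 1 within a key partition are the duplicates:

```python
# Identify duplicate rows with ROW_NUMBER(): the first row in each
# (id, val) partition gets rn = 1, every extra copy gets rn > 1.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, val TEXT);
INSERT INTO t VALUES (1, 'a'), (1, 'a'), (2, 'b');
""")

dupes = conn.execute("""
SELECT id, val FROM (
  SELECT id, val,
         ROW_NUMBER() OVER (PARTITION BY id, val ORDER BY id) AS rn
  FROM t)
WHERE rn > 1
""").fetchall()
print(dupes)  # [(1, 'a')]
```

The same subquery shape, with DELETE instead of SELECT, is a common way to remove the duplicates while keeping one copy of each row.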

Whilst locked - see if otherID exists, if so return mainID, if not insert otherID & return new mainID

I need to check if a unique int - otherID (not the primary key) exists in a table. If it does, return the primary key. If it doesn't, insert a record containing the otherID I checked, then return the new primary key.
ID uniqueidentifier, --PK
name varchar(100),
otherID int --Unique
I need to holdlock whatever statement I use because between checking and inserting a concurrent user could insert the same otherID.
I was looking at using a MERGE with holdlock - but it seems that can only be used for INSERT / UPDATE / DELETE - not for selecting.
I'm using this from Dapper in an ASP.net MVC 5 app.
I would like to do this in a single database roundtrip if possible.
Other than a MERGE I'm not even sure what to search for on Google - I don't know if this is possible?!
I want to avoid the chance of a race condition / unique key violation.
We can do a conditional INSERT, followed by a plain SELECT:
DECLARE @OtherID int
INSERT INTO TableA (OtherID)
SELECT @OtherID
WHERE NOT EXISTS (SELECT * FROM TableA WHERE OtherID = @OtherID)
SELECT MainID FROM TableA WHERE OtherID = @OtherID
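The check-or-insert round trip can be sketched with SQLite through Python's sqlite3 module (SQL Server locking hints such as HOLDLOCK have no direct SQLite equivalent; a single connection serialises access here instead, and the schema is simplified to an integer key):

```python
# Conditional INSERT followed by a plain SELECT: calling twice with
# the same otherID returns the same MainID and creates no duplicate.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE TableA (
    MainID INTEGER PRIMARY KEY,
    OtherID INTEGER UNIQUE)""")

def get_or_create(conn, other_id):
    conn.execute("""
        INSERT INTO TableA (OtherID)
        SELECT ? WHERE NOT EXISTS
            (SELECT 1 FROM TableA WHERE OtherID = ?)""",
        (other_id, other_id))
    return conn.execute(
        "SELECT MainID FROM TableA WHERE OtherID = ?",
        (other_id,)).fetchone()[0]

first = get_or_create(conn, 42)
again = get_or_create(conn, 42)   # no duplicate row is created
print(first, again)  # 1 1
```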

SQL Server update trigger with no primary key

I have a table with no primary key. This is beyond my control and I can't change it.
I need to add a trigger that updates one column on an update. Is there a way with no primary key?
I need to do:
update myTable
set someField = someValue
where myTable.pkID = inserted.pkID
However, I don't have a primary key, so I don't know how to write the where clause.
You don't have to have a formal primary key per se, but you need some combination of values that identifies the record(s) affected, even if that is every column in the table:
UPDATE myTable
SET someField = someValue
FROM myTable INNER JOIN inserted ON
myTable.[col1] = inserted.[col1]
AND myTable.[col2] = inserted.[col2]
AND myTable.[col3] = inserted.[col3]
--etc
If you don't have some combination of values that is unique, I'm afraid you'd be out of luck with this approach.
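As a point of contrast, the same problem can be sketched in SQLite through Python's sqlite3 module: SQLite triggers fire per row and expose OLD/NEW (including the implicit rowid) directly, so a key-less table can still be pinned exactly; the rowid-based update below plays the role of the all-columns join:

```python
# Update trigger on a table with no declared primary key: NEW.rowid
# identifies the exact row, so no column join is needed in SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (col1 TEXT, col2 TEXT, lastChanged TEXT);
INSERT INTO myTable VALUES ('a', 'x', NULL), ('b', 'y', NULL);
CREATE TRIGGER tr_myTable AFTER UPDATE OF col2 ON myTable
BEGIN
  UPDATE myTable SET lastChanged = 'updated'
  WHERE rowid = NEW.rowid;  -- pin the updated row by rowid
END;
""")

conn.execute("UPDATE myTable SET col2 = 'z' WHERE col1 = 'a'")
rows = conn.execute(
    "SELECT col1, lastChanged FROM myTable ORDER BY col1").fetchall()
print(rows)  # [('a', 'updated'), ('b', None)]
```

SQL Server's inserted/deleted tables are set-based with no hidden row identifier, which is exactly why the answer above has to fall back to joining on every column.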

SQL Server DB Trigger

I need to insert a row into a table if a particular row in another table is updated.
How do I write an IF statement in the DB trigger on Table1, saying if Table1.column1 = 'TC' then INSERT a row into Table2?
You would do this in an update trigger on the 'other' table.
There are two special tables in triggers: inserted and deleted. You join these two tables in such a way that the result set is the rows you wish to insert. Ergo -
create trigger [after_update_on_Table1] on [Table1] for update
as
...
insert
into [Table2] (...)
select
...
from
inserted as i
inner join
deleted as d
on (i.<*pk*> = d.<*pk*>)
where
<*other conditions if applicable*>
...
<pk> is whatever the appropriate primary key would be. If this is a compound primary key then AND together the different primary key components.
For what you describe thus far you do not require an if statement.
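The conditional insert-on-update can be sketched with SQLite through Python's sqlite3 module. SQLite triggers are per-row, so a WHEN clause plays the role of the condition the answer describes; the schema names below are invented:

```python
# AFTER UPDATE trigger that inserts into Table2 only when the new
# value of column1 is 'TC'.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (id INTEGER PRIMARY KEY, column1 TEXT);
CREATE TABLE Table2 (table1_id INTEGER, note TEXT);
INSERT INTO Table1 VALUES (1, 'AA'), (2, 'AA');

CREATE TRIGGER after_update_on_Table1
AFTER UPDATE ON Table1
WHEN NEW.column1 = 'TC'
BEGIN
  INSERT INTO Table2 (table1_id, note) VALUES (NEW.id, 'became TC');
END;
""")

conn.execute("UPDATE Table1 SET column1 = 'TC' WHERE id = 1")
conn.execute("UPDATE Table1 SET column1 = 'BB' WHERE id = 2")
t2 = conn.execute("SELECT * FROM Table2").fetchall()
print(t2)  # [(1, 'became TC')]
```

Only the row that actually became 'TC' produces an insert; the other update passes through without firing the body, mirroring the "no if statement required" point above.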

How cascade Update/Delete works internally in SQL Server?

OK, I believe the question was not clear, so here I rewrite it another way.
Suppose I create two tables:
table1(c1 varchar(10) PRIMARY KEY)
table2(table1c11 varchar(10))
There is a relation between table1 and table2,
i.e. table1.c1 = table2.table1c11
and I execute the following statements against table1 and table2:
insert into table1(c1)
values('a'),('b'),('c'),('d'),('e')
insert into table2(table1c11)
values('a'),('a'),('b'),('d')
And now what I want to achieve is that, once I update the value of c1 in table1, the corresponding data in table2 changes automatically. For this I need to create the constraint on the table1-table2 relationship and apply CASCADE UPDATE.
So, later I apply a new SQL update statement to table1, i.e.
Update table1 set c1 = c1 + 'updated'
Then the data in table2 changes as well. But what if I want to achieve the same functionality via an INSTEAD OF UPDATE trigger? Then I need to write the trigger and, inside it, handle the two magic tables INSERTED and DELETED.
But the main point is that, in this case, I have only one column in table1 and I am updating that same column, so how could I map the inserted and deleted rows? The same thing is done by SQL Server as well if I use CASCADING.
So, the question arises how SQL Server handles batch update in case of the primary key data changes in the table.
So, the question arises how SQL Server handles batch update in case of
the primary key data changes in the table.
SQL Server builds a query plan for the update statement that updates both tables.
Create the tables:
create table T1
(
T1ID int primary key
);
create table T2
(
T2ID int primary key,
T1ID int references T1(T1ID) on update cascade
)
Add some data:
insert into T1 values(1), (2)
insert into T2 values(1, 1), (2, 1), (3, 2)
Update primary key of T1:
update T1
set T1.T1ID = 3
where T1.T1ID = 1
The query plan for the update has two Clustered Index Update steps, one for T1 and one for T2.
Update 1:
How does SQL Server keep track of the rows to update when more than one primary key value is updated?
update T1
set T1.T1ID = T1.T1ID + 100
The Eager Spool in the top branch (update of T1) saves the old T1ID and the new calculated T1ID (Expr1013) to a temporary table that is used by the lower branch (update of T2). The Hash Match in the lower branch is joining the Table Spool with T2 on the old T1ID. Output from the Hash Match to the update of T2 is T2ID from the Clustered Index Scan of T2 and the new calculated T1ID (Expr1013) from the Table Spool.
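The cascading batch update itself can be observed in SQLite through Python's sqlite3 module (foreign key enforcement must be switched on per connection there); T1 and T2 mirror the answer's tables:

```python
# ON UPDATE CASCADE: updating every parent key rewrites the matching
# child keys automatically, with no trigger code.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE T1 (T1ID INTEGER PRIMARY KEY);
CREATE TABLE T2 (T2ID INTEGER PRIMARY KEY,
                 T1ID INTEGER REFERENCES T1(T1ID) ON UPDATE CASCADE);
INSERT INTO T1 VALUES (1), (2);
INSERT INTO T2 VALUES (1, 1), (2, 1), (3, 2);
""")

# Batch update of every parent key, as in Update 1 above.
conn.execute("UPDATE T1 SET T1ID = T1ID + 100")
child = conn.execute("SELECT T2ID, T1ID FROM T2 ORDER BY T2ID").fetchall()
print(child)  # [(1, 101), (2, 101), (3, 102)]
```

This is the externally visible behaviour that the spool-and-join machinery in the SQL Server plan implements internally.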
Update 2:
If you need to replace the cascade update with a instead of trigger you need to have a way to join the inserted and deleted tables in the trigger. That can be done with a surrogate key in T1.
Tables:
create table T1
(
T1ID int primary key,
ID int identity unique
);
create table T2
(
T2ID int primary key,
T1ID int references T1(T1ID)
);
The trigger could look like this.
create trigger tr_T1 on T1 instead of update as
insert into T1(T1ID)
select T1ID
from inserted;
update T2
set T1ID = I.T1ID
from inserted as I
inner join deleted as D
on I.ID = D.ID
where D.T1ID = T2.T1ID;
delete from T1
where T1ID in (
select T1ID
from deleted
);
