I'm writing an import process which will import data from one (somewhat legacy) database to another. The import process takes one flat table with the source data. I have this populating a temp table (#SourcePersonAccount) at the start. The goal is to distribute this data into three destination tables (dbo.Person, dbo.Account & dbo.PersonAccount). This runs within a trigger on a table that uses SQL Server Replication, so it needs to run quickly.
#SourcePersonAccount([AccountNumber], [CompanyId], [TargetPersonId], [TargetAccountId]);
dbo.Person ([Id] pk identity(1,1), [CompanyId], ...);
dbo.Account ([Id] pk identity(1,1), [AccountNumber], ...);
dbo.PersonAccount ([Id], [PersonId] fk_Person_Id, [AccountId] fk_Account_Id);
In my code, I have the TargetPersonId already populated in the #SourcePersonAccount temp table. All that's left is to 1) insert into dbo.Account, 2) update #SourcePersonAccount with the inserted dbo.Account.Id value, 3) insert into dbo.PersonAccount.
One of the challenges is that the AccountNumber and CompanyId make up a composite primary key of the source table, so both are needed to join properly on the #SourcePersonAccount temp table.
I have seen threads addressing similar issues to a certain extent here and here which did not solve my particular problem, mostly due to performance issues.
As stated in this post, the OUTPUT clause cannot output columns that were not included in the insert, so that is not an option here.
One solution I saw that technically can give the desired output (I can't find the link to where I found the suggestion) while using the OUTPUT clause is to actually add and drop a column within the query.
DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);
ALTER TABLE [dbo].[Account]
ADD [CompanyId] INT NULL;
INSERT INTO [dbo].[Account]
([AccountNumber], [CompanyId])
OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], INSERTED.[CompanyId]
INTO @PersonAccountTbl
SELECT
[AccountNumber], [CompanyId]
FROM #SourcePersonAccount
WHERE
[TargetAccountId] IS NULL;
ALTER TABLE [dbo].[Account]
DROP COLUMN [CompanyId];
This is not a viable option for my situation.
I tried using MERGE as every thread I've found on this issue recommends using it. I do not like MERGE for a few reasons. I tried it anyways; the below code gives the desired output, but ended up being much too slow for my purposes.
DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);
MERGE INTO [dbo].[Account] a
USING #SourcePersonAccount spa
ON spa.[TargetAccountId] IS NULL
WHEN NOT MATCHED THEN
INSERT
([AccountNumber])
VALUES
(spa.[AccountNumber])
OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], spa.[CompanyId]
INTO @PersonAccountTbl ([AccountId], [AccountNumber], [CompanyId]);
UPDATE spa
SET spa.[TargetAccountId] = pat.[AccountId]
FROM #SourcePersonAccount spa
JOIN @PersonAccountTbl pat
ON pat.[AccountNumber] = spa.[AccountNumber]
AND pat.[CompanyId] = spa.[CompanyId];
INSERT INTO [dbo].[PersonAccount]
([PersonId], [AccountId])
SELECT
spa.[TargetPersonId], spa.[TargetAccountId]
FROM #SourcePersonAccount spa
LEFT JOIN [dbo].[PersonAccount] pa
ON pa.[PersonId] = spa.[TargetPersonId]
AND pa.[AccountId] = spa.[TargetAccountId]
WHERE
pa.[Id] IS NULL;
Is there a way other than MERGE or adding/dropping a column to accomplish this?
You can use a SEQUENCE instead of an IDENTITY column. Then you can assign the IDs to a temp table or table variable before you INSERT the data.
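A minimal sketch of the SEQUENCE approach, using the table names from the question. It assumes dbo.Account.[Id] has been changed to a plain INT column without IDENTITY, and the names dbo.AccountIdSeq and @NewAccount are hypothetical. In a real database the sequence would need to start above the highest existing Id.

```sql
-- Hypothetical sequence; START WITH must be above any existing dbo.Account.[Id].
CREATE SEQUENCE dbo.AccountIdSeq AS INT START WITH 1 INCREMENT BY 1;
GO

DECLARE @NewAccount TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);

-- 1) Assign the new IDs up front, keeping the full source key beside each one.
INSERT INTO @NewAccount ([AccountId], [AccountNumber], [CompanyId])
SELECT NEXT VALUE FOR dbo.AccountIdSeq, [AccountNumber], [CompanyId]
FROM #SourcePersonAccount
WHERE [TargetAccountId] IS NULL;

-- 2) Insert the accounts with the pre-assigned IDs (assumes [Id] is not IDENTITY).
INSERT INTO [dbo].[Account] ([Id], [AccountNumber])
SELECT [AccountId], [AccountNumber]
FROM @NewAccount;

-- 3) Write the IDs back to the staging table using both key columns.
UPDATE spa
SET spa.[TargetAccountId] = na.[AccountId]
FROM #SourcePersonAccount spa
JOIN @NewAccount na
    ON na.[AccountNumber] = spa.[AccountNumber]
   AND na.[CompanyId] = spa.[CompanyId];
```

Because the IDs are known before the INSERT, neither MERGE nor an OUTPUT clause is needed to recover the mapping.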
Related
I have a utility script that is used to insert data into tables in my database. The script has a number of temp tables in it that store the new data to be inserted, and a lot of it is related.
So, for example I have tables like so
DECLARE @Table1 TABLE
(
Table1ID INT,
Table1Description VARCHAR(50),
Table1Code VARCHAR(5)
)
DECLARE @Table2 TABLE
(
Table2ID INT,
Table2Description VARCHAR(50),
Table2Code VARCHAR(5)
)
DECLARE @Relationships TABLE
(
Table1Code VARCHAR(5),
Table2Code VARCHAR(5)
)
So the script populates the data in @Table1 and @Table2, but doesn't populate the ID fields. Once the data has been MERGEd into the database tables, I update the Table1ID and Table2ID fields in a separate statement, as they are auto-incrementing fields. Then when I use the @Relationships table to populate the database table, I can join to @Table1 and @Table2 to get the actual ID values.
I'm updating the script and I'm wondering if I can MERGE the data from @Table1/@Table2 into the database and update the ID fields in the temp table as part of the MERGE statement, using the OUTPUT clause, all in one statement?
I think the answer is no, as I can't find anything mentioning updating an existing table with the OUTPUT clause, only inserting into a table.
I am still able to do what I need to do, so I'm not after alternatives. I am just wondering if it is possible using the OUTPUT clause.
Thanks in advance
I am trying to run the following merge statement to insert a row:
MERGE sales.Widget
USING (
VALUES ('19668651', 4.75))
AS widg (WidgetId, WidgetCost)
ON 1=0
WHEN NOT MATCHED THEN
INSERT (WidgetId, WidgetCost)
VALUES (widg.WidgetId, widg.WidgetCost)
OUTPUT INSERTED.WidgetId
INTO @inserted;
GO
I am confused by the error I am getting:
The column reference "inserted.WidgetId" is not allowed because it refers to a base table that is not being modified in this statement.
I thought that the inserted table was just an in-memory table of the values being passed in to the merge statement.
Why then would it care if I am modifying a "base" table as long as the value was passed in?
I can clearly tell that this is related to the fact that I have a view with an INSTEAD OF INSERT trigger on it (because it works fine against a normal table).
But why does SQL Server not just return the value that was passed in? (WidgetId in this case.)
Here is the script to reproduce the error:
CREATE SCHEMA sales
GO
-- Create the base table
CREATE TABLE sales.Widget_OLD(
WIDGET_ID int NOT NULL,
WIDGET_COST money NOT NULL
CONSTRAINT PK_Widget PRIMARY KEY CLUSTERED (WIDGET_ID ASC)
)
GO
-- Create the overlay view
CREATE VIEW sales.Widget AS
SELECT widg.WIDGET_ID AS WidgetId, widg.WIDGET_COST AS WidgetCost
FROM sales.Widget_OLD widg
GO
-- create the instead of insert trigger
CREATE TRIGGER sales.InsertWidget ON sales.Widget
INSTEAD OF INSERT AS
BEGIN
INSERT INTO sales.Widget_OLD (WIDGET_ID, WIDGET_COST)
SELECT Inserted.WidgetId, inserted.WidgetCost
FROM Inserted
END
GO
DECLARE @inserted TABLE (WidgetId varchar(11) NOT NULL);
MERGE sales.Widget
USING (
VALUES ('19668651', 4.75))
AS widg (WidgetId, WidgetCost)
ON 1=0
WHEN NOT MATCHED THEN
INSERT (WidgetId, WidgetCost)
VALUES (widg.WidgetId, widg.WidgetCost)
OUTPUT INSERTED.WidgetId
INTO @inserted;
GO
-- Clean up
DROP TRIGGER sales.InsertWidget
DROP VIEW sales.Widget
DROP TABLE sales.Widget_OLD
DROP SCHEMA sales
go
NOTE: This is from my Entity Framework Core application when I try to do 3+ inserts (see this question for more details). That question is about how to stop EF Core from using MERGE. This one is to understand what is happening.
I need to insert records into a production table. The problem is that one of the fields needs to be the same value as the primary key.
In the example below, the Insert query is dropping '99' into [AlsoMyID]. But that's just a placeholder. It needs to be whatever value is going into [MyID].
How do I write the Insert query so that the system will add the same PK value to both [MyID] and [AlsoMyID]?
Drop table #mylittletable
Create table #Mylittletable (
[MyID] int IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[AlsoMyID] int,
[ActualData] varchar(1))
Select * from #Mylittletable
Insert into #Mylittletable values (99,'x')
Select * from #Mylittletable
If you're interested in the background, the developer is using AlsoMyID as a linking field so any number of records can be linked together using the original primary key value. That said, I have no control over the table structure.
Firstly, you cannot specify a value for an identity column unless you use SET IDENTITY_INSERT ON. So, according to your requirement, you need to insert the same value into AlsoMyID as the one that goes into MyID.
You can work it out as follows:
insert into Mylittletable
select @@IDENTITY+1,'1'
With this trigger on the table you can insert anything into the AlsoMyID column, and it will be overwritten with whatever gets set in the MyID column.
create trigger tr_Mylittletable ON Mylittletable
AFTER INSERT AS
BEGIN
declare @ID int = (select MyID from inserted)
update Mylittletable set AlsoMyID = @ID where MyID = @ID
END
NOTE: This only works when inserting one row at a time!
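The single-row limitation comes from reading one value out of the inserted pseudo-table. A set-based variant, sketched here under the assumption that it replaces the trigger above rather than running alongside it (the trigger name is hypothetical), joins to inserted so every row of a multi-row INSERT is handled:

```sql
-- CREATE TRIGGER must be the first statement in its batch.
CREATE TRIGGER tr_Mylittletable_SetBased ON Mylittletable
AFTER INSERT AS
BEGIN
    SET NOCOUNT ON;

    -- Copy the generated identity value into AlsoMyID for every inserted row.
    UPDATE t
    SET t.AlsoMyID = i.MyID
    FROM Mylittletable AS t
    JOIN inserted AS i
        ON i.MyID = t.MyID;
END
```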
I ran into a deadlock issue synchronizing a table multiple times in a short period of time. By synchronize I mean doing the following:
Insert data to be synchronized into a temp table
Update existing records in destination table
Insert new records into the destination table
Delete records that are not in the synch table under certain circumstances
Drop temp table
For the INSERT and DELETE statements, I'm using a LEFT JOIN similar to:
INSERT INTO destination_table (fk1, fk2, val1)
SELECT #tmp.fk1, #tmp.fk2, #tmp.val1
FROM #tmp
LEFT JOIN destination_table dt ON dt.fk1 = #tmp.fk1
AND dt.fk2 = #tmp.fk2
WHERE dt.pk IS NULL;
The deadlock graph is reporting the destination_table's primary key is under an exclusive lock. I assume the above query is causing a table or page lock instead of a row lock. How would I confirm that?
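One way to check the lock granularity, sketched here as an assumption about how one might test it rather than a definitive diagnosis, is to run the statement inside an open transaction and then look at the locks the session holds in sys.dm_tran_locks:

```sql
BEGIN TRANSACTION;

-- Run the INSERT ... LEFT JOIN statement under test here.

-- Count the locks held by this session, grouped by granularity and mode.
-- resource_type shows values such as KEY, RID, PAGE or OBJECT.
SELECT resource_type,
       request_mode,
       request_status,
       COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
GROUP BY resource_type, request_mode, request_status;

ROLLBACK TRANSACTION;
```

Many KEY or RID rows with mode X suggest row-level locking, while an X lock on a PAGE or OBJECT resource suggests coarser locking or escalation.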
I could rewrite the above query with an IN, EXISTS or EXCEPT clause. Are there any additional ways of refactoring the code? Will refactoring using any of these avoid the deadlock issue? Which one would be the best? I'm assuming EXCEPT.
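For comparison, a sketch of the same insert written with NOT EXISTS, assuming #tmp has the columns fk1, fk2 and val1. Whether this changes the locking behaviour would have to be checked against the actual execution plan; it is shown only to make the alternative concrete.

```sql
INSERT INTO destination_table (fk1, fk2, val1)
SELECT t.fk1, t.fk2, t.val1
FROM #tmp AS t
WHERE NOT EXISTS (
    -- Skip rows whose (fk1, fk2) pair is already in the destination.
    SELECT 1
    FROM destination_table AS dt
    WHERE dt.fk1 = t.fk1
      AND dt.fk2 = t.fk2
);
```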
Under normal circumstances I could run this scenario without any problem. Below is the test script I created. Are you trying something else?
drop table #destination_table
drop table #tmp
Declare @x int=0
create table #tmp(fk1 int, fk2 int, val int)
set @x=2
while (@x<1000)
begin
insert into #tmp
select @x,@x,100
set @x=@x+3
end
create table #destination_table(fk1 int, fk2 int, val int)
-- reset the counter; it is already past 1000 here, so the loop below would not run otherwise
set @x=0
while (@x<1000)
begin
insert into #destination_table
select @x,@x,100
set @x=@x+1
end
INSERT INTO #destination_table (fk1, fk2, val)
select t.*
FROM #tmp t
LEFT JOIN #destination_table dt ON dt.fk1 = t.fk1
AND dt.fk2 = t.fk2
WHERE dt.fk1 IS NULL
This question already has answers here:
Using merge..output to get mapping between source.id and target.id
The Situation:
I am inserting information from one table to another, a source and target. When the information is inserted into the target, a primary key is created. (In this case it is an integer.) I then need to be able to tie back to the source table. However, based on the data being moved, I am not able to reliably get the 1:1 match between the target and source tables.
The Question:
Is there a way to copy the primary key that was created for record(x) in the target table and copy it as a foreign key to that same record(x) in the source table as the bulk insert is happening?
Details:
I am trying to get this done in SQL. I have a work-around to this problem but I figure there has to be a way to do what I'm asking.
I found my answer after reading this great article.
http://sqlblog.com/blogs/adam_machanic/archive/2009/08/24/dr-output-or-how-i-learned-to-stop-worrying-and-love-the-merge.aspx
I achieved what I was looking for by using a MERGE and its OUTPUT clause. Here is my sample code that I used to figure this out.
I started by creating 3 temporary tables, #Temp2, #Temp3 and #Temp4. #Temp2 is considered the source table. #Temp3 would be the target table and #Temp4 is a bridge. I then inserted a few rows of very simple data, in this case just one field - Value.
CREATE TABLE #Temp2(
OldID INT IDENTITY(1,1),
Value INT,
NewFK INT)
CREATE TABLE #Temp3(
NewerID INT IDENTITY(1,1),
Value INT)
CREATE TABLE #Temp4(
OldID INT NOT NULL,
NewerID INT NOT NULL,
Value INT)
INSERT INTO #Temp2(Value)
VALUES(30), (40), (50), (70)
INSERT INTO #Temp3(Value)
VALUES (333), (444), (555), (777)
Then comes the MERGE statement that does the dirty work. It will be taking the value from #Temp2 and putting it into #Temp3. It will then take the ID created in #Temp3, the ID from #Temp2 and the Value that was passed, and throw them all into #Temp4.
MERGE INTO #Temp3 AS tgt
USING #Temp2 AS src
ON 1=0
WHEN NOT MATCHED THEN
INSERT(
Value)
VALUES(
src.Value)
OUTPUT
src.OldID,
INSERTED.NewerID,
src.Value
INTO #Temp4(OldID, NewerID, Value);
Then I ran an UPDATE to the staging table #Temp2 to update the NewFK field with the new ID. Lastly, do a simple SELECT to see the updated information.
UPDATE X
SET X.NewFK = Z.NewerID
FROM #Temp2 X
JOIN #Temp4 Z
ON X.OldID = Z.OldID
SELECT * FROM #Temp2
This achieved exactly what I needed and is a pretty streamlined way of doing things. I hope this will help some people who come across this question. Thanks everyone for your insight and responses.
NOTE:
I believe MERGE was introduced in SQL Server 2008.
Jonathan
One approach would be to set identity insert for your target table to 'on' (http://msdn.microsoft.com/en-us/library/ms188059.aspx). Then make that identity part of your 'source' data before you run the insert. Just remember to turn identity insert back off again once you're done.
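As a rough sketch of that idea, reusing the #Temp2 (source) and #Temp3 (target) tables from the earlier answer: the new IDs are computed in the source first and then inserted explicitly. This assumes no other session inserts into the target between the two steps; the CTE name and @base variable are hypothetical.

```sql
-- Highest identity value currently in the target (0 if the table is empty).
DECLARE @base INT = (SELECT ISNULL(MAX(NewerID), 0) FROM #Temp3);

-- Pre-assign the target IDs in the source table.
WITH numbered AS (
    SELECT NewFK,
           ROW_NUMBER() OVER (ORDER BY OldID) AS rn
    FROM #Temp2
)
UPDATE numbered
SET NewFK = @base + rn;

-- Insert with explicit identity values, then switch the option back off.
SET IDENTITY_INSERT #Temp3 ON;

INSERT INTO #Temp3 (NewerID, Value)
SELECT NewFK, Value
FROM #Temp2;

SET IDENTITY_INSERT #Temp3 OFF;
```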
EDIT
Not sure what your situation is, but one approach I've taken in the past is to create a field to hold an 'external source ID', just in case I needed to refer back to the source at some point in the future. In my case, this was for reference only, not normal transactional use.
If you can get a SharedExtPK in the target then this should work.
In this case logID is the PK of the source.
Tested:
DECLARE @MyTableVar table(
TargetPK int NOT NULL,
SourcePK int NOT NULL
);
INSERT INTO IdenOutPut (someValue, sharedExtKey)
OUTPUT INSERTED.iden,
INSERTED.sharedExtKey
INTO @MyTableVar
SELECT name, logID
FROM CatID
update sPK
set sPK.ExtPK = tTbl.TargetPK
FROM @MyTableVar as tTbl
JOIN CatID as sPK
on sPK.logID = tTbl.SourcePK
GO
If the values you insert are unique then could use that.
But it would get trickier.