Table variable in SQL Server - sql-server

I am using SQL Server 2005. I have heard that a table variable can be used instead of a LEFT OUTER JOIN.
What I understand is that we first put all the values from the left table into the table variable, then UPDATE the table variable with the values from the right table, and finally select from the table variable.
Has anyone come across this kind of approach? Could you please show a real example (with a query)?
I have not written any query for this. My question is: if someone has used a similar approach, I would like to know the scenario and how it was handled. I understand that in some cases it may be slower than the LEFT OUTER JOIN.
Please assume that we are dealing with tables that have less than 5000 records.
Thanks

It can be done, but I have no idea why you would ever want to do it.
This really does seem like it is being done backwards. But if you are trying this for your own learning only, here goes:
DECLARE @MainTable TABLE(
    ID INT,
    Val FLOAT
)
INSERT INTO @MainTable SELECT 1, 1
INSERT INTO @MainTable SELECT 2, 2
INSERT INTO @MainTable SELECT 3, 3
INSERT INTO @MainTable SELECT 4, 4
DECLARE @LeftTable TABLE(
    ID INT,
    MainID INT,
    Val FLOAT
)
INSERT INTO @LeftTable SELECT 1, 1, 11
INSERT INTO @LeftTable SELECT 3, 3, 33
SELECT *,
    mt.Val + ISNULL(lt.Val, 0)
FROM @MainTable mt LEFT JOIN
    @LeftTable lt ON mt.ID = lt.MainID
DECLARE @Table TABLE(
    ID INT,
    Val FLOAT
)
INSERT INTO @Table
SELECT ID,
    Val
FROM @MainTable
UPDATE t
SET Val = t.Val + lt.Val
FROM @Table t INNER JOIN
    @LeftTable lt ON t.ID = lt.ID
SELECT *
FROM @Table

It's not very clear from your question what you want to achieve (what your tables look like, and what result you want). But you can certainly select data into a variable of a table datatype and tamper with it. It's quite convenient:
DECLARE @tbl TABLE (id INT IDENTITY(1,1), userId int, foreignId int)
INSERT INTO @tbl (userId)
SELECT id FROM users
WHERE name LIKE 'a%'
UPDATE t
SET
    foreignId = (SELECT id FROM foreignTable f WHERE f.userId = t.userId)
FROM @tbl t
In that example I gave the table variable an identity column of its own, distinct from the one in the source table. I often find that useful. Adjust as you like... Again, it's not very clear what the question is, but I hope this might guide you in the right direction...?

Every scenario is different, and without full details on a specific case it's difficult to say whether it would be a good approach for you.
Having said that, I would not be looking to use the table variable approach unless I had a specific functional reason to - if the query can be fulfilled with a standard SELECT query using an OUTER JOIN, then I'd use that as I'd expect that to be most efficient.
The times when you may want to use a temp table or table variable instead are more when you want to get an intermediate resultset and then do some processing on it before returning it, i.e. the kind of processing that cannot be done with a straightforward query.
Note that table variables are very handy, but take into account that they are not guaranteed to reside in memory; they can get persisted to tempdb just like standard temp tables.
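As a rough sketch of that intermediate-resultset pattern (all table and column names below are invented for the example):
-- Placeholder tables (dbo.Orders, dbo.Customers) just to illustrate the shape.
DECLARE @OrderTotals TABLE (CustomerID INT PRIMARY KEY, Total MONEY)
-- Stage an aggregate once...
INSERT INTO @OrderTotals (CustomerID, Total)
SELECT CustomerID, SUM(Amount)
FROM dbo.Orders
GROUP BY CustomerID
-- ...do some follow-up processing that would be awkward in a single query...
UPDATE @OrderTotals SET Total = Total * 0.9 WHERE Total > 1000
-- ...then return the final result.
SELECT c.CustomerID, c.Name, ISNULL(ot.Total, 0) AS Total
FROM dbo.Customers c
LEFT JOIN @OrderTotals ot ON ot.CustomerID = c.CustomerID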

Thank you, astander.
I tried with the example given below. Both of the approaches took 19 seconds. However, I guess some tuning could help the table variable update approach become faster than the LEFT JOIN.
As I am not a master of tuning, I request your help. Any SQL expert ready to prove it?
CREATE TABLE #MainTable (
CustomerID INT PRIMARY KEY,
FirstName VARCHAR(100)
)
DECLARE @Count INT
SET @Count = 0
DECLARE @Iterator INT
SET @Iterator = 0
WHILE @Count < 8000
BEGIN
INSERT INTO #MainTable SELECT @Count, 'Cust' + CONVERT(VARCHAR(10), @Count)
SET @Count = @Count + 1
END
CREATE TABLE #RightTable
(
OrderID INT PRIMARY KEY,
CustomerID INT,
Product VARCHAR(100)
)
CREATE INDEX [IDX_CustomerID] ON #RightTable (CustomerID)
WHILE @Iterator < 400000
BEGIN
IF @Iterator % 2 = 0
BEGIN
INSERT INTO #RightTable SELECT @Iterator, 2, 'Prod' + CONVERT(VARCHAR(10), @Iterator)
END
ELSE
BEGIN
INSERT INTO #RightTable SELECT @Iterator, 1, 'Prod' + CONVERT(VARCHAR(10), @Iterator)
END
SET @Iterator = @Iterator + 1
END
-- Using LEFT JOIN
SELECT mt.CustomerID,mt.FirstName,COUNT(rt.Product) [CountResult]
FROM #MainTable mt
LEFT JOIN #RightTable rt ON mt.CustomerID = rt.CustomerID
GROUP BY mt.CustomerID,mt.FirstName
---------------------------
-- Using Table variable Update
DECLARE @WorkingTableVariable TABLE
(
CustomerID INT,
FirstName VARCHAR(100),
ProductCount INT
)
INSERT
INTO @WorkingTableVariable (CustomerID, FirstName)
SELECT CustomerID, FirstName FROM #MainTable
UPDATE wt
SET ProductCount = IV.[Count]
FROM @WorkingTableVariable wt
INNER JOIN
(SELECT CustomerID, COUNT(rt.Product) AS [Count]
FROM #RightTable rt
GROUP BY CustomerID) IV ON wt.CustomerID = IV.CustomerID
SELECT CustomerID, FirstName, ISNULL(ProductCount, 0) [CountResult] FROM @WorkingTableVariable
ORDER BY CustomerID
--------
DROP TABLE #MainTable
DROP TABLE #RightTable
Thanks
Lijo

In my opinion there is one reason to do this:
If you have a complicated query with lots of inner joins and one left join, you sometimes get into trouble because that query is hundreds of times slower than the same query without the left join.
If you query lots of records but only a very small result set needs to be joined to the left-joined table, you can get faster results by materializing the intermediate result into a table variable or temp table.
But usually there is no need to actually update the data in the table variable; you can query the table variable with the left join to return the result.
... just my two cents.
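A minimal sketch of that idea, with invented table names: materialize the inner-join part once, then apply the LEFT JOIN against the much smaller table variable.
-- All table names here (Orders, Customers, Regions, Discounts) are placeholders.
DECLARE @Intermediate TABLE (OrderID INT, CustomerID INT, Amount MONEY)
-- Materialize the heavy inner-join part once.
INSERT INTO @Intermediate (OrderID, CustomerID, Amount)
SELECT o.OrderID, o.CustomerID, o.Amount
FROM dbo.Orders o
INNER JOIN dbo.Customers c ON c.CustomerID = o.CustomerID
INNER JOIN dbo.Regions r ON r.RegionID = c.RegionID
WHERE r.Name = 'EMEA'
-- No need to UPDATE the table variable; just left join to it for the final result.
SELECT i.OrderID, i.Amount, d.DiscountPct
FROM @Intermediate i
LEFT JOIN dbo.Discounts d ON d.CustomerID = i.CustomerID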

Related

Efficient query to filter for list of values across columns/distinct rows

SQL Server version is 2016+/Azure SqlDb (flexible if additive, compatible with both, forward-compatible).
Use case is API users sending a single-column list of values to filter some target table. The target table has 2-n columns whose values are unique across rows (maybe columns, doesn't matter) for the table/range being queried. (So far n <= 5, but that's a detail/not guaranteed.)
Here's a good-enough sample table DDL:
IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'SomeTable')
BEGIN
CREATE TABLE dbo.SomeTable (
ID int IDENTITY(1, 1) not null PRIMARY KEY CLUSTERED
, NaturalKey1 nvarchar(10) not null UNIQUE NONCLUSTERED
, NaturalKey2 nvarchar(10) not null UNIQUE NONCLUSTERED
, NaturalKey3 nvarchar(10) not null UNIQUE NONCLUSTERED
);
END
IF NOT EXISTS (SELECT 1 FROM dbo.SomeTable)
BEGIN
INSERT INTO dbo.SomeTable VALUES
('A', 'AA', 'ZZZZZ')
,('B', 'B', 'YYYYY')
,('C', 'CC', 'XXX')
,('D', 'DDD', 'WWWWW')
,('E', 'EEEE', 'V')
,('F', 'FF', 'UUUUUUUUU')
,('G', 'GGGGGGGG', 's')
-- lots more
;
END
SELECT * FROM dbo.SomeTable;
-- DROP TABLE dbo.SomeTable;
Assumptions are that all NaturalKey columns are of same type (probably nvarchar); filtering happens db-side; and in as few steps as possible, ideally one execution, in a stored procedure. Parameter will be string list or TVP, doesn't matter really. Result will include all data in any row of SomeTable that matches any value on any column. Target table is of unknown size.
Here's an example parameter for our pal above:
DECLARE @filterValues nvarchar(1000) = 'DDD,XXX,E,HH,ok,whatever,YYYYY';
SELECT * FROM string_split(@filterValues, ',');
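If the parameter arrives as a TVP rather than a CSV string, the equivalent of the snippet above might look like this (the type name dbo.FilterValueList is invented for the example):
-- Hypothetical table type for passing the filter list as a TVP.
CREATE TYPE dbo.FilterValueList AS TABLE (Value nvarchar(100) NOT NULL);
GO
DECLARE @filterValuesTvp dbo.FilterValueList;
INSERT INTO @filterValuesTvp (Value)
VALUES (N'DDD'), (N'XXX'), (N'E'), (N'HH'), (N'ok'), (N'whatever'), (N'YYYYY');
SELECT * FROM @filterValuesTvp;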
I know a couple ways to do this, and can imagine several more, so it's not that kind of stuck. I'll bet someone knows a better trick than either of the two I'll illustrate.
Approach 1 Build a temp table updated for existence and join on it (concise and nice to audit, that's about it for pros)
DECLARE @filterValues nvarchar(1000) = 'DDD,XXX,E,HH,ok,whatever,YYYYY';
SELECT value AS InValue, CONVERT(int, null) AS IDMatch
INTO #filters
FROM string_split(@filterValues, ',');
UPDATE f
SET f.IDMatch = st.ID
FROM #filters AS f
INNER JOIN dbo.SomeTable AS st ON f.InValue IN (st.NaturalKey1, st.NaturalKey2, st.NaturalKey3);
SELECT * FROM #filters; -- Audit
SELECT st.* FROM #filters AS f INNER JOIN dbo.SomeTable AS st ON f.IDMatch = st.ID;
IF OBJECT_ID('tempdb..#filters') IS NOT NULL DROP TABLE #filters;
Approach 2 Unpivot SomeTable (I like the nifty cross apply trick) and just join (at scale there be ogres methinks)
SELECT
st.*
FROM
dbo.SomeTable AS st
CROSS APPLY (VALUES (st.NaturalKey1)
, (st.NaturalKey2)
, (st.NaturalKey3)
) AS nk(Val)
INNER JOIN #filters AS f ON nk.Val = f.InValue;
IF OBJECT_ID('tempdb..#filters') IS NOT NULL DROP TABLE #filters;
Is there a question in our future
Works is better than doesn't work, but looking for better/more efficient/more scalable methods from the T-SQL gurus. Unknown dimensions in columns and rows, response time is an SLA, filter size may or may not be capped. Bonus points if this ports neatly from SomeTable to SomeTableVersionN. (No dynamic SQL in an API.)
Could be dupe question, couldn't find it, pointing that out is just fine thank you.

How do I loop through a table, search with that data, and then return search criteria and result to new table?

I have a set of records that need to be validated (searched) in a SQL table. I will call these ValData and SearchTable respectively. A colleague created a SQL query in which a record from the ValData can be copied and pasted in to a string variable, and then it is searched in the SearchTable. The best result from the SearchTable is returned. This works very well.
I want to automate this process. I loaded the ValData to SQL in a table like so:
RowID INT, FirstName, LastName, DOB, Date1, Date2, TextDescription.
I want to loop through this set of data, by RowID, and then create a result table that is the ValData joined with the best match from the SearchTable. Again, I already have a query that does that portion. I just need the loop portion, and my SQL skills are virtually non-existent.
Pseudocode would be:
DECLARE @SearchID INT = 1
DECLARE @MaxSearchID INT = 15000
DECLARE @FName VARCHAR(50) = ''
DECLARE @LName VARCHAR(50) = ''
etc...
WHILE @SearchID <= @MaxSearchID
BEGIN
SET @FName = (SELECT [Fname] FROM ValData WHERE [RowID] = @SearchID)
SET @LName = (SELECT [Lname] FROM ValData WHERE [RowID] = @SearchID)
etc...
Do colleague's query, and then insert(?) the search criteria joined with the result from the SearchTable into a temporary result table.
END
SELECT * FROM FinalResultTable;
My biggest lack of knowledge is in how to create a temporary result table that contains ValData's fields plus SearchTable's fields, and, during the loop iterations, how to add one row at a time to this temporary result table so that it holds the ValData joined with the result from the SearchTable.
If it helps, I'm using/wanting to join all fields from ValData and all fields from SearchTable.
Wouldn't this be far easier with a query like this..?
SELECT FNAME,
LNAME
FROM ValData
WHERE (FName = @Fname
OR LName = @Lname)
AND RowID <= @MaxSearchID
ORDER BY RowID ASC;
There is literally no reason to use a WHILE other than to destroy performance of the query.
With a bit more trial and error, I was able to answer what I was looking for (which, at its core, was creating a temp table and then inserting rows into it).
CREATE TABLE #RESULTTABLE(
[feedname] VARCHAR(100),
...
[SCORE] INT,
[Max Score] INT,
[% Score] FLOAT(4),
[RowID] SMALLINT
)
SET @SearchID = 1
SET @MaxSearchID = (SELECT MAX([RowID]) FROM ValidationData)
WHILE @SearchID <= @MaxSearchID
BEGIN
SET @FNAME = (SELECT [Fname] FROM ValidationData WHERE [RowID] = @SearchID)
...
--BEST MATCH QUERY HERE
--Select the "top" best match (order not guaranteed) into the RESULTTABLE.
INSERT INTO #RESULTTABLE
SELECT TOP 1 *, @SearchID AS RowID
--INTO #RESULTTABLE
FROM #TABLE3
WHERE [% Score] IN (SELECT MAX([% Score]) FROM #TABLE3)
--Drop temp tables that were created/used during best match query.
DROP TABLE #TABLE1
DROP TABLE #TABLE2
DROP TABLE #TABLE3
SET @SearchID = @SearchID + 1
END;
--Join the data that was validated (searched) to the results that were found.
SELECT *
FROM ValidationData vd
LEFT JOIN #RESULTTABLE rt ON rt.[RowID] = vd.[RowID]
ORDER BY vd.[RowID]
DROP TABLE #RESULTTABLE
I know this could be improved by doing a join, probably with the "BEST MATCH QUERY" as an inner query. I am just not that skilled yet. This takes a manual process which took hours upon hours and shortens it to just an hour or so.
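For what it's worth, a rough sketch of what that set-based shape might look like, with the colleague's best-match logic reduced to a stand-in OUTER APPLY (the matching and scoring predicates below are placeholders, not the real query):
SELECT vd.*, bm.*
FROM ValidationData vd
OUTER APPLY (
    SELECT TOP (1) st.*
    FROM SearchTable st
    WHERE st.LastName = vd.LastName   -- placeholder for the real matching logic
    ORDER BY st.DOB DESC              -- placeholder for the real scoring/ordering
) bm
ORDER BY vd.RowID;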

TSQL infinite loop in cursor when joining temp table only

I have an SP that updates prices in a table using a CURSOR and FETCH NEXT.
The cursor's query is a simple SELECT, and instead of a WHERE clause I'm joining a function that returns all the related user IDs.
Now, because my SP updates a lot of different tables (attached is only part of it), and every update uses the JOIN to the users function, I wanted to save some time and fetch the users list only once, at the beginning of the SP.
If I join my table variable (@t, which holds the user IDs) I get an infinite loop.
In the end the prices become infinite.
BUT if I join the function itself, instead of the table variable that should hold exactly the same values, everything is fine.
I commented out the problematic JOIN.
-- This is my main object, who "holds" a lot of users
DECLARE @DistID INT = 123
DECLARE @t TABLE
(MainDistID int,
DistID int,
UserID int)
INSERT INTO @t
SELECT MainDistID,
DistID,
UserID
FROM [MyScheme].[FnGetAllDistAndUsers] (@DistID)
DECLARE @Bid INT -- Will contain the ID we need to update
DECLARE c CURSOR LOCAL FOR
SELECT ID
FROM BillingDetails AS bd
-- BOTH JOINS SHOULD BE THE SAME:
--JOIN @t AS GUsers -- Infinite loop...
-- ON bd.UserID = GUsers.UserID
JOIN [MyScheme].[FnGetAllDistAndUsers] (@DistID) AS GUsers -- NO infinite loop
ON bd.UserID = GUsers.UserID
OPEN c
FETCH NEXT FROM c
INTO @Bid
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE MyTable
SET Price = Price * 2
WHERE MyTableID = @Bid
FETCH NEXT FROM c
INTO @Bid
END
Thanks!
I think Mitch is right: you shouldn't have to use a cursor. When doing updates it is fastest if you join on the primary key, so you could do something like this:
DECLARE @t TABLE (Bid int)
INSERT INTO @t
SELECT bd.ID
FROM [MyScheme].[FnGetAllDistAndUsers](@DistID) u
INNER JOIN BillingDetails bd
ON u.UserID = bd.UserID
UPDATE mt
SET Price = Price * 2
FROM MyTable mt
INNER JOIN @t t
ON mt.MyTableID = t.Bid

How optimize a TSQL query using a while (fetching and adding data in the same table)?

I'm writing a procedure that gets all the parts and sub-parts of a specific machine. It's possible that a sub-part also has sub-parts, and so on. At the top level the part has a revision number; below that I take the maximum revision number of the sub-part, because I want the latest (each revision has the same sub-parts, it's just the drawing that changes, and I don't care about that). So far I have this:
CREATE TABLE [dbo].[PartMtl_3](
[PartNum] [nvarchar](50) ,
[RevisionNum] [nvarchar](16) NULL,
[MtlPartNum] [nvarchar](70)
) ON [PRIMARY]
--Here I can't put a primary key because
--it should be PartNum, RevisionNum and MtlPartNum together,
--but I know some of the data have NULL in the revision
--and a primary key column can't be NULL
Input parameter for the procedure: @Machine VARCHAR(30)
DECLARE @Mytable TABLE
(
id INT IDENTITY(1,1) NOT NULL,
PartNum VARCHAR(70),
RevisionNum VARCHAR(16),
Processed TINYINT,
ParentId INT
)
DECLARE @ID INT;
DECLARE @PartNum VARCHAR(70)
DECLARE @RevisionNum VARCHAR(16)
INSERT INTO @Mytable(PartNum,RevisionNum,Processed,ParentId)
SELECT PartNum,RevisionNum,0,NULL WHERE Machine=@Machine
--With this insert I have my top parts for the machine
WHILE (SELECT COUNT(*) FROM @Mytable WHERE Processed=0)>0
BEGIN
SELECT TOP 1 @ID=id, @PartNum=PartNum, @RevisionNum=RevisionNum FROM @Mytable WHERE Processed=0
INSERT INTO @Mytable(PartNum,RevisionNum,Processed,ParentId)
SELECT MtlPartNum,(SELECT MAX(RevisionNum) FROM PartMtl_3
WHERE PartNum=MtlPartNum) AS RevisionNum,0,@ID FROM PartMtl_3
WHERE PartNum=@PartNum AND RevisionNum=@RevisionNum
UPDATE @Mytable SET Processed=1 WHERE id=@ID
END
--other code here to basically insert the result of @Mytable in a real table
I know this procedure works, but it's really slow and eats all the resources of the server. So is it possible to optimize it, either by using a cursor (I think what I have is already quite close to one, so I have doubts about that approach) or with a set-based operation?
Basically, what I would do is create a table variable to hold all the unprocessed parts you retrieved for each pass.
This would allow you to loop less often and update multiple entries at the same time.
Also, I would create a CTE to compute the max RevisionNum in advance; this removes the need to rerun the query for each record (which is basically what happens when you put a subquery in the select list).
Then I would use joins to insert what is required, and update the processed parts in @Mytable at the end of the loop.
This is what the code should look like.
It might need some tweaking since it couldn't be tested against your set of data, but it should be really close to what you need.
declare @Unprocessed TABLE
(
id INT NOT NULL,
PartNum VARCHAR(70),
RevisionNum VARCHAR(16)
)
WHILE (SELECT COUNT(*) FROM @Mytable WHERE Processed=0)>0
BEGIN
delete from @Unprocessed
insert into @Unprocessed
select id, PartNum, RevisionNum
FROM @Mytable Where Processed=0
;WITH cte_partNum as
( select PartNum, MAX(RevisionNum) as RevisionNum
FROM PartMtl_3 group by PartNum )
INSERT INTO @Mytable(PartNum,RevisionNum,Processed,ParentId)
SELECT PartMtl_3.MtlPartNum, cte_partNum.RevisionNum ,0, Unprocessed.id
FROM PartMtl_3
inner join cte_partNum
on PartMtl_3.MtlPartNum = cte_partNum.PartNum
inner join @Unprocessed as Unprocessed
on Unprocessed.PartNum = PartMtl_3.PartNum
and Unprocessed.RevisionNum = PartMtl_3.RevisionNum
UPDATE tbl
set Processed = 1
from @Mytable tbl
inner join @Unprocessed as Unprocessed
on tbl.id = Unprocessed.id
END
Some corrections were made based on the OP's comments.

Using merge..output to get mapping between source.id and target.id

Very simplified, I have two tables Source and Target.
declare @Source table (SourceID int identity(1,2), SourceName varchar(50))
declare @Target table (TargetID int identity(2,2), TargetName varchar(50))
insert into @Source values ('Row 1'), ('Row 2')
I would like to move all rows from @Source to @Target and know the TargetID for each SourceID, because there are also tables SourceChild and TargetChild that need to be copied as well, and I need to put the new TargetID into the TargetChild.TargetID FK column.
There are a couple of solutions to this.
Use a while loop or cursors to insert one row at a time (RBAR) into Target and use scope_identity() to fill the FK of TargetChild.
Add a temp column to @Target and store SourceID in it. You can then join on that column to fetch the TargetID for the FK in TargetChild (a rough sketch of this appears below).
SET IDENTITY_INSERT ON for @Target and handle assigning the new values yourself. You get a range that you then use in TargetChild.TargetID.
I'm not all that fond of any of them. The one I have used so far is cursors.
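A rough sketch of the temp-column option, assuming real tables rather than the table variables of the simplified example (SomeChildColumn is an invented column name):
-- Sketch only: add a throwaway mapping column to the target.
ALTER TABLE Target ADD SourceID int NULL;
INSERT INTO Target (TargetName, SourceID)
SELECT SourceName, SourceID
FROM Source;
-- The mapping is now available when copying the child rows.
INSERT INTO TargetChild (TargetID, SomeChildColumn)
SELECT t.TargetID, sc.SomeChildColumn
FROM SourceChild sc
JOIN Target t ON t.SourceID = sc.SourceID;
-- Clean up the helper column afterwards.
ALTER TABLE Target DROP COLUMN SourceID;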
What I would really like to do is to use the output clause of the insert statement.
insert into @Target(TargetName)
output inserted.TargetID, S.SourceID
select SourceName
from @Source as S
But it is not possible
The multi-part identifier "S.SourceID" could not be bound.
But it is possible with a merge.
merge @Target as T
using @Source as S
on 0=1
when not matched then
insert (TargetName) values (SourceName)
output inserted.TargetID, S.SourceID;
Result
TargetID SourceID
----------- -----------
2 1
4 3
I want to know if you have used this. Do you have any thoughts about the solution, or see any problems with it? It works fine in simple scenarios, but perhaps something ugly could happen when the query plan gets really complicated due to a complicated source query. The worst-case scenario would be that the TargetID/SourceID pairs actually aren't a match.
MSDN has this to say about the from_table_name of the output clause.
Is a column prefix that specifies a table included in the FROM clause of a DELETE, UPDATE, or MERGE statement that is used to specify the rows to update or delete.
For some reason they don't say "rows to insert, update or delete" only "rows to update or delete".
Any thoughts are welcome and totally different solutions to the original problem is much appreciated.
In my opinion this is a great use of MERGE and OUTPUT. I've used it in several scenarios and haven't experienced any oddities to date.
For example, here is a test setup that clones a Folder and all Files (identity) within it into a newly created Folder (guid).
DECLARE @FolderIndex TABLE (FolderId UNIQUEIDENTIFIER PRIMARY KEY, FolderName varchar(25));
INSERT INTO @FolderIndex
(FolderId, FolderName)
VALUES(newid(), 'OriginalFolder');
DECLARE @FileIndex TABLE (FileId int identity(1,1) PRIMARY KEY, FileName varchar(10));
INSERT INTO @FileIndex
(FileName)
VALUES('test.txt');
DECLARE @FileFolder TABLE (FolderId UNIQUEIDENTIFIER, FileId int, PRIMARY KEY(FolderId, FileId));
INSERT INTO @FileFolder
(FolderId, FileId)
SELECT FolderId,
FileId
FROM @FolderIndex
CROSS JOIN @FileIndex; -- just to illustrate
DECLARE @sFolder TABLE (FromFolderId UNIQUEIDENTIFIER, ToFolderId UNIQUEIDENTIFIER);
DECLARE @sFile TABLE (FromFileId int, ToFileId int);
-- copy Folder Structure
MERGE @FolderIndex fi
USING ( SELECT 1 [Dummy],
FolderId,
FolderName
FROM @FolderIndex [fi]
WHERE FolderName = 'OriginalFolder'
) d ON d.Dummy = 0
WHEN NOT MATCHED
THEN INSERT
(FolderId, FolderName)
VALUES (newid(), 'copy_'+FolderName)
OUTPUT d.FolderId,
INSERTED.FolderId
INTO @sFolder (FromFolderId, toFolderId);
-- copy File structure
MERGE @FileIndex fi
USING ( SELECT 1 [Dummy],
fi.FileId,
fi.[FileName]
FROM @FileIndex fi
INNER
JOIN @FileFolder fm ON
fi.FileId = fm.FileId
INNER
JOIN @FolderIndex fo ON
fm.FolderId = fo.FolderId
WHERE fo.FolderName = 'OriginalFolder'
) d ON d.Dummy = 0
WHEN NOT MATCHED
THEN INSERT ([FileName])
VALUES ([FileName])
OUTPUT d.FileId,
INSERTED.FileId
INTO @sFile (FromFileId, toFileId);
-- link new files to Folders
INSERT INTO @FileFolder (FileId, FolderId)
SELECT sfi.toFileId, sfo.toFolderId
FROM @FileFolder fm
INNER
JOIN @sFile sfi ON
fm.FileId = sfi.FromFileId
INNER
JOIN @sFolder sfo ON
fm.FolderId = sfo.FromFolderId
-- return
SELECT *
FROM @FileIndex fi
JOIN @FileFolder ff ON
fi.FileId = ff.FileId
JOIN @FolderIndex fo ON
ff.FolderId = fo.FolderId
I would like to add another example to @Nathan's, as I found his somewhat confusing.
Mine uses real tables for the most part, and not temp tables.
I also got my inspiration from here: another example
-- Copy the FormSectionInstance
DECLARE @FormSectionInstanceTable TABLE(OldFormSectionInstanceId INT, NewFormSectionInstanceId INT)
;MERGE INTO [dbo].[FormSectionInstance]
USING
(
SELECT
fsi.FormSectionInstanceId [OldFormSectionInstanceId]
, @NewFormHeaderId [NewFormHeaderId]
, fsi.FormSectionId
, fsi.IsClone
, @UserId [NewCreatedByUserId]
, GETDATE() NewCreatedDate
, @UserId [NewUpdatedByUserId]
, GETDATE() NewUpdatedDate
FROM [dbo].[FormSectionInstance] fsi
WHERE fsi.[FormHeaderId] = @FormHeaderId
) tblSource ON 1=0 -- use always false condition
WHEN NOT MATCHED
THEN INSERT
( [FormHeaderId], FormSectionId, IsClone, CreatedByUserId, CreatedDate, UpdatedByUserId, UpdatedDate)
VALUES( [NewFormHeaderId], FormSectionId, IsClone, NewCreatedByUserId, NewCreatedDate, NewUpdatedByUserId, NewUpdatedDate)
OUTPUT tblSource.[OldFormSectionInstanceId], INSERTED.FormSectionInstanceId
INTO @FormSectionInstanceTable(OldFormSectionInstanceId, NewFormSectionInstanceId);
-- Copy the FormDetail
INSERT INTO [dbo].[FormDetail]
(FormHeaderId, FormFieldId, FormSectionInstanceId, IsOther, Value, CreatedByUserId, CreatedDate, UpdatedByUserId, UpdatedDate)
SELECT
@NewFormHeaderId, FormFieldId, fsit.NewFormSectionInstanceId, IsOther, Value, @UserId, CreatedDate, @UserId, UpdatedDate
FROM [dbo].[FormDetail] fd
INNER JOIN @FormSectionInstanceTable fsit ON fsit.OldFormSectionInstanceId = fd.FormSectionInstanceId
WHERE [FormHeaderId] = @FormHeaderId
Here's a solution that doesn't use MERGE (which I've had problems with often enough that I try to avoid it where possible). It relies on two memory tables (you could use temp tables if you want) with IDENTITY columns that get matched, and, importantly, on using ORDER BY when doing the INSERTs and WHERE conditions that match between the two INSERTs. The first table holds the source IDs and the second holds the target IDs.
-- Setup... We have a table that we need to know the old IDs and new IDs after copying.
-- We want to copy all of DocID=1
DECLARE @newDocID int = 99;
DECLARE @tbl table (RuleID int PRIMARY KEY NOT NULL IDENTITY(1, 1), DocID int, Val varchar(100));
INSERT INTO @tbl (DocID, Val) VALUES (1, 'RuleA-2'), (1, 'RuleA-1'), (2, 'RuleB-1'), (2, 'RuleB-2'), (3, 'RuleC-1'), (1, 'RuleA-3')
-- Create a break in IDENTITY values.. just to simulate more realistic data
INSERT INTO @tbl (Val) VALUES ('DeleteMe'), ('DeleteMe');
DELETE FROM @tbl WHERE Val = 'DeleteMe';
INSERT INTO @tbl (DocID, Val) VALUES (6, 'RuleE'), (7, 'RuleF');
SELECT * FROM @tbl t;
-- Declare TWO memory tables each with an IDENTITY - one will hold the RuleID of the items we are copying, the other will hold the RuleID that we create
DECLARE @input table (RID int IDENTITY(1, 1), SourceRuleID int NOT NULL, Val varchar(100));
DECLARE @output table (RID int IDENTITY(1,1), TargetRuleID int NOT NULL, Val varchar(100));
-- Capture the IDs of the rows we will be copying by inserting them into the @input table
-- Important - we must specify the sort order - best thing is to use the IDENTITY of the source table (t.RuleID) that we are copying
INSERT INTO @input (SourceRuleID, Val) SELECT t.RuleID, t.Val FROM @tbl t WHERE t.DocID = 1 ORDER BY t.RuleID;
-- Copy the rows, and use the OUTPUT clause to capture the IDs of the inserted rows.
-- Important - we must use the same WHERE and ORDER BY clauses as above
INSERT INTO @tbl (DocID, Val)
OUTPUT Inserted.RuleID, Inserted.Val INTO @output(TargetRuleID, Val)
SELECT @newDocID, t.Val FROM @tbl t
WHERE t.DocID = 1
ORDER BY t.RuleID;
-- Now @input and @output should have the same # of rows, and the order of both inserts was the same, so the IDENTITY columns (RID) can be matched
-- Use this as the map from old-to-new when you are copying sub-table rows
-- Technically, @input and @output don't even need the 'Val' columns, just RID and RuleID - they were included here to prove that the rules matched
SELECT i.*, o.* FROM @output o
INNER JOIN @input i ON i.RID = o.RID
-- Confirm the matching worked
SELECT * FROM @tbl t
