SQL capture output from Merge into temp table - sql-server

I followed this code from another post:
MERGE [Destination] AS d
USING (SELECT * FROM #SourceForMerge) AS s
ON d.PrimaryKey = s.PrimaryKey
WHEN MATCHED THEN
-- UPDATE columns
WHEN NOT MATCHED THEN
-- INSERT columns
OUTPUT
-- columns
INTO
#OutputTempTable
I was using an actual table before (with an INSERT INTO <Table> before the merge), but I added a foreign key, which is not allowed. So I am trying to import into a temp table, then into my main table. The documentation says I can use a temp table and that the column list is not mandatory. However, I can't get it to work, even if I declare the columns.
OUTPUT @BatchId, $action INTO #OutputTempTable (batchid, what);
I am getting this error:
Invalid object name '#OutputTempTable'
I know I am probably missing something basic, but it has been a long week.
Thanks
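One possible cause, offered as a sketch rather than a confirmed diagnosis: OUTPUT ... INTO writes to an existing table and does not create it, so #OutputTempTable has to be created in the same session before the MERGE runs. The table definitions below (#Destination, the Payload column, the value 7 for @BatchId) are hypothetical stand-ins for the objects the question does not show.

```sql
-- Hypothetical stand-ins for the real destination and source tables.
CREATE TABLE #Destination (PrimaryKey INT PRIMARY KEY, Payload VARCHAR(50));
CREATE TABLE #SourceForMerge (PrimaryKey INT PRIMARY KEY, Payload VARCHAR(50));
INSERT INTO #Destination (PrimaryKey, Payload) VALUES (1, 'old');
INSERT INTO #SourceForMerge (PrimaryKey, Payload) VALUES (1, 'new'), (2, 'added');

-- The OUTPUT ... INTO target must already exist; MERGE will not create it.
CREATE TABLE #OutputTempTable (batchid INT, what NVARCHAR(10));

DECLARE @BatchId INT = 7;  -- assumed batch identifier

MERGE #Destination AS d
USING (SELECT PrimaryKey, Payload FROM #SourceForMerge) AS s
    ON d.PrimaryKey = s.PrimaryKey
WHEN MATCHED THEN
    UPDATE SET Payload = s.Payload
WHEN NOT MATCHED THEN
    INSERT (PrimaryKey, Payload) VALUES (s.PrimaryKey, s.Payload)
OUTPUT @BatchId, $action INTO #OutputTempTable (batchid, what);

-- One row per affected destination row, tagged with the action taken.
SELECT batchid, what FROM #OutputTempTable;

DROP TABLE #OutputTempTable, #SourceForMerge, #Destination;
```

The captured rows can then be copied from #OutputTempTable into the main table with a plain INSERT ... SELECT, which is not subject to the restrictions on OUTPUT INTO targets.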

Related

MERGE into first table based on id of second table if not matched

I have two tables, table_1 and table_2. After some data is inserted into table_1 (the insert is not in the example below), table_1 assigns it an auto-increment ID. I then need to put this newly generated ID into table_2 in the NOT MATCHED section.
I am trying to use T-SQL's MERGE to UPDATE the table if the data already exists (WHEN MATCHED) or INSERT INTO the table if it does not (WHEN NOT MATCHED), but in the second case the insert should use one column selected from another table.
Here is what I have already tried:
MERGE
INTO table_2 WITH (HOLDLOCK) AS target
USING (SELECT
'42' AS person_id
,2 AS skill_id
) AS source
(person_id,skill_id )
ON (target.person_id = source.person_id
AND target.skill_id = source.skill_id)
WHEN MATCHED
THEN UPDATE
SET skill_lvl=4,already_have=0
WHEN NOT MATCHED
--section below doesn't work,because insert inside MERGE has to be without select (?)
THEN INSERT (person_id, skill_id, skill_lvl,already_have)
SELECT 42, id,3,1 FROM table_1;
The NOT MATCHED section gives me an error saying it expects VALUES or DEFAULT, but it seems tricky to combine a SELECT with VALUES or DEFAULT.
Edit_1
Insert query for table_1 (runs before the previous MERGE; both MERGE statements are within one loop):
MERGE
INTO table_1 WITH (HOLDLOCK) AS target
USING (SELECT
'skill_1' AS skill_name
) AS source
(skill_name)
ON (target.skill_name = source.skill_name)
WHEN NOT MATCHED
THEN INSERT (category_id,skill_name) values (0,'skill_1');
This query, run in the loop, compares skill names: if a name is not already in table_1, it inserts the value, then moves on to the next skill_name, and so on. IDs are generated automatically on insert.
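A sketch of one way around the restriction, under assumed schemas: because the INSERT branch of a MERGE only accepts VALUES or DEFAULT VALUES, the lookup against table_1 can move into the USING source, so the generated id arrives as a source column. The temp tables #table_1 and #table_2 and their column types are hypothetical stand-ins for the real tables.

```sql
-- Hypothetical stand-ins for table_1 and table_2.
CREATE TABLE #table_1 (id INT IDENTITY(1,1) PRIMARY KEY, category_id INT, skill_name VARCHAR(50));
CREATE TABLE #table_2 (person_id INT, skill_id INT, skill_lvl INT, already_have BIT);
INSERT INTO #table_1 (category_id, skill_name) VALUES (0, 'skill_1');

MERGE INTO #table_2 WITH (HOLDLOCK) AS target
USING (
    -- The SELECT that could not go into the INSERT branch lives here instead.
    SELECT 42 AS person_id, t1.id AS skill_id
    FROM #table_1 AS t1
    WHERE t1.skill_name = 'skill_1'
) AS source
    ON target.person_id = source.person_id
   AND target.skill_id = source.skill_id
WHEN MATCHED THEN
    UPDATE SET skill_lvl = 4, already_have = 0
WHEN NOT MATCHED THEN
    INSERT (person_id, skill_id, skill_lvl, already_have)
    VALUES (source.person_id, source.skill_id, 3, 1);

SELECT person_id, skill_id, skill_lvl, already_have FROM #table_2;

DROP TABLE #table_2, #table_1;
```

Running the MERGE a second time takes the WHEN MATCHED branch, since the row for that person and skill now exists.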

Splitting multiple fields by delimiter

I have to write an SP that can perform partial updates on our databases. The changes are stored in records of the PU table. A Values field contains all values, delimited by a fixed delimiter. A Table field refers to a Schemes table that contains the column names for each table, delimited in the same way, in a Columns field.
Now for my SP I need to split the Values field and the Columns field into a temp table of Column/Value pairs. This happens for each record in the PU table.
An example:
Our PU table looks something like this:
CREATE TABLE [dbo].[PU](
[Table] [nvarchar](50) NOT NULL,
[Values] [nvarchar](max) NOT NULL
)
Insert SQL for this example:
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','John Doe;26');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Jane Doe;22');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Mike Johnson;20');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Mary Jane;24');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','Mathematics');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','English');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','Geography');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus A;Schools Road 1;Educationville');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus B;Schools Road 31;Educationville');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus C;Schools Road 22;Educationville');
And we have a Schemes table similar to this:
CREATE TABLE [dbo].[Schemes](
[Table] [nvarchar](50) NOT NULL,
[Columns] [nvarchar](max) NOT NULL
)
Insert SQL for this example:
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Person','[Name];[Age]');
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Course','[Name]');
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Campus','[Name];[Address];[City]');
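As a result, the first record of the PU table should result in a temp table like:
Column|Value
[Name]|John Doe
[Age]|26
The 5th will have:
Column|Value
[Name]|Mathematics
Finally, the 8th PU record should result in:
Column|Value
[Name]|Campus A
[Address]|Schools Road 1
[City]|Educationville
You get the idea.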
I tried to use the following query to create the temp tables, but alas it fails when there's more than one value in the PU record:
DECLARE @Fields TABLE
(
[Column] INT,
[Value] VARCHAR(MAX)
)
INSERT INTO @Fields
SELECT TOP 1
(SELECT Value FROM STRING_SPLIT([dbo].[Schemes].[Columns], ';')),
(SELECT Value FROM STRING_SPLIT([dbo].[PU].[Values], ';'))
FROM [dbo].[PU] INNER JOIN [dbo].[Schemes] ON [dbo].[PU].[Table] = [dbo].[Schemes].[Table]
TOP 1 correctly gets the first PU record as each PU record is removed once processed.
The error is:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
In the case of a Person record, the splits are indeed returning 2 values/columns at a time. I just want to store the values in 2 records instead of getting an error.
Any help on rewriting the above query?
Also do note that the data is just generic nonsense. Being able to have 2 fields that both have delimited values, always equal in amount (e.g. a 'person' in the PU table will always have 2 delimited values in the field), and break them up in several column/header rows is the point of the question.
UPDATE: Working implementation
Based on the (accepted) answer of Sean Lange, I was able to work out the following implementation to overcome the issue:
As I need to reuse it, the combine column/value functionality is performed by a new function, declared as such:
CREATE FUNCTION [dbo].[JoinDelimitedColumnValue]
(@splitValues VARCHAR(8000), @splitColumns VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH MyValues AS
(
SELECT ColumnPosition = x.ItemNumber,
ColumnValue = x.Item
FROM dbo.DelimitedSplit8K(@splitValues, @pDelimiter) x
)
, ColumnData AS
(
SELECT ColumnPosition = x.ItemNumber,
ColumnName = x.Item
FROM dbo.DelimitedSplit8K(@splitColumns, @pDelimiter) x
)
SELECT cd.ColumnName,
v.ColumnValue
FROM MyValues v
JOIN ColumnData cd ON cd.ColumnPosition = v.ColumnPosition
;
In case of the above sample data, I'd call this function with the following SQL:
DECLARE @FieldValues VARCHAR(8000), @FieldColumns VARCHAR(8000)
SELECT TOP 1 @FieldValues=[dbo].[PU].[Values], @FieldColumns=[dbo].[Schemes].[Columns] FROM [dbo].[PU] INNER JOIN [dbo].[Schemes] ON [dbo].[PU].[Table] = [dbo].[Schemes].[Table]
INSERT INTO @Fields
SELECT [Column] = x.[ColumnName],[Value] = x.[ColumnValue] FROM [dbo].[JoinDelimitedColumnValue](@FieldValues, @FieldColumns, @Delimiter) x
This data structure makes this way more complicated than it should be. You can leverage the splitter from Jeff Moden here: http://www.sqlservercentral.com/articles/Tally+Table/72993/ The main difference between that splitter and all the others is that his returns the ordinal position of each element. Why all the other splitters don't do this is beyond me. For things like this it is needed. You have two sets of delimited data and you must ensure that they are both reassembled in the correct order.
The biggest issue I see is that you don't have anything in your main table to function as an anchor for ordering the results correctly. You need something, even an identity, to ensure the output rows stay "together". To accomplish this I just added an identity to the PU table.
alter table PU add RowOrder int identity not null
Now that we have an anchor this is still a little cumbersome for what should be a simple query but it is achievable.
Something like this will now work.
with MyValues as
(
select p.[Table]
, ColumnPosition = x.ItemNumber
, ColumnValue = x.Item
, RowOrder
from PU p
cross apply dbo.DelimitedSplit8K(p.[Values], ';') x
)
, ColumnData as
(
select ColumnName = replace(replace(x.Item, ']', ''), '[', '')
, ColumnPosition = x.ItemNumber
, s.[Table]
from Schemes s
cross apply dbo.DelimitedSplit8K(s.Columns, ';') x
)
select cd.[Table]
, v.ColumnValue
, cd.ColumnName
from MyValues v
join ColumnData cd on cd.[Table] = v.[Table]
and cd.ColumnPosition = v.ColumnPosition
order by v.RowOrder
, v.ColumnPosition
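As a hedged alternative sketch, assuming SQL Server 2022 or Azure SQL Database: the built-in STRING_SPLIT accepts an optional third argument (enable_ordinal) and then returns an ordinal column, which allows the same positional join without a custom splitter. The table variables below are hypothetical copies of PU (with the added RowOrder identity) and Schemes, holding a few rows of the sample data.

```sql
-- Hypothetical in-memory copies of PU and Schemes with part of the sample data.
DECLARE @PU TABLE (RowOrder INT IDENTITY(1,1), [Table] NVARCHAR(50), [Values] NVARCHAR(MAX));
DECLARE @Schemes TABLE ([Table] NVARCHAR(50), [Columns] NVARCHAR(MAX));

INSERT INTO @PU ([Table], [Values]) VALUES
    ('Person', 'John Doe;26'),
    ('Course', 'Mathematics'),
    ('Campus', 'Campus A;Schools Road 1;Educationville');
INSERT INTO @Schemes ([Table], [Columns]) VALUES
    ('Person', '[Name];[Age]'),
    ('Course', '[Name]'),
    ('Campus', '[Name];[Address];[City]');

-- Split both lists with ordinals enabled and pair the elements by position.
SELECT p.RowOrder,
       p.[Table],
       ColumnName  = c.value,
       ColumnValue = v.value
FROM @PU AS p
JOIN @Schemes AS s ON s.[Table] = p.[Table]
CROSS APPLY STRING_SPLIT(p.[Values], ';', 1) AS v
CROSS APPLY STRING_SPLIT(s.[Columns], ';', 1) AS c
WHERE c.ordinal = v.ordinal
ORDER BY p.RowOrder, v.ordinal;
```

On older versions the third argument is not available, and an ordinal-returning splitter such as DelimitedSplit8K is still needed.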
I recommend not storing values like this in the first place. I recommend having a key value in the tables and preferably not using Table and Columns as a composite key. I also recommend avoiding reserved words. I don't know what version of SQL you are using, so I am going to assume you are using a fairly recent version of Microsoft SQL Server that will support my provided stored procedure.
Here is an overview of the solution:
1) You need to convert both the PU and the Schema table into a table where you will have each "column" value in the list of columns isolated in their own row. If you can store the data in this format rather than the provided format, you will be a little better off.
What I mean is
Table|Columns
Person|Jane Doe;22
needs to be converted to
Table|Column|OrderInList
Person|Jane Doe|1
Person|22|2
There are multiple ways to do this, but I prefer an XML trick that I picked up. You can find multiple split string examples online, so I will not focus on that. Use whatever gives you the best performance. Unfortunately, you might not be able to get away from this table-valued function.
Update:
Thanks to Shnugo's performance enhancement comment, I have updated my xml splitter to give you the row number which reduces some of my code. I do the exact same thing to the Schema list.
2) Since the new Schema table and the new PU table now have the order each column appears, the PU table and the schema table can be joined on the "Table" and the OrderInList
CREATE FUNCTION [dbo].[fnSplitStrings_XML]
(
@List NVARCHAR(MAX),
@Delimiter VARCHAR(255)
)
RETURNS TABLE
AS
RETURN
(
SELECT y.i.value('(./text())[1]', 'nvarchar(4000)') AS Item,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) as RowNumber
FROM
(
SELECT CONVERT(XML, '<i>'
+ REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.') AS x
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
CREATE Procedure uspGetColumnValues
as
Begin
--Split each value in PU
select p.[Table],p.[Values],a.[Item],CHARINDEX(a.Item,p.[Values]) as LocationInStringForSorting,a.RowNumber
into #PuWithOrder
from PU p
cross apply [fnSplitStrings_XML](p.[Values],';') a --use whatever string split function is working best for you (performance wise)
--Split each value in Schema
select s.[Table],s.[Columns],a.[Item],CHARINDEX(a.Item,s.[Columns]) as LocationInStringForSorting,a.RowNumber
into #SchemaWithOrder
from Schemes s
cross apply [fnSplitStrings_XML](s.[Columns],';') a --use whatever string split function is working best for you (performance wise)
DECLARE @Fields TABLE --If this is an ETL process, maybe make this a permanent table with an auto incrementing Id and reference this table in all steps after this.
(
[Table] NVARCHAR(50),
[Columns] NVARCHAR(MAX),
[Column] VARCHAR(MAX),
[Value] VARCHAR(MAX),
OrderInList int
)
INSERT INTO @Fields([Table],[Columns],[Column],[Value],OrderInList)
Select pu.[Table],pu.[Values] as [Columns],s.Item as [Column],pu.Item as [Value],pu.RowNumber
from #PuWithOrder pu
join #SchemaWithOrder s on pu.[Table]=s.[Table] and pu.RowNumber=s.RowNumber
Select [Table],[Columns],[Column],[Value],OrderInList
from @Fields
order by [Table],[Columns],OrderInList
END
GO
EXEC uspGetColumnValues
GO
Update:
Since your working implementation is a table-valued function, I have another recommendation. The problem I see is that you're using a table-valued function that ultimately handles one record at a time. You are going to get better performance with set-based operations, batching as needed. With a table-valued function, you are likely going to be looping through each row. If this is some sort of ETL process, your team will be better off if you have a stored procedure that processes the rows in bulk. It might make sense to stage the results into a better table that your team can work with downstream rather than have them use a potentially slow table-valued function.

Error inserting comma separated values in Table: SQL Server 2008

I have two tables called 'ticket' and 'ticket_category'.
'Ticket' table has a column 'Ticket_Uid' and its type is 'UniqueIdentifier'.
Each 'ticket_uid' in 'ticket' table has one-to-many mappings in 'ticket_category' table.
E.g.
'Ticket' table:
Ticket_Uid
100
Ticket_Category:
Ticket_Uid Ticket_Category_Uid
100 ABC
100 DEF
100 XYZ
I want to create the following table named 'comment_mining':
Ticket_Uid Category_List
100 ABC,DEF,XYZ
The table has already been created using the following:
create table dbo.comment_mining
(
Ticket_UID [uniqueidentifier] NOT NULL,
time_created datetime,
time_closed datetime,
total_time_taken int,
category_list nvarchar(500),
comment_list varchar(8000)
);
I have already created this table and populated the 'Ticket_Uid' column.
For inserting into the 'category_list' column, I am using the following query:
insert into dbo.comment_mining(category_list)
SELECT
(SELECT distinct convert(varchar(500),category_id) + ',' AS [text()]
FROM ticket_category pr
WHERE pr.ticket_uid = p.ticket_uid
FOR XML PATH (''))
AS Category_list
FROM comment_mining p
When I run the above query, it gives me the following error:
Msg 515, Level 16, State 2, Line 1
Cannot insert the value NULL into column 'Ticket_UID', table 'Support_Saas.dbo.comment_mining'; column does not allow nulls. INSERT fails.
The statement has been terminated.
(which is strange as I am not even inserting in the 'Ticket_Uid' column)
When I run the same query without the insert statement, it executes perfectly. The query is as follows:
SELECT
(SELECT distinct convert(varchar(500),category_id) + ',' AS [text()]
FROM ticket_category pr
WHERE pr.ticket_uid = p.ticket_uid
FOR XML PATH (''))
AS Category_list
FROM comment_mining p
Yes there are some NULL values when the above query is run, but 'category_list' column in 'comment_mining' table can take NULL values. Why is the error on 'ticket_Uid' column?
Would someone please be able to explain why this is happening and what's the cure to this?
P.S. - I am new to SQL.
The reason you have the insert error on table comment_mining is because you set the Ticket_Uid column as not null; however, since it does not have a default value, the insert fails because whether you're inserting that field specifically or not, when a row is created, all columns must be filled in or be null.
You can do one of 2 things:
Change the structure of the comment_mining table to have a default value for Ticket_Uid (you can do this in the table designer or with code):
Example 1:
Alter Table comment_mining
Add Constraint DF_CommentMining_1 default NewID() for Ticket_UID
Make your insert explicitly include a generated uniqueidentifier (GUID) value by using the SQL NewID() function to populate the Ticket_UID UniqueIdentifier column
Example 2:
insert into dbo.comment_mining(Ticket_Uid, category_list)
SELECT NewID(),
[ your subquery ]...
In both cases, you're now satisfying the NOT NULL constraint on comment_mining.Ticket_UID, either by making it automatically populate itself, or by supplying a value.
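Since the question says the Ticket_Uid column is already populated, an UPDATE of the existing rows may be closer to the intent than inserting new ones. The following is only a sketch under that assumption; the table variables are hypothetical, cut-down stand-ins for ticket_category and comment_mining.

```sql
-- Hypothetical, cut-down stand-ins for the real tables.
DECLARE @ticket_category TABLE (ticket_uid UNIQUEIDENTIFIER, category_id VARCHAR(50));
DECLARE @comment_mining TABLE (Ticket_UID UNIQUEIDENTIFIER NOT NULL, category_list NVARCHAR(500));

DECLARE @ticket UNIQUEIDENTIFIER = NEWID();
INSERT INTO @comment_mining (Ticket_UID) VALUES (@ticket);
INSERT INTO @ticket_category (ticket_uid, category_id)
VALUES (@ticket, 'ABC'), (@ticket, 'DEF'), (@ticket, 'XYZ');

-- Fill category_list on the rows that already exist instead of adding rows.
UPDATE p
SET category_list = STUFF((
        SELECT DISTINCT ',' + CONVERT(VARCHAR(500), pr.category_id)
        FROM @ticket_category AS pr
        WHERE pr.ticket_uid = p.Ticket_UID
        FOR XML PATH('')), 1, 1, '')
FROM @comment_mining AS p;

SELECT Ticket_UID, category_list FROM @comment_mining;
```

STUFF removes the leading comma, and tickets with no categories end up with a NULL category_list, which the column allows.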
Try this:
;with cte as
(
select 100 Ticket_Uid,'ABC' Ticket_Category_Uid union all
select 100 , 'DEF' union all
select 100, 'XYZ'
)
select distinct b.Ticket_Uid,
stuff((select ','+a.Ticket_Category_Uid from cte a where a.Ticket_Uid=b.Ticket_Uid for xml path('')),1,1,'')
from cte b

Copy table rows using OUTPUT INTO in SQL Server 2005

I have a table which I need to copy records from back into itself. As part of that, I want to capture the new rows using an OUTPUT clause into a table variable so I can perform other operations on the rows as well in the same process. I want each row to contain its new key and the key it was copied from. Here's a contrived example:
INSERT
MyTable (myText1, myText2) -- myId is an IDENTITY column
OUTPUT
Inserted.myId,
Inserted.myText1,
Inserted.myText2
INTO
-- How do I get previousId into this table variable AND the newly inserted ID?
@MyTable
SELECT
-- MyTable.myId AS previousId,
MyTable.myText1,
MyTable.myText2
FROM
MyTable
WHERE
...
SQL Server barks if the number of columns on the INSERT doesn't match the number of columns from the SELECT statement. Because of that, I can see how this might work if I added a column to MyTable, but that isn't an option. Previously, this was implemented with a cursor which is causing a performance bottleneck -- I'm purposely trying to avoid that.
How do I copy these records while preserving the copied row's key in a way that will achieve the highest possible performance?
I'm a little unclear as to the context: is this in an AFTER INSERT trigger?
Anyway, I can't see any way to do this in a single call. The OUTPUT clause will only allow you to return rows that you have inserted. What I would recommend is as follows:
DECLARE @MyTable TABLE (
myID INT,
previousID INT,
myText1 VARCHAR(20),
myText2 VARCHAR(20)
)
INSERT @MyTable (previousID, myText1, myText2)
SELECT myID, myText1, myText2 FROM inserted
INSERT MyTable (myText1, myText2)
SELECT myText1, myText2 FROM inserted
-- @@IDENTITY now points to the last identity value inserted, so...
UPDATE m SET myID = i.newID
FROM @MyTable m, (SELECT @@IDENTITY - ROW_NUMBER() OVER(ORDER BY myID DESC) + 1 AS newID, myID FROM inserted) i
WHERE m.previousID = i.myID
...
Of course, you wouldn't put this into an AFTER INSERT trigger, because it will give you a recursive call, but you could do it in an INSTEAD OF INSERT trigger. I may be wrong on the recursive issue; I've always avoided the recursive call, so I've never actually found out. Using @@IDENTITY and ROW_NUMBER(), however, is a trick I've used several times in the past to do something similar.

What columns can be used in OUTPUT INTO clause?

I'm trying to build a mapping table to associate the IDs of new rows in a table with those that they're copied from. The OUTPUT INTO clause seems perfect for that, but it doesn't seem to behave according to the documentation.
My code:
DECLARE @Missing TABLE (SrcContentID INT PRIMARY KEY )
INSERT INTO @Missing
( SrcContentID )
SELECT cshadow.ContentID
FROM Private.Content AS cshadow
LEFT JOIN Private.Content AS cglobal ON cshadow.Tag = cglobal.Tag
WHERE cglobal.ContentID IS NULL
PRINT 'Adding new content headers'
DECLARE @Inserted TABLE (SrcContentID INT PRIMARY KEY, TgtContentID INT )
INSERT INTO Private.Content
( Tag, Description, ContentDate, DateActivate, DateDeactivate, SortOrder, CreatedOn, IsDeleted, ContentClassCode, ContentGroupID, OrgUnitID )
OUTPUT cglobal.ContentID, INSERTED.ContentID INTO @Inserted (SrcContentID, TgtContentID)
SELECT Tag, Description, ContentDate, DateActivate, DateDeactivate, SortOrder, CreatedOn, IsDeleted, ContentClassCode, ContentGroupID, NULL
FROM Private.Content AS cglobal
INNER JOIN @Missing AS m ON cglobal.ContentID = m.SrcContentID
Results in the error message:
Msg 207, Level 16, State 1, Line 34
Invalid column name 'SrcContentID'.
(line 34 being the one with the OUTPUT INTO)
Experimentation suggests that only columns that are actually present in the target of the INSERT can be selected in the OUTPUT INTO. But this contradicts the docs in Books Online. The article on the OUTPUT clause has an Example E that describes a similar usage:
The OUTPUT INTO clause returns values from the table being updated (WorkOrder) and also from the Product table. The Product table is used in the FROM clause to specify the rows to update.
Has anyone worked with this feature?
(In the meantime I've rewritten my code to do the job using a cursor loop, but that's ugly and I'm still curious)
You can do this with a MERGE in Sql Server 2008. Example code below:
--drop table A
create table A (a int primary key identity(1, 1))
insert into A default values
insert into A default values
delete from A where a>=3
-- insert two values into A and get the new primary keys
MERGE a USING (SELECT a FROM A) AS B(a)
ON (1 = 0) -- ignore the values, NOT MATCHED will always be true
WHEN NOT MATCHED THEN INSERT DEFAULT VALUES -- always insert here for this example
OUTPUT $action, inserted.*, deleted.*, B.a; -- show the new primary key and source data
Result is
INSERT, 3, NULL, 1
INSERT, 4, NULL, 2
i.e. for each row the new primary key (3, 4) and the old one (1, 2). Creating a table called e.g. #OUTPUT and adding " INTO #OUTPUT;" at the end of the OUTPUT clause would save the records.
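Applied to the mapping-table problem, the same idea might look like the sketch below. The temp table #Content and its Tag column are hypothetical stand-ins for Private.Content, and every existing row is copied for simplicity. The point is that the OUTPUT clause of a MERGE may reference source columns (src.ContentID) next to INSERTED columns.

```sql
-- Hypothetical stand-in for Private.Content.
CREATE TABLE #Content (ContentID INT IDENTITY(1,1) PRIMARY KEY, Tag VARCHAR(20));
INSERT INTO #Content (Tag) VALUES ('alpha'), ('beta');

DECLARE @Inserted TABLE (SrcContentID INT PRIMARY KEY, TgtContentID INT);

MERGE INTO #Content AS tgt
USING (SELECT ContentID, Tag FROM #Content) AS src
    ON 1 = 0  -- never matches, so every source row is inserted as a copy
WHEN NOT MATCHED THEN
    INSERT (Tag) VALUES (src.Tag)
OUTPUT src.ContentID, INSERTED.ContentID
    INTO @Inserted (SrcContentID, TgtContentID);

-- Each row pairs an original key with the key of its copy.
SELECT SrcContentID, TgtContentID FROM @Inserted ORDER BY SrcContentID;

DROP TABLE #Content;
```

To copy only some rows, the USING subquery can be filtered, for example by joining it to the table of missing IDs.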
I've verified that the problem is that you can only use INSERTED columns. The documentation seems to indicate that you can use from_table_name, but I can't seem to get it to work (The multi-part identifier "m.ContentID" could not be bound.):
TRUNCATE TABLE main
SELECT *
FROM incoming
SELECT *
FROM main
DECLARE @Missing TABLE (ContentID INT PRIMARY KEY)
INSERT INTO @Missing(ContentID)
SELECT incoming.ContentID
FROM incoming
LEFT JOIN main
ON main.ContentID = incoming.ContentID
WHERE main.ContentID IS NULL
SELECT *
FROM @Missing
DECLARE @Inserted TABLE (ContentID INT PRIMARY KEY, [Content] varchar(50))
INSERT INTO main(ContentID, [Content])
OUTPUT INSERTED.ContentID /* incoming doesn't work, m doesn't work */, INSERTED.[Content] INTO @Inserted (ContentID, [Content])
SELECT incoming.ContentID, incoming.[Content]
FROM incoming
INNER JOIN @Missing AS m
ON m.ContentID = incoming.ContentID
SELECT *
FROM @Inserted
SELECT *
FROM incoming
SELECT *
FROM main
Apparently the from_table_name prefix is only allowed on DELETE or UPDATE (or MERGE in 2008) - I'm not sure why:
from_table_name
Is a column prefix that specifies a table included in the FROM clause of a DELETE or UPDATE statement that is used to specify the rows to update or delete.
If the table being modified is also specified in the FROM clause, any reference to columns in that table must be qualified with the INSERTED or DELETED prefix.
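For contrast, here is a minimal sketch of the case the documentation does allow: an UPDATE whose OUTPUT clause references a column from another table in the FROM clause. The table variables and sample rows are hypothetical, loosely modeled on the main and incoming tables above.

```sql
-- Hypothetical tables modeled on main and incoming.
DECLARE @main TABLE (ContentID INT PRIMARY KEY, [Content] VARCHAR(50));
DECLARE @incoming TABLE (ContentID INT PRIMARY KEY, [Content] VARCHAR(50));
DECLARE @Changes TABLE (ContentID INT, OldContent VARCHAR(50), IncomingContent VARCHAR(50));

INSERT INTO @main (ContentID, [Content]) VALUES (1, 'old one'), (2, 'old two');
INSERT INTO @incoming (ContentID, [Content]) VALUES (1, 'new one'), (2, 'new two');

UPDATE m
SET [Content] = i.[Content]
-- i.[Content] comes from the FROM clause, which is accepted for UPDATE and DELETE.
OUTPUT INSERTED.ContentID, DELETED.[Content], i.[Content]
    INTO @Changes (ContentID, OldContent, IncomingContent)
FROM @main AS m
INNER JOIN @incoming AS i ON i.ContentID = m.ContentID;

SELECT ContentID, OldContent, IncomingContent FROM @Changes ORDER BY ContentID;
```

The same reference to i.[Content] in the OUTPUT clause of an INSERT ... SELECT is rejected, which matches the behaviour described in this thread.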
I'm running into EXACTLY the same problem as you are, I feel your pain...
As far as I've been able to find out there's no way to use the from_table_name prefix with an INSERT statement.
I'm sure there's a viable technical reason for this, and I'd love to know exactly what it is.
Ok, found it, here's a forum post on why it doesn't work:
MSDN forums
I think I found a solution to this problem, it sadly involves a temporary table, but at least it'll prevent the creation of a dreaded cursor :)
What you need to do is add an extra column to the table you're duplicating records from and give it a 'uniqueidentifier' type.
Then declare a temporary table:
DECLARE @tmptable TABLE (uniqueid uniqueidentifier, original_id int, new_id int)
Insert the data into your temp table like this:
insert into @tmptable
(uniqueid,original_id,new_id)
select NewId(),id,0 from OriginalTable
Then go ahead and do the real insert into the original table:
insert into OriginalTable
(uniqueid)
select uniqueid from @tmptable
Now to add the newly created identity values to your temp table:
update @tmptable
set new_id = o.id
from OriginalTable o inner join @tmptable tmp on tmp.uniqueid = o.uniqueid
Now you have a lookup table that holds the new id and original id in one record, for your using pleasure :)
I hope this helps somebody...
(MS) If the table being modified is also specified in the FROM clause, any reference to columns in that table must be qualified with the INSERTED or DELETED prefix.
In your example, you can't use cglobal table in the OUTPUT unless it's INSERTED.column_name or DELETED.column_name:
INSERT INTO Private.Content
(Tag)
OUTPUT cglobal.ContentID, INSERTED.ContentID
INTO @Inserted (SrcContentID, TgtContentID)
SELECT Tag
FROM Private.Content AS cglobal
INNER JOIN @Missing AS m ON cglobal.ContentID = m.SrcContentID
What worked for me was a simple alias table, like this:
INSERT INTO con1
(Tag)
OUTPUT **con2**.ContentID, INSERTED.ContentID
INTO @Inserted (SrcContentID, TgtContentID)
SELECT Tag
FROM Private.Content con1
**INNER JOIN Private.Content con2 ON con1.id=con2.id**
INNER JOIN @Missing AS m ON con1.ContentID = m.SrcContentID
