I have a set of data that needs to result in a new row in a table. Once this row is created, I need to attach metadata in separate tables related to this information. That is, I need to create my [Identity] first, get the GlobalId back from the inserted row, and then attach [Accounts] and [Metadata] to it.
Inserting data and getting the Id of the inserted row is easy enough (see query below), but I'm stumped as to how to get the personnumber, firstname, and lastname inserted into this temporary table as well so I can continue with inserting the related data.
DECLARE @temp AS TABLE(
[GlobalId] BIGINT
,[Personnumber] NVARCHAR(100)
,[Firstname] NVARCHAR(100)
,[Lastname] NVARCHAR(100)
);
;WITH person AS
(
SELECT top 1
t.[Personnumber]
,t.[Firstname]
,t.[Lastname]
FROM [temp].[RawRoles] t
WHERE t.Personnumber NOT IN
(
SELECT i.Account FROM [security].[Accounts] i
)
)
INSERT INTO [security].[Identities] ([Created], [Updated])
-- how do I get the real related values here instead of my hard-coded strings?
OUTPUT inserted.GlobalId, 'personnumber', 'firstname', 'lastname' INTO @temp
SELECT GETUTCDATE(), GETUTCDATE()
FROM person
P.S. Backstory.
Identities, for me, is just a holder of a global Id that we will use instead of actual personal numbers (the equivalent of social security numbers) in other systems. This way only one location holds the sensitive numbers, and multiple account identifiers, such as a social security number or AD account, can be related to the same global Id.
P.P.S. I would prefer to avoid cursors, as the query is going to be moving almost 2 million records on the first run, and several thousand on a daily basis.
@PeterHe gave me an idea of how to solve this with MERGE.
Got it working as follows. When all rows have been inserted, I can query @temp to continue with the rest of the inserts.
DECLARE @temp AS TABLE(
[action] NVARCHAR(20)
,[GlobalId] BIGINT
,[Personnumber] NVARCHAR(100)
,[Firstname] NVARCHAR(100)
,[Lastname] NVARCHAR(100)
);
;WITH person AS
(
SELECT top 1
t.[Personnumber]
,t.[Firstname]
,t.[Lastname]
FROM [temp].[RawRoles] t
WHERE t.Personnumber NOT IN
(
SELECT i.Account FROM [security].[Accounts] i
)
)
MERGE [security].[Identities] AS tar
USING person AS src
ON 0 = 1 -- all rows from src need to be inserted; I've already filtered them out in the CTE above.
WHEN NOT MATCHED THEN
INSERT
(
[Created], [Updated]
)
VALUES
(
GETUTCDATE(), GETUTCDATE()
)
OUTPUT $action, inserted.GlobalId, src.[Personnumber], src.[Firstname], src.[Lastname] INTO @temp;
SELECT * FROM @temp
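For completeness, the related inserts can then run off @temp. A minimal sketch of the next step follows; note that the [IdentityId] column name in [security].[Accounts] is an assumption, only [Account] appears in the question:
-- Hedged sketch: attach the account rows to the freshly created identities.
-- [IdentityId] is an assumed column name; [Account] is the column used in the query above.
INSERT INTO [security].[Accounts] ([IdentityId], [Account])
SELECT t.[GlobalId], t.[Personnumber]
FROM @temp t
WHERE t.[action] = 'INSERT';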
I have to write a stored procedure that can perform partial updates on our databases; the changes are stored as records in the PU table. A Values field contains all the values, delimited by a fixed delimiter. A Table field refers to a Schemes table, which holds the column names for each table in a similarly delimited Columns field.
Now, for my SP, I need to split the Values field and Columns field into a temp table of Column/Value pairs; this happens for each record in the PU table.
An example:
Our PU table looks something like this:
CREATE TABLE [dbo].[PU](
[Table] [nvarchar](50) NOT NULL,
[Values] [nvarchar](max) NOT NULL
)
Insert SQL for this example:
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','John Doe;26');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Jane Doe;22');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Mike Johnson;20');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Mary Jane;24');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','Mathematics');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','English');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','Geography');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus A;Schools Road 1;Educationville');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus B;Schools Road 31;Educationville');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus C;Schools Road 22;Educationville');
And we have a Schemes table similar to this:
CREATE TABLE [dbo].[Schemes](
[Table] [nvarchar](50) NOT NULL,
[Columns] [nvarchar](max) NOT NULL
)
Insert SQL for this example:
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Person','[Name];[Age]');
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Course','[Name]');
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Campus','[Name];[Address];[City]');
As a result, the first record of the PU table should result in a temp table like:
Column|Value
[Name]|John Doe
[Age]|26
The 5th will have:
Column|Value
[Name]|Mathematics
Finally, the 8th PU record should result in:
Column|Value
[Name]|Campus A
[Address]|Schools Road 1
[City]|Educationville
You get the idea.
I tried to use the following query to create the temp tables, but alas it fails when there's more than one value in the PU record:
DECLARE @Fields TABLE
(
[Column] INT,
[Value] VARCHAR(MAX)
)
INSERT INTO @Fields
SELECT TOP 1
(SELECT Value FROM STRING_SPLIT([dbo].[Schemes].[Columns], ';')),
(SELECT Value FROM STRING_SPLIT([dbo].[PU].[Values], ';'))
FROM [dbo].[PU] INNER JOIN [dbo].[Schemes] ON [dbo].[PU].[Table] = [dbo].[Schemes].[Table]
TOP 1 correctly gets the first PU record as each PU record is removed once processed.
The error is:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
In the case of a Person record, the splits indeed return 2 values/columns at a time; I just want to store the values as 2 rows instead of getting an error.
Any help on rewriting the above query?
Also note that the data is just generic nonsense. The point of the question is being able to take 2 fields that both contain delimited values, always equal in number (e.g. a 'Person' in the PU table will always have 2 delimited values in the field), and break them up into several column/value rows.
UPDATE: Working implementation
Based on the (accepted) answer from Sean Lange, I was able to work out the following implementation to overcome the issue:
As I need to reuse it, the column/value combining is performed by a new function, declared as follows:
CREATE FUNCTION [dbo].[JoinDelimitedColumnValue]
(@splitValues VARCHAR(8000), @splitColumns VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH MyValues AS
(
SELECT ColumnPosition = x.ItemNumber,
ColumnValue = x.Item
FROM dbo.DelimitedSplit8K(@splitValues, @pDelimiter) x
)
, ColumnData AS
(
SELECT ColumnPosition = x.ItemNumber,
ColumnName = x.Item
FROM dbo.DelimitedSplit8K(@splitColumns, @pDelimiter) x
)
SELECT cd.ColumnName,
v.ColumnValue
FROM MyValues v
JOIN ColumnData cd ON cd.ColumnPosition = v.ColumnPosition
;
In case of the above sample data, I'd call this function with the following SQL:
DECLARE @FieldValues VARCHAR(8000), @FieldColumns VARCHAR(8000)
SELECT TOP 1 @FieldValues=[dbo].[PU].[Values], @FieldColumns=[dbo].[Schemes].[Columns] FROM [dbo].[PU] INNER JOIN [dbo].[Schemes] ON [dbo].[PU].[Table] = [dbo].[Schemes].[Table]
INSERT INTO @Fields
SELECT [Column] = x.[ColumnName],[Value] = x.[ColumnValue] FROM [dbo].[JoinDelimitedColumnValue](@FieldValues, @FieldColumns, @Delimiter) x
This data structure makes this way more complicated than it should be. You can leverage the splitter from Jeff Moden here: http://www.sqlservercentral.com/articles/Tally+Table/72993/. The main difference between that splitter and most others is that his returns the ordinal position of each element. Why the other splitters don't do this is beyond me; for things like this it is needed. You have two sets of delimited data, and you must ensure that both are reassembled in the correct order.
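As an aside, DelimitedSplit8K has to be installed separately and is limited to 8000 characters of input. If you are on SQL Server 2016 or later, a minimal ordinal-aware splitter can also be sketched with OPENJSON; the function name SplitWithOrdinal below is hypothetical, and whichever splitter you use, the important part is that it returns each element's position.
-- Hypothetical alternative splitter (SQL Server 2016+, compatibility level 130+):
-- OPENJSON returns the zero-based array index in [key], which supplies the ordinal position.
CREATE FUNCTION dbo.SplitWithOrdinal (@List NVARCHAR(MAX), @Delimiter NCHAR(1))
RETURNS TABLE
AS
RETURN
    SELECT ItemNumber = CAST(j.[key] AS INT) + 1,
           Item       = j.[value]
    FROM OPENJSON(N'["' + REPLACE(STRING_ESCAPE(@List, 'json'), @Delimiter, N'","') + N'"]') AS j;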
The biggest issue I see is that you don't have anything in your main table to function as an anchor for ordering the results correctly. You need something, even an identity, to ensure the output rows stay "together". To accomplish this, I just added an identity to the PU table.
alter table PU add RowOrder int identity not null
Now that we have an anchor, this is still a little cumbersome for what should be a simple query, but it is achievable.
Something like this will now work.
with MyValues as
(
select p.[Table]
, ColumnPosition = x.ItemNumber
, ColumnValue = x.Item
, RowOrder
from PU p
cross apply dbo.DelimitedSplit8K(p.[Values], ';') x
)
, ColumnData as
(
select ColumnName = replace(replace(x.Item, ']', ''), '[', '')
, ColumnPosition = x.ItemNumber
, s.[Table]
from Schemes s
cross apply dbo.DelimitedSplit8K(s.Columns, ';') x
)
select cd.[Table]
, v.ColumnValue
, cd.ColumnName
from MyValues v
join ColumnData cd on cd.[Table] = v.[Table]
and cd.ColumnPosition = v.ColumnPosition
order by v.RowOrder
, v.ColumnPosition
I recommend not storing values like this in the first place. I recommend having a key value in the tables, and preferably not using Table and Columns as a composite key. I also recommend avoiding reserved words. I don't know what version of SQL Server you are using, so I am going to assume a fairly recent version of Microsoft SQL Server that will support the stored procedure provided below.
Here is an overview of the solution:
1) You need to convert both the PU and the Schema table into a table where each "column" value in the list of columns is isolated in its own row. If you can store the data in this format rather than the provided format, you will be a little better off.
What I mean is
Table|Columns
Person|Jane Doe;22
needs converted to
Table|Column|OrderInList
Person|Jane Doe|1
Person|22|2
There are multiple ways to do this, but I prefer an XML trick that I picked up. You can find multiple string-split examples online, so I will not focus on that; use whatever gives you the best performance. Unfortunately, you might not be able to get away from needing a table-valued function here.
Update:
Thanks to Shnugo's performance enhancement comment, I have updated my XML splitter to return the row number, which reduces some of my code. I do the exact same thing to the Schema list.
2) Since the new Schema table and the new PU table now record the order in which each column appears, the two can be joined on "Table" and OrderInList.
CREATE FUNCTION [dbo].[fnSplitStrings_XML]
(
@List NVARCHAR(MAX),
@Delimiter VARCHAR(255)
)
RETURNS TABLE
AS
RETURN
(
SELECT y.i.value('(./text())[1]', 'nvarchar(4000)') AS Item,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) as RowNumber
FROM
(
SELECT CONVERT(XML, '<i>'
+ REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.') AS x
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
CREATE Procedure uspGetColumnValues
as
Begin
--Split each value in PU
select p.[Table],p.[Values],a.[Item],CHARINDEX(a.Item,p.[Values]) as LocationInStringForSorting,a.RowNumber
into #PuWithOrder
from PU p
cross apply [fnSplitStrings_XML](p.[Values],';') a --use whatever string split function is working best for you (performance wise)
--Split each value in Schema
select s.[Table],s.[Columns],a.[Item],CHARINDEX(a.Item,s.[Columns]) as LocationInStringForSorting,a.RowNumber
into #SchemaWithOrder
from Schemes s
cross apply [fnSplitStrings_XML](s.[Columns],';') a --use whatever string split function is working best for you (performance wise)
DECLARE @Fields TABLE --If this is an ETL process, maybe make this a permanent table with an auto incrementing Id and reference this table in all steps after this.
(
[Table] NVARCHAR(50),
[Columns] NVARCHAR(MAX),
[Column] VARCHAR(MAX),
[Value] VARCHAR(MAX),
OrderInList int
)
INSERT INTO @Fields([Table],[Columns],[Column],[Value],OrderInList)
Select pu.[Table],pu.[Values] as [Columns],s.Item as [Column],pu.Item as [Value],pu.RowNumber
from #PuWithOrder pu
join #SchemaWithOrder s on pu.[Table]=s.[Table] and pu.RowNumber=s.RowNumber
Select [Table],[Columns],[Column],[Value],OrderInList
from @Fields
order by [Table],[Columns],OrderInList
END
GO
EXEC uspGetColumnValues
GO
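For reference, calling the splitter on one of the sample values should return each item with its position:
SELECT Item, RowNumber
FROM dbo.fnSplitStrings_XML(N'Campus A;Schools Road 1;Educationville', ';');
-- Expected output (roughly):
-- Item              RowNumber
-- Campus A          1
-- Schools Road 1    2
-- Educationville    3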
Update:
Since your working implementation is a table-valued function, I have another recommendation. The problem I see is that you're using a table-valued function, which ultimately handles one record at a time. You are going to get better performance with set-based operations and batching as needed; with a table-valued function you are likely going to be looping through each row. If this is some sort of ETL process, your team will be better off if you have a stored procedure that processes the rows in bulk. It might make sense to stage the results into a better table that your team can work with downstream, rather than have them use a potentially slow table-valued function. A rough batching sketch follows.
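-- Rough batching sketch; dbo.FieldStaging and dbo.Fields are assumed names, not objects
-- from the question. The idea: split everything into a staging table once (set-based),
-- then copy it to the destination in fixed-size chunks so each transaction stays small.
DECLARE @BatchSize INT = 10000,
        @LastId    INT = 0,
        @MaxId     INT;

SELECT @MaxId = ISNULL(MAX(Id), 0) FROM dbo.FieldStaging;  -- assumed staging table with an identity column Id

WHILE @LastId < @MaxId
BEGIN
    INSERT INTO dbo.Fields ([Table], [Column], [Value], OrderInList)
    SELECT [Table], [Column], [Value], OrderInList
    FROM dbo.FieldStaging
    WHERE Id > @LastId
      AND Id <= @LastId + @BatchSize;

    SET @LastId = @LastId + @BatchSize;
END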
For example, I have 2 tables that I need for my query: Property, and Move (a history of moving properties).
I must create a query that returns all properties plus 1 additional boolean column, IsInService, which will be true whenever the Move table has a record for the property with DateTo = NULL and MoveTypeID = 1 ("In service").
I have created this query:
SELECT
[ID], [Name],
(SELECT COUNT(*)
FROM [Move]
WHERE PropertyID = p.ID
AND DateTo IS NULL
AND MoveTypeID = 1) AS IsInService
FROM
[Property] as p
ORDER BY
[Name] ASC
OFFSET 100500 ROWS FETCH NEXT 50 ROWS ONLY;
I'm not so strong in SQL, but as far as I know, subqueries are evil :)
How do I create a high-performance SQL query in my case, given that these tables are expected to contain millions of records?
I've updated the code based on your comment. If you need something else, please provide the expected input and output data. This is about all I can do based on inference from the existing comments. Further, this isn't intended to give you an exact working solution; my intention was to give you a prototype from which you can build your solution.
That said:
The code below is the basic join that you need. However, keep in mind that indexing is probably going to play as big a part in performance as the structure of the table and the query. It doesn't matter how you query the tables if the indexes aren't there to support the queries once you reach a certain size. There are a lot of resources online for indexing, but viewing query plans should be at the top of your list.
As a note, your column [dbo].[Property] ([Name]) should probably be NVARCHAR to allow SQL to minimize data storage. Indexes on that column will then be smaller and searches/updates faster.
DECLARE @Property AS TABLE
(
[ID] INT
, [Name] NVARCHAR(100)
);
INSERT INTO @Property
([ID]
, [Name])
VALUES (1,N'A'),
(2,N'B'),
(3,N'C');
DECLARE @Move AS TABLE
(
[ID] INT
, [DateTo] DATE
, [MoveTypeID] INT
, [PropertyID] INT
);
INSERT INTO @Move
([ID]
, [DateTo]
, [MoveTypeID]
, [PropertyID])
VALUES (1,NULL,1,1),
(2,NULL,1,2),
(3,N'2017-12-07',1,2);
SELECT [Property].[ID] AS [property_id]
, [Property].[Name] AS [property_name]
, CASE
WHEN [Move].[DateTo] IS NULL
AND [Move].[MoveTypeID] = 1 THEN
N'true'
ELSE
N'false'
END AS [in_service]
FROM @Property AS [Property]
LEFT JOIN @Move AS [Move]
ON [Move].[PropertyID] = [Property].[ID]
WHERE [Move].[DateTo] IS NULL
AND [Move].[MoveTypeID] = 1;
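If the requirement is still to return every property with a true/false flag (as asked in the question), an EXISTS variant against the same sample tables is worth comparing in the query plan; this is only a sketch, not a drop-in replacement for the join above:
-- Alternative sketch: keep all properties and derive the flag with EXISTS, which also
-- avoids duplicate rows when a property has several matching Move records.
SELECT p.[ID]   AS [property_id]
     , p.[Name] AS [property_name]
     , CASE WHEN EXISTS (SELECT 1
                         FROM @Move AS m
                         WHERE m.[PropertyID] = p.[ID]
                           AND m.[DateTo] IS NULL
                           AND m.[MoveTypeID] = 1)
            THEN N'true' ELSE N'false'
       END AS [in_service]
FROM @Property AS p
ORDER BY p.[Name];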
I am updating and inserting bulk records from a web API (in the form of a data table) into my table using the MERGE statement. The insert and update work fine, but when I send a corrupt value, the CATCH block does not handle it. Here is my code:
-- Creating a type to receive the data table from the web API (have to go with this approach only)
CREATE TYPE [dbo].[CustomerType] AS TABLE(
[Id] [int] NULL,
[Name] [nvarchar](100) NULL,
[Country] [nvarchar](50) NULL,
[Date] [datetime] NULL
)
-- Stored proc for update and insert using MERGE
CREATE PROCEDURE Update_Customers
@tblCustomers CustomerType READONLY
AS
BEGIN
BEGIN TRY
MERGE INTO Customers c1
USING @tblCustomers c2
ON c1.CustomerId=c2.Id
WHEN MATCHED THEN
UPDATE SET c1.Name = c2.Name
,c1.Country = c2.Country,
c1.date =c2.date
WHEN NOT MATCHED THEN
INSERT VALUES(c2.Id, c2.Name,c2.date, c2.Country);
END TRY
BEGIN CATCH
-- My table for logging the error
INSERT INTO ERROR_LOG(ERROR_LINE,ERROR_MESSAGE,PROC_NAME)
VALUES (ERROR_LINE(),ERROR_MESSAGE(),ERROR_PROCEDURE())
END CATCH
END
Thanks in advance
The problem is that SQL Server does not treat errors the way you think it does.
SQL Server maintains an ACID state.
Atomic: All or nothing. All work is broken into transactions, which must succeed in full or the entire modification/creation is rolled back to the previous working state. You can be explicit about which transaction a piece of work belongs to by using BEGIN TRANSACTION/COMMIT TRANSACTION, or undo it with ROLLBACK TRANSACTION. Read more: Transaction Statements and ROLLBACK TRANSACTION
Consistent: Every transaction must leave SQL Server in a valid state. For example, while DROP TABLE MyTable; may be a valid transaction, if MyTable has dependencies (i.e. foreign key constraints), then SQL Server would be left in an inconsistent state, so it rolls the transaction back to the last consistent state.
Isolated: Every transaction occurs in its own time and space; serialized, to be specific. Isolation allows multiple, even similar, statements to be issued at the same time on the server and processed independently of each other. The term blocking refers to a statement waiting for another transaction to commit, and a deadlock is when two transactions are each waiting on the other indefinitely.
Durable: Unlike a program that lives in memory and can be lost by a sudden loss of power, SQL Server's transactions are permanent on disk once committed. This provides finality to a statement. This is also where the log file comes into play, since it records the transactions performed on the database.
READ MORE: ACID PROPERTIES
I mention all of this since your BEGIN TRY/CATCH block looks for these issues.
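A minimal sketch of that pattern, assuming you wrap the MERGE in an explicit transaction so the CATCH block can roll back the whole unit of work before logging (the MERGE body is elided; ERROR_LOG columns are taken from your procedure):
BEGIN TRY
    BEGIN TRANSACTION;
    -- MERGE INTO Customers ... ;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    INSERT INTO ERROR_LOG (ERROR_LINE, ERROR_MESSAGE, PROC_NAME)
    VALUES (ERROR_LINE(), ERROR_MESSAGE(), ERROR_PROCEDURE());
END CATCH;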
Treat your tables as sets of data
Recall that SQL is based on relational set theory. SQL is at its best when it can perform actions on sets of data/objects. It is completely incompatible with inheritance, and while cursor-based logic is possible, it is inefficient at best.
So in your ETL process, treat the data as a whole set.
Instead, by treating your data as a whole set, you can transform the minor misspellings/errors from the user/web interface and isolate the actual datatype errors that you wish to catalogue, using predicate clauses (WHERE/HAVING/ON).
An example of what you could do is similar to the following:
CREATE TABLE #ETLTable(ID INT NOT NULL
, Name VARCHAR(50) NOT NULL
, Comment VARCHAR(50) NULL
, Country VARCHAR(50) NOT NULL
, Dated VARCHAR(50)); --implicitly declared as allowing NULL values unless the type denies it
CREATE TABLE #MyTable (ID INT NOT NULL
, Name VARCHAR(50) NOT NULL
, Comment VARCHAR(50) NULL
, Country VARCHAR(50) NOT NULL
, Dated DATE);
CREATE TABLE #ERROR_TABLE (ObjectID INT NULL
, encrypted INT
, text VARCHAR(MAX)
, start_time DATETIME2
, Table_ID INT
, Table_Name VARCHAR(50)
, Table_Country VARCHAR(50)
, Table_Dated VARCHAR(20));
CREATE TABLE #ACTIONS ( [Action] VARCHAR(50)
, [inserted_ID] int
, inserted_Name VARCHAR(50)
, inserted_Country VARCHAR(50)
, inserted_Date Date
, deleted_ID int
, deleted_Name VARCHAR(50)
, deleted_Country VARCHAR(50)
, deleted_Date Date)
INSERT INTO #MyTable (ID, Name, Country, Dated)
VALUES (1, 'Mary', 'USA', '12/23/12')
, (2, 'Julio', 'Mexico', '12/25/12')
, (3, 'Marx', 'USA', '11/11/12')
, (4, 'Ann', 'USA', '11/27/12');
INSERT INTO #ETLTable(ID, Name, Country, Comment, Dated)
VALUES (1,'Mary', 'USA', 'Valid Date', '12/23/12')
, (2,'Julio', 'Mexico', 'Invalid Date', '12-25,12')
, (3,'Marx', 'USA', 'Valid but incorrect Date', '12-11/25') --this actually means YY-MM-DD
, (4,'Ann','USA', 'Not Matching Date', '12-23-12')
, (5, 'Hillary', 'USA', 'New Entry', '11/24/12');
/* SQL Server is fairly flexible with datatype entries, so be explicit.
Note the 'Valid but incorrect Date' and 'Invalid Date' rows above. The invalid one will fail your code since the target column is of datatype DATE. Casting is implicit anyway and should not be depended on in important queries. Theoretically, you could catch this with your TRY/CATCH block, but the entire MERGE statement would be rolled back... an expensive and probably unacceptable cost.
You should proof your insertions for errors before you start expensive transactions like the MERGE statement. In this example, I knew what dates were being entered and that only the Japanese date style (style 11) might also appear. Dates are very finicky (DATE has no format) and are best handled outside the merge statement altogether. */
;WITH CTE AS (
SELECT ID, Name, Country, Comment, ISNULL(TRY_CAST(Dated AS Date), CAST(CONVERT(datetime, Dated, 11) AS DATE) ) AS Dated --TRY_CAST returns a NULL if it cannot succeed and will not fail your query.
FROM #ETLTable
WHERE ISDATE(Dated) = 1
)
MERGE INTO #MyTable TGT
USING CTE SRC ON TGT.ID = SRC.ID
AND TGT.Name = SRC.Name
WHEN MATCHED AND SRC.Dated > TGT.Dated
THEN UPDATE SET TGT.Dated = SRC.Dated
, TGT.Comment = SRC.Comment
WHEN NOT MATCHED BY TARGET
THEN INSERT(ID, Name, Country, Comment, Dated) VALUES (SRC.ID, SRC.Name, SRC.Country, SRC.Comment, SRC.DATED)
OUTPUT $action AS [Action]
, inserted.ID
, inserted.Name
, inserted.Country
, inserted.Dated
, deleted.ID
, deleted.Name
, deleted.Country
, deleted.Dated
INTO #Actions;
/* Note, you would have to run this query separately, as it only records active transactions. */
--CREATE PROC MyProject
--AS BEGIN
--WAITFOR DELAY '00:00:10'
--Print 'ME'
--END
;WITH CTE AS (
SELECT t.objectid, encrypted, text, start_time
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
INNER JOIN (SELECT object_id FROM sys.procedures
WHERE object_id = OBJECT_ID('MyProject') ) B ON t.objectid = B.object_id )
INSERT INTO #ERROR_TABLE (ObjectID, encrypted, text, start_time, Table_ID, Table_Name, Table_Country, Table_Dated)
SELECT CTE.objectid, CTE.encrypted, CTE.text, CTE.start_time, B.Table_ID, B.Table_Name, B.Table_Country, B.Table_Dated
FROM CTE
RIGHT OUTER JOIN ( SELECT ID AS Table_ID
, Name AS Table_Name
, Country AS Table_Country
, Dated AS Table_Dated
FROM #ETLTable
WHERE ISDATE(Dated) = 0) B ON 1 = 1
SELECT * FROM #ERROR_TABLE
SELECT * FROM #Actions
SELECT * FROM #MyTable
Note that the script does not break anything in SQL Server and in fact separates out the bad inputs BEFORE the MERGE statement.
Notice the useful information in the #ERROR_TABLE. You can actually use this information and make recommendations from it live. This is your goal in error catching.
As a reminder, understand that RAISERROR or TRY/CATCH only works on entire transactions... a unit of work. If you need to gracefully allow a MERGE statement to fail, fine. But understand that proper ETL will ensure such expensive statements never fail.
So please, no matter what, perform all your ETL processes BEFORE you write to your final tables. This is the point of staging tables... so you can filter out illegal values or misspellings in your bulk tables.
To do anything less is to be lazy and unprofessional. Learn the right habits and they will help save you from future disaster.
I have two tables called 'ticket' and 'ticket_category'.
'Ticket' table has a column 'Ticket_Uid' and its type is 'UniqueIdentifier'.
Each 'ticket_uid' in 'ticket' table has one-to-many mappings in 'ticket_category' table.
E.g.
'Ticket' table:
Ticket_Uid
100
Ticket_Category:
Ticket_Uid Ticket_Category_Uid
100 ABC
100 DEF
100 XYZ
I want to create the following table named 'comment_mining':
Ticket_Uid Category_List
100 ABC,DEF,XYZ
The table has already been created using the following:
create table dbo.comment_mining
(
Ticket_UID [uniqueidentifier] NOT NULL,
time_created datetime,
time_closed datetime,
total_time_taken int,
category_list nvarchar(500),
comment_list varchar(8000)
);
I have already created this table and populated the 'Ticket_Uid' column.
For inserting into the 'category_list' column, I am using the following query:
insert into dbo.comment_mining(category_list)
SELECT
(SELECT distinct convert(varchar(500),category_id) + ',' AS [text()]
FROM ticket_category pr
WHERE pr.ticket_uid = p.ticket_uid
FOR XML PATH (''))
AS Category_list
FROM comment_mining p
When I run the above query, it gives me the following error:
Msg 515, Level 16, State 2, Line 1
Cannot insert the value NULL into column 'Ticket_UID', table 'Support_Saas.dbo.comment_mining'; column does not allow nulls. INSERT fails.
The statement has been terminated.
(which is strange, as I am not even inserting into the 'Ticket_Uid' column)
When I run the same query without the insert statement, it executes perfectly. The query is as follows:
SELECT
(SELECT distinct convert(varchar(500),category_id) + ',' AS [text()]
FROM ticket_category pr
WHERE pr.ticket_uid = p.ticket_uid
FOR XML PATH (''))
AS Category_list
FROM comment_mining p
Yes, there are some NULL values when the above query is run, but the 'category_list' column in the 'comment_mining' table can take NULL values. Why is the error on the 'Ticket_Uid' column?
Would someone please be able to explain why this is happening and what the cure is?
P.S. - I am new to SQL.
The reason you get the insert error on comment_mining is that you defined the Ticket_Uid column as NOT NULL but gave it no default value. Whether you're inserting into that column specifically or not, when a row is created every column must either receive a value or fall back to NULL, and here neither is possible.
You can do one of 2 things:
Change the structure of the comment_mining table to have a default value for Ticket_Uid (you can do this in the table designer or with code):
Example 1:
Alter Table comment_mining
Add Constraint DF_CommentMining_1 default NewID() for Ticket_UID
Make your insert explicitly include a generated uniqueidentifier (GUID) value by using the SQL NewID() function to populate the Ticket_UID UniqueIdentifier column
Example 2:
insert into dbo.comment_mining(Ticket_Uid, category_list)
SELECT NewID(),
[ your subquery ]...
In both cases, you're now satisfying the NOT NULL constraint on comment_mining.Ticket_UID, either by making it automatically populate itself, or by supplying a value.
try this,
;with cte as
(
select 100 Ticket_Uid,'ABC' Ticket_Category_Uid union all
select 100 , 'DEF' union all
select 100, 'XYZ'
)
select distinct b.Ticket_Uid,
stuff((select ','+a.Ticket_Category_Uid from cte a where a.Ticket_Uid=b.Ticket_Uid for xml path('')),1,1,'')
from cte b
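One way to combine the two answers above, assuming the goal is to fill category_list for the Ticket_UID rows that already exist in comment_mining, is to apply the same STUFF/FOR XML pattern in an UPDATE rather than an INSERT, so the NOT NULL constraint on Ticket_UID never comes into play (column names follow the original query, which used category_id):
UPDATE p
SET category_list = STUFF((SELECT ',' + CONVERT(varchar(500), pr.category_id)
                           FROM ticket_category pr
                           WHERE pr.ticket_uid = p.Ticket_UID
                           FOR XML PATH('')), 1, 1, '')
FROM dbo.comment_mining p;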