I have two tables:
One called @settings with XML values
Another called @nodesToFind with a list of nodes to extract from the XML values in the first table
I want to get a list of the values for each NodePath for each RowId.
This query uses the sql:column function in the XPath of the value method on the Settings column, but it returns the NodePath itself instead of the value:
declare @settings table (RowId int identity, Settings xml)
insert @settings (Settings) values ('<settings><Settings1><Settings1a><Setting1a1>1-1a1</Setting1a1></Settings1a><Setting1b>1-1b</Setting1b><Setting1c>1-1c</Setting1c></Settings1><Settings2><Setting2a>1-2a</Setting2a></Settings2></settings>')
insert @settings (Settings) values ('<settings><Settings1><Settings1a><Setting1a1>2-1a1</Setting1a1></Settings1a><Setting1b>2-1b</Setting1b><Setting1c>2-1c</Setting1c></Settings1><Settings2><Setting2a>2-2a</Setting2a></Settings2></settings>')
insert @settings (Settings) values ('<settings><Settings1><Settings1a><Setting1a1>3-1a1</Setting1a1></Settings1a><Setting1b>3-1b</Setting1b><Setting1c>3-1c</Setting1c></Settings1><Settings2><Setting2a>3-2a</Setting2a></Settings2></settings>')
declare @nodesToFind table (NodePath varchar(max))
insert @nodesToFind (NodePath) values ('/Settings/Settings1/Settings1a/Setting1a1')
insert @nodesToFind (NodePath) values ('/Settings/Settings1/Setting1b')
insert @nodesToFind (NodePath) values ('/Settings/Settings1/Setting1c')
insert @nodesToFind (NodePath) values ('/Settings/Settings2/Setting2a')
select
S.RowId,
NTF.NodePath,
S.Settings.value('(sql:column("NodePath"))[1]', 'varchar(max)')
from @settings S
cross apply @nodesToFind NTF
The result is this:
RowId NodePath Value
----- ----------------------------------------- -----------------------------------------
1 /Settings/Settings1/Settings1a/Setting1a1 /Settings/Settings1/Settings1a/Setting1a1
2 /Settings/Settings1/Settings1a/Setting1a1 /Settings/Settings1/Settings1a/Setting1a1
3 /Settings/Settings1/Settings1a/Setting1a1 /Settings/Settings1/Settings1a/Setting1a1
1 /Settings/Settings1/Setting1b /Settings/Settings1/Setting1b
2 /Settings/Settings1/Setting1b /Settings/Settings1/Setting1b
3 /Settings/Settings1/Setting1b /Settings/Settings1/Setting1b
1 /Settings/Settings1/Setting1c /Settings/Settings1/Setting1c
2 /Settings/Settings1/Setting1c /Settings/Settings1/Setting1c
3 /Settings/Settings1/Setting1c /Settings/Settings1/Setting1c
1 /Settings/Settings2/Setting2a /Settings/Settings2/Setting2a
2 /Settings/Settings2/Setting2a /Settings/Settings2/Setting2a
3 /Settings/Settings2/Setting2a /Settings/Settings2/Setting2a
What is wrong with the S.Settings.value('(sql:column("NodePath"))[1]', 'varchar(max)') line?
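The XQuery you are trying to use would effectively have to be run dynamically. Unfortunately, you cannot use dynamic XQuery in SQL Server. Each XQuery must be static, so the value supplied by sql:column is treated as a plain string and never evaluated as a path.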
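To see why the path comes back unchanged, here is a minimal sketch (the variable names @x and @path are made up for illustration, and sql:variable is used in place of sql:column only to keep the example short). The SQL value is injected into the XQuery as a string, whereas a path typed directly into the static XQuery is evaluated against the XML:

```sql
DECLARE @x xml = '<settings><Settings1><Setting1b>1-1b</Setting1b></Settings1></settings>';
DECLARE @path nvarchar(100) = '/settings/Settings1/Setting1b';

SELECT
    -- the SQL value arrives as a string, so the path text itself is returned
    @x.value('(sql:variable("@path"))[1]', 'varchar(100)') AS FromVariable,
    -- a path written into the static XQuery is evaluated against the XML
    @x.value('(/settings/Settings1/Setting1b/text())[1]', 'varchar(100)') AS FromStaticPath;
-- FromVariable = /settings/Settings1/Setting1b, FromStaticPath = 1-1b
```

Note that element names are case-sensitive, so the path has to start with settings, not Settings.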
What you could do in your specific situation, is to break up each node predicate into separate columns. Then in the XQuery you can descend to the relevant node by checking each column. For example:
declare @settings table (RowId int identity, Settings xml)
insert @settings (Settings) values ('<settings><Settings1><Settings1a><Setting1a1>1-1a1</Setting1a1></Settings1a><Setting1b>1-1b</Setting1b><Setting1c>1-1c</Setting1c></Settings1><Settings2><Setting2a>1-2a</Setting2a></Settings2></settings>')
insert @settings (Settings) values ('<settings><Settings1><Settings1a><Setting1a1>2-1a1</Setting1a1></Settings1a><Setting1b>2-1b</Setting1b><Setting1c>2-1c</Setting1c></Settings1><Settings2><Setting2a>2-2a</Setting2a></Settings2></settings>')
insert @settings (Settings) values ('<settings><Settings1><Settings1a><Setting1a1>3-1a1</Setting1a1></Settings1a><Setting1b>3-1b</Setting1b><Setting1c>3-1c</Setting1c></Settings1><Settings2><Setting2a>3-2a</Setting2a></Settings2></settings>')
declare @nodesToFind table (NodePath1 nvarchar(max), NodePath2 nvarchar(max), NodePath3 nvarchar(max), NodePath4 nvarchar(max))
insert @nodesToFind (NodePath1, NodePath2, NodePath3, NodePath4) values ('settings','Settings1','Settings1a','Setting1a1')
insert @nodesToFind (NodePath1, NodePath2, NodePath3, NodePath4) values ('settings','Settings1','Setting1b',null)
insert @nodesToFind (NodePath1, NodePath2, NodePath3, NodePath4) values ('settings','Settings2','Setting2a',null)
insert @nodesToFind (NodePath1, NodePath2, NodePath3, NodePath4) values ('settings','Settings1','Setting1c',null)
select
S.RowId,
NTF.*,
-- descend one level per NodePath column; a null column means "stop here"
S.Settings.value('((
  for $i1 in *[local-name() = sql:column("NodePath1")]
  return
    if (empty(sql:column("NodePath2")))
    then $i1
    else for $i2 in ($i1/*[local-name() = sql:column("NodePath2")])
      return
        if (empty(sql:column("NodePath3")))
        then $i2
        else for $i3 in $i2/*[local-name() = sql:column("NodePath3")]
          return
            if (empty(sql:column("NodePath4")))
            then $i3
            else $i3/*[local-name() = sql:column("NodePath4")]
)/text())[1]', 'varchar(max)')
from @settings S
cross join @nodesToFind NTF
As you can see, it is made significantly more complex (and probably slower) by the fact that there are multiple possible node levels. If you can restrict it to only one level of node, then you can remove the if/else sections.
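If each setting can be identified by its element name alone, the lookup collapses to a single predicate. A minimal sketch under that assumption: the LeafName column and the cut-down sample XML are hypothetical, and // searches at any depth rather than at one fixed level, so this only behaves well when leaf names are unique within each document.

```sql
DECLARE @settings table (RowId int identity, Settings xml);
INSERT @settings (Settings) VALUES
('<settings><Settings1><Setting1b>1-1b</Setting1b></Settings1><Settings2><Setting2a>1-2a</Setting2a></Settings2></settings>');

-- hypothetical lookup table: one element name per row instead of a full path
DECLARE @nodesToFind table (LeafName nvarchar(100));
INSERT @nodesToFind (LeafName) VALUES ('Setting1b'), ('Setting2a');

SELECT
    S.RowId,
    NTF.LeafName,
    -- the XQuery stays static; only the compared name comes from the column
    S.Settings.value('(//*[local-name() = sql:column("NTF.LeafName")]/text())[1]', 'varchar(max)') AS NodeValue
FROM @settings S
CROSS JOIN @nodesToFind NTF;
```

The trade-off is that the path is no longer checked, so two elements with the same name under different parents cannot be told apart.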
I need to concatenate rows in SQL Server for all rows that have the same Master_ID. The output must also be ordered by the Seq_No column.
As I am using an older version of SQL Server, I cannot use the STRING_AGG() function.
For now I am using STUFF and FOR XML PATH to concatenate the rows, but I cannot order the result by the Seq_No column.
Table script:
DECLARE @T TABLE (Master_ID INT,
Associated_ID INT,
Class_ID INT,
Code VARCHAR(20),SEQ_No INT)
Insert into @T VALUES(1297232,NULL,3619202, '1101' ,1)
Insert into @T VALUES(1297232,NULL,3619202, '0813' ,2)
Insert into @T VALUES(1297232,NULL,3619202, '170219' ,3)
Insert into @T VALUES(1297232,NULL,3619202, '19053299',1)
Insert into @T VALUES(1297232,1297233,3619202,'1101' ,1)
Insert into @T VALUES(1297232,1297233,3619202,'0813' ,2)
Insert into @T VALUES(1297232,1297233,3619202,'170219' ,3)
Insert into @T VALUES(1297232,1297233,3619202,'19053299' ,1)
Insert into @T VALUES(1297232,1297234,3619202,'1101' ,1)
Insert into @T VALUES(1297232,1297234,3619202,'0813' ,2)
Insert into @T VALUES(1297232,1297234,3619202,'170219' ,3)
Insert into @T VALUES(1297232,1297234,3619202,'19053299' ,1)
Insert into @T VALUES(1297232,1297235,3619202,'1101' ,1)
Insert into @T VALUES(1297232,1297235,3619202,'0813' ,2)
Insert into @T VALUES(1297232,1297235,3619202,'170219' ,3)
Insert into @T VALUES(1297232,1297235,3619202,'19053299' ,1)
SELECT * FROM @T
The query I tried with error:
SELECT STUFF((SELECT DISTINCT' ,'+Code
FROM @T
ORDER by ISNULL(Associated_ID,Master_ID),SEQ_No -- Reason for Error
FOR XML PATH (''),TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
Output for the above code:
0813 ,1101 ,170219 ,19053299
Expected output:
1101,19053299,0813,170219
You can swap DISTINCT for GROUP BY, as they do the same thing here, and then you can order by aggregate functions like SUM or MAX.
Example:
SELECT STUFF((SELECT ' ,'+Code
FROM @T
GROUP BY Code
ORDER by SUM(ISNULL(Associated_ID,Master_ID)),SUM(SEQ_No)
FOR XML PATH (''),TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
-- OUTPUT: 1101 ,19053299 ,0813 ,170219
The error
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
has nothing to do with STUFF. It is raised because you are trying to sort a DISTINCT result by columns that are not in its select list.
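A variation on the same GROUP BY idea, as a sketch rather than a drop-in fix: ordering by MIN(SEQ_No) does not depend on how many times each code is repeated, and adding Code as a tie-breaker makes the order deterministic when two codes share a sequence number. The tie-breaker is an assumption on my part (the question does not say how ties should be resolved), and the table below is a cut-down version of the sample data.

```sql
DECLARE @T TABLE (Code VARCHAR(20), SEQ_No INT);
INSERT INTO @T VALUES
    ('1101', 1), ('0813', 2), ('170219', 3), ('19053299', 1),
    ('1101', 1), ('0813', 2), ('170219', 3), ('19053299', 1);

SELECT STUFF((SELECT ',' + Code
              FROM @T
              GROUP BY Code                 -- replaces DISTINCT
              ORDER BY MIN(SEQ_No), Code    -- aggregate first, then tie-breaker
              FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '') AS Codes;
-- with this sample data: 1101,19053299,0813,170219
```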
I have XML that contains multiple similar tags, and I want their values shown in one column, comma-separated, and inserted into a table.
For example:
<test xmlns="http://www.google.com">
<code>a</code>
<code>b</code>
<code>c</code>
</test>
Since the XML is very large, I am using OPENXML to process it and insert the values into a particular table.
I am doing something like this:
insert into table A
(
code
)
select Code from OPENXML(sometag)
with (
code varchar(100) 'tagvalue'
)
For XQuery I am using something like this: 'for $i in x:Code return concat($i/text()[1], ";")' and I want the same with OPENXML.
Output: I want the code tag values in one column, like a,b,c or a/b/c.
Since you're on SQL Server 2017 you could use STRING_AGG (Transact-SQL) to concatenate your code values, e.g.:
create table dbo.Test (
someTag xml
);
insert dbo.Test (someTag) values
('<test><code>a</code><code>b</code><code>c</code></test>'),
('<test><code>d</code><code>e</code><code>f</code></test>');
select [Code], [someTag]
from dbo.Test
outer apply (
select [Code] = string_agg([value], N',')
from (
select n1.c1.value('.', 'nvarchar(100)')
from someTag.nodes(N'/test/code') n1(c1)
) src (value)
) a1;
Which yields...
Code someTag
a,b,c <test><code>a</code><code>b</code><code>c</code></test>
d,e,f <test><code>d</code><code>e</code><code>f</code></test>
Just a small tweak to AlwaysLearning's answer (+1)
Example
Declare @YourTable table (ID int,XMLData xml)
insert Into @YourTable values
(1,'<test><code>a</code><code>b</code><code>c</code></test>')
Select A.ID
,B.*
From @YourTable A
Cross Apply (
Select DelimString = string_agg(xAttr.value('.','varchar(max)'),',')
From A.XMLData.nodes('/test/*') xNode(xAttr)
) B
Returns
ID DelimString
1 a,b,c
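The XML in the question declares a default namespace, which the sample data in the two STRING_AGG answers leaves out. A minimal sketch of the same approach with the namespace kept, assuming SQL Server 2017 or later; note that STRING_AGG without WITHIN GROUP does not promise any particular order of the concatenated values.

```sql
DECLARE @YourTable table (ID int, XMLData xml);
INSERT INTO @YourTable VALUES
(1, '<test xmlns="http://www.google.com"><code>a</code><code>b</code><code>c</code></test>');

-- declare the document's default namespace so /test/code can match
WITH XMLNAMESPACES (DEFAULT 'http://www.google.com')
SELECT A.ID, B.DelimString
FROM @YourTable A
CROSS APPLY (
    SELECT DelimString = STRING_AGG(x.c.value('.', 'varchar(100)'), ',')
    FROM A.XMLData.nodes('/test/code') AS x(c)
) B;
```

Without the WITH XMLNAMESPACES clause the path /test/code matches nothing in namespaced XML, and DelimString comes back NULL.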
And just for completeness, here is method #3 via pure XQuery and FLWOR expression.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xmldata xml);
INSERT @tbl (xmldata) VALUES
('<test xmlns="http://www.google.com"><code>a</code><code>b</code><code>c</code></test>'),
('<test xmlns="http://www.google.com"><code>d</code><code>e</code><code>f</code></test>');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = ',';
-- Method #3
-- SQL Server 2005 onwards
;WITH XMLNAMESPACES (DEFAULT 'http://www.google.com')
SELECT ID
, xmldata.query('for $i in /test/code
return if ($i is (/test/code[last()])[1]) then string($i)
else concat($i, sql:variable("@separator"))')
.value('.', 'NVARCHAR(MAX)') AS [Comma_separated_list]
FROM @tbl;
Output
+----+----------------------+
| ID | Comma_separated_list |
+----+----------------------+
| 1 | a, b, c |
| 2 | d, e, f |
+----+----------------------+
TableA
match / Keyword
0 Stackoverflow
1 Youtube
1 Google
0 Yandex
1 Twitter
0 Facebook
0 Teacher
TableA has 10 million rows in total.
There is a clustered index on the Keyword column.
TableB
match / word
1 You
1 Go
1 Twit
0 Home
0 Car
0 Pencil
0 Money
0 Weather
0 Her
TableB has 500 rows in total.
There is a clustered index on the word column.
My Question
I want to write a SQL query that matches every word from TableB against the keywords in TableA, and sets TableB.match to 1 when a match is found.
Only matches at the start of the keyword count: TableA.keyword LIKE TableB.word + '%' (will be matched)
Matches in the middle of the keyword do NOT count: TableA.keyword LIKE '%' + TableB.word + '%'
For example, Her -> in Teacher (won't be matched)
I tried to use MERGE
First try:
I tried to match keywords with words and update TableB.
I get an error, because there are multiple matches in TableA and MERGE does not allow a row in the target table (TableB) to be updated more than once.
MERGE INTO [TableB] As XB
USING (Select keyword FROM [TableA]) As XA
ON XB.word LIKE ''+XA.keyword+'%'
WHEN MATCHED THEN UPDATE SET XB.match=1;
Second try:
I tried to match words with keywords and update TableA.
I get what I want. The problem is that it takes 1 hour to execute the query for 500 words against 10 million keywords.
MERGE INTO [TableA] As XA
USING (Select word FROM [TableB]) As XB
ON XB.word LIKE ''+XA.keyword+'%'
WHEN MATCHED THEN UPDATE SET XA.match=1;
Is there a way to speed up these lookups in the second try?
An update statement will suffice for what you're trying to do. Note that this will probably not perform very well as SQL isn't great at comparing strings.
declare @a table (match int, keyword varchar(50))
declare @b table (match int, keyword varchar(50))
insert into @a values (0, 'Stackoverflow')
insert into @a values (0, 'Youtube')
insert into @a values (0, 'Google')
insert into @a values (0, 'Yandex')
insert into @a values (0, 'Twitter')
insert into @a values (0, 'Facebook')
insert into @a values (0, 'Teacher')
insert into @b values (0, 'You')
insert into @b values (0, 'Go')
insert into @b values (0, 'Twit')
insert into @b values (0, 'Home')
insert into @b values (0, 'Car')
insert into @b values (0, 'Pencil')
insert into @b values (0, 'Money')
insert into @b values (0, 'Weather')
insert into @b values (0, 'Her')
--commented out because user didn't want this, but it matches the provided data
--update @a
--set match = 1
--where keyword in
--(
-- select
-- distinct a.keyword
-- from @a a
-- cross apply @b b
-- where a.keyword like b.keyword + '%'
--)
update @b
set match = 1
where keyword in
(
select
distinct b.keyword
from @a a
cross apply @b b
where a.keyword like b.keyword + '%'
)
select *
from @a
select *
from @b
--EDIT BY Sean--
Here is how you could do this as a correlated subquery so you can use EXISTS.
update b
set match = 1
from @b b
where exists
(
select b.keyword
from @a a
where a.keyword like b.keyword + '%'
)
I have a table named lines which has BillId (int) and LineReference (varchar(100)) as two columns. Each BillId has a LineReference value. However, the value in LineReference might not be correct, so I have to validate it against a variable that already holds the correct reference values for the bill ID.
Example:
Declare @iCountRef varchar(100) = '1,2,3'
BillId LineReference
100 1,2,
100 1,2,40,34
100 1
100 12
From the above table, I need to update the LineReference column.
BillId LineReference
100 1,2
100 1,2
100 1
100 1
I can only update by comparing with the variable @iCountRef. The LineReference column should contain only values that are in @iCountRef. Any values that are not in @iCountRef should be removed. If there are no matching values, the column should be set to at least the number 1.
1) In the medium or long term I would normalize this database in order to avoid such mistakes: storing lists of values within string/VARCHAR columns. For example, I would use the following many-to-many table:
CREATE TABLE dbo.BillItem (
ID INT IDENTITY(1,1) PRIMARY KEY,
BillID INT NOT NULL REFERENCES dbo.Bill(BillID),
ItemID INT NOT NULL REFERENCES dbo.Item(ItemID),
UNIQUE (BillID, ItemID) -- Unique constraint created in order to prevent duplicated rows
);
In this case, one bill with two items means I have to insert two rows into dbo.BillItem table.
2) Back to original request: for one time task I would use XML and XQuery thus (this solution ends with a SELECT statement but it's trivial to convert into UPDATE):
DECLARE @iCountRef VARCHAR(100) = '1,2,3'
DECLARE @SourceTable TABLE (
BillId INT,
LineReference VARCHAR(8000)
)
INSERT @SourceTable (BillId, LineReference)
VALUES
(100, '1,2,'),
(100, '1,2,40,34'),
(100, '1'),
(100, '12')
DECLARE @iCountRefAsXML XML = CONVERT(XML, '<a><b>' + REPLACE(@iCountRef, ',', '</b><b>') + '</b></a>')
SELECT *, STUFF(z.LineReferenceAsXML.query('
for $i in (x/y)
for $j in (a/b)
where data(($i/text())[1]) eq data(($j/text())[1])
return concat(",", ($i/text())[1])
').value('.', 'VARCHAR(8000)'), 1, 1, '') AS NewLineReference
FROM (
SELECT *, CONVERT(XML,
'<x><y>' + REPLACE(LineReference, ',', '</y><y>') + '</y></x>' +
'<a><b>' + REPLACE(@iCountRef, ',', '</b><b>') + '</b></a>'
) AS LineReferenceAsXML
FROM @SourceTable s
) z
Results:
BillId LineReference NewLineReference LineReferenceAsXML
----------- ------------- ---------------- ------------------------------------------------------------------------
100 1,2, 1 ,2 <x><y>1</y><y>2</y><y /></x><a><b>1</b><b>2</b><b>3</b></a>
100 1,2,40,34 1 ,2 <x><y>1</y><y>2</y><y>40</y><y>34</y></x><a><b>1</b><b>2</b><b>3</b></a>
100 1 1 <x><y>1</y></x><a><b>1</b><b>2</b><b>3</b></a>
100 12 (null) <x><y>12</y></x><a><b>1</b><b>2</b><b>3</b></a>
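The results above still differ from the requested output in two ways: the items are separated by " ," rather than ",", and a row with no surviving value gives NULL instead of 1. A hedged sketch of one way to turn the SELECT into an UPDATE that covers both; the REPLACE and ISNULL wrappers are my additions, not part of the original answer, and removing spaces is only safe because the values are plain numbers.

```sql
DECLARE @iCountRef VARCHAR(100) = '1,2,3';
DECLARE @SourceTable TABLE (BillId INT, LineReference VARCHAR(8000));
INSERT @SourceTable (BillId, LineReference)
VALUES (100, '1,2,'), (100, '1,2,40,34'), (100, '1'), (100, '12');

UPDATE s
SET LineReference = ISNULL(
        -- drop the leading comma, then the spaces XQuery puts between items
        REPLACE(STUFF(z.AsXml.query('
            for $i in (x/y)
            for $j in (a/b)
            where data(($i/text())[1]) eq data(($j/text())[1])
            return concat(",", ($i/text())[1])
        ').value('.', 'VARCHAR(8000)'), 1, 1, ''), ' ', ''),
        '1')  -- nothing matched: fall back to 1
FROM @SourceTable s
CROSS APPLY (
    -- same XML shape as the SELECT version: <x><y>..</y></x><a><b>..</b></a>
    SELECT CONVERT(XML,
        '<x><y>' + REPLACE(s.LineReference, ',', '</y><y>') + '</y></x>' +
        '<a><b>' + REPLACE(@iCountRef, ',', '</b><b>') + '</b></a>') AS AsXml
) z;

SELECT * FROM @SourceTable;
```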
--Create temp table and inserting data:
DECLARE @BillsRefs TABLE (
BillId int,
LineReference nvarchar(100)
)
INSERT INTO @BillsRefs VALUES
(100, '1,2,'),
(100, '1,2,40,34'),
(100, '1'),
(100, '12')
--Declare variables
DECLARE @iCountRef varchar(100) = '1,2,3',
@xml xml, @iXml xml
--Convert @iCountRef in XML
SELECT @iXml = CAST('<b>' + REPLACE(@iCountRef,',','</b><b>') + '</b>' as xml)
--@iXml:
--<b>1</b>
--<b>2</b>
--<b>3</b>
--Convert table with data in XML
SELECT @xml = (
SELECT CAST('<s id="'+LineReference+'"><a>' + REPLACE(LineReference,',','</a><a>') + '</a></s>' as xml)
FROM @BillsRefs
FOR XML PATH('')
)
--@xml:
--<s id="1,2,">
-- <a>1</a>
-- <a>2</a>
-- <a />
--</s>
--<s id="1,2,40,34">
-- <a>1</a>
-- <a>2</a>
-- <a>40</a>
-- <a>34</a>
--</s>
--<s id="1">
-- <a>1</a>
--</s>
--<s id="12">
-- <a>12</a>
--</s>
--Compare values from temp table to @iCountRef
--we convert the strings to xml in order to turn them into tables
;WITH final AS (
SELECT DISTINCT
t.v.value('../@id','nvarchar(100)') as LineReferenceOld, -- @id to take 'id="1,2,40,34"' from xml above
CASE WHEN s.g.value('.','int') IS NULL THEN 1 ELSE s.g.value('.','int') END as LineReference
-- '.' is used to take value inside closed tags
FROM @xml.nodes('/s/a') as t(v) --we take @xml (look above) and play with its nodes 's' (root for each @id) and `a`
LEFT JOIN @iXml.nodes('/b') as s(g) --we take @iXml, it has only 'b' tags
ON t.v.value('.','int') = s.g.value('.','int') --here we JOIN both xml by `a` and `b` tags
)
--In final table we get this:
--LineReferenceOld LineReference
--1,2, 2
--12 1
--1,2,40,34 1
--1,2,40,34 2
--1 1
--1,2, 1
--Final SELECT
SELECT c.BillId,
STUFF((SELECT DISTINCT ','+CAST(f.LineReference as nvarchar(10))
FROM final f
WHERE c.LineReference = f.LineReferenceOld
FOR XML PATH('')),1,1,'') as LineReference
FROM @BillsRefs c
Output:
BillId LineReference
100 1,2
100 1,2
100 1
100 1
If you need to update the source table, keep the ;WITH final AS (...) definition from above and put this statement in place of the final SELECT:
UPDATE c
SET LineReference = STUFF((SELECT DISTINCT ','+CAST(f.LineReference as nvarchar(10))
FROM final f
WHERE c.LineReference = f.LineReferenceOld
FOR XML PATH('')),1,1,'')
FROM @BillsRefs c
Here is an example statement to explain what I mean:
DECLARE @sourceTable table(ID int, tmstmp datetime, data varchar(max))
DECLARE @targetTable table(ID int, tmstmp datetime, data varchar(max))
INSERT INTO
@sourceTable
VALUES
(1, '2015-07-23T01:01:00', 'Testdata6')
,(1, '2015-07-23T02:02:00', 'Testdata7')
,(2, '2015-07-23T03:03:00', 'Testdata8')
,(2, '2015-07-23T04:04:00', 'Testdata9')
INSERT INTO
@targetTable
VALUES
(2, '2015-07-23T00:01:00', 'Testdata1')
,(2, '2015-07-23T00:02:00', 'Testdata2')
,(2, '2015-07-23T00:03:00', 'Testdata3')
,(3, '2015-07-23T00:04:00', 'Testdata4')
,(3, '2015-07-23T00:05:00', 'Testdata5')
MERGE INTO
@targetTable T
USING
@sourceTable S
ON
S.ID = T.ID
WHEN MATCHED THEN
DELETE
-- also want to INSERT newer ID 2 source records here after delete
WHEN NOT MATCHED THEN
INSERT (ID, tmstmp, data)
VALUES (S.ID, S.tmstmp, S.data)
;
When I make a select...
SELECT
*
FROM
@targetTable
...I get the following table:
ID tmstmp data
3 2015-07-23 00:04:00.000 Testdata4
3 2015-07-23 00:05:00.000 Testdata5
1 2015-07-23 01:01:00.000 Testdata6
1 2015-07-23 02:02:00.000 Testdata7
But I want to get the following table instead:
ID tmstmp data
3 2015-07-23 00:04:00.000 Testdata4
3 2015-07-23 00:05:00.000 Testdata5
1 2015-07-23 01:01:00.000 Testdata6
1 2015-07-23 02:02:00.000 Testdata7
2 2015-07-23 03:03:00.000 Testdata8
2 2015-07-23 04:04:00.000 Testdata9
How can I do this in one statement? It has to be a single statement because I use an extensive CTE for the source table.
Thanks in advance...
We can add some extra rows to the "source" table to take care of clearing out the existing rows, then let all of the current rows fall into the NOT MATCHED clause, which is the only one allowed to perform INSERT operations:
;With Clears as (
SELECT *,0 as Rem from @sourceTable
union all
select distinct ID,'1900-01-01','',1 from @sourceTable
)
MERGE INTO
@targetTable T
USING
Clears S
ON
S.ID = T.ID and s.Rem = 1
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED and Rem = 0 THEN
INSERT (ID, tmstmp, data)
VALUES (S.ID, S.tmstmp, S.data)
;
The basic rule with trying to achieve multiple operations within a MERGE statement is you need at least one source row for each action you want to take. It's then a challenge to formulate the ON clause and the various additional conditions after the WHEN clauses such that each operation applies when you want it to.
E.g. without the extra and Rem = 0 added to WHEN NOT MATCHED above, the extra row we added into Clears to remove any rows with ID of 1 would instead end up creating an extra row, since there aren't any ID 1 rows in the target table.
Wouldn't a simple DELETE-INSERT work here?
DELETE t FROM @targetTable t
WHERE EXISTS(
SELECT 1
FROM @sourceTable
WHERE ID = t.ID
)
INSERT INTO @targetTable(ID, tmstmp, data)
SELECT ID, tmstmp, data
FROM @sourceTable s
WHERE NOT EXISTS(
SELECT 1
FROM @targetTable
WHERE ID = s.ID
)
You may want to keep the two statements under one transaction.
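A minimal sketch of what wrapping the two statements in one transaction could look like. The temp table names #source and #target and the cut-down sample rows are illustrative only; temp tables are used instead of table variables because changes to table variables are not undone by a rollback.

```sql
-- temp tables stand in for the real tables (names are illustrative)
CREATE TABLE #source (ID int, tmstmp datetime, data varchar(max));
CREATE TABLE #target (ID int, tmstmp datetime, data varchar(max));

INSERT INTO #source VALUES
    (1, '2015-07-23T01:01:00', 'Testdata6'),
    (2, '2015-07-23T03:03:00', 'Testdata8');
INSERT INTO #target VALUES
    (2, '2015-07-23T00:01:00', 'Testdata1'),
    (3, '2015-07-23T00:04:00', 'Testdata4');

SET XACT_ABORT ON;  -- a run-time error rolls back the whole transaction
BEGIN TRANSACTION;

-- remove every target row whose ID also exists in the source
DELETE t
FROM #target AS t
WHERE EXISTS (SELECT 1 FROM #source AS s WHERE s.ID = t.ID);

-- then copy all source rows across
INSERT INTO #target (ID, tmstmp, data)
SELECT ID, tmstmp, data
FROM #source;

COMMIT TRANSACTION;

SELECT * FROM #target ORDER BY ID;
```

Because the DELETE has already removed every overlapping ID, the NOT EXISTS filter on the INSERT is left out of this sketch.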
EDIT: I just realized you wanted a single statement. But I'll leave it here as an alternate solution.