Xquery Get Position() - sql-server

This is part of our XML file.
<point distanceTotal="162" seqNo="189">
<lineSection id="395" track="1" direction="1">
<outInfos>
<comment commentTypeId="4" priority="1" oneLiner="BOT">
<layerVPK seasonValue="S0"/>
<vectors>
<vector dateFrom="2016-12-11"/>
</vectors>
<frenchText>1x3 MH</frenchText>
</comment>
<comment commentTypeId="4" priority="1" oneLiner="bot">
<layerVPK seasonValue="S0"/>
<frenchText>Réception voie occupée</frenchText>
<dutchText>Test</dutchText>
</comment>
</outInfos>
</point>
We load this into a SQL Server XML column and extract the values with XQuery.
However, I can't find a way to return position(), and T-SQL ROW_NUMBER or DENSE_RANK can't be used because not every element exists in every comment.
For example, dutchText exists only on the second comment, and there is no attribute that distinguishes the two comments.
This is the SQL code:
SELECT fi.file_uid,
fi.file_date,
T1.ref.value('@id', 'varchar(100)') AS gTV_id,
T2.ref.value('@id', 'varchar(100)') AS gTrn_id,
T4.ref.value('@seqNo', 'varchar(100)') AS gTrnTPp_seqNo,
T7.ref.value('text()[1]', 'varchar(1000)') AS gTrnTPpOiCDT_Text,
T6.ref.query('/globalTrainVariant/trains/globalTrainVariant/train/timetablePoints/point/outInfos/comment[position()]') AS Test
FROM ods.filesin fi
CROSS APPLY fi.file_xml.nodes('declare namespace cern="http://...";
(/cern:trains/globalTrainVariant)') T1(ref)
CROSS APPLY T1.ref.nodes('declare namespace cern="http://...";
(train)') T2(ref)
CROSS APPLY T2.ref.nodes('declare namespace cern="http://...";
(timetablePoints)') T3(ref)
CROSS APPLY T3.ref.nodes('declare namespace cern="http://...";
(point)') T4(ref)
CROSS APPLY T4.ref.nodes('declare namespace cern="http://...";
(outInfos)') T5(ref)
CROSS APPLY T5.ref.nodes('declare namespace cern="http://...";
(comment)') T6(ref)
CROSS APPLY T6.ref.nodes('declare namespace cern="http://...";
(dutchText)') T7(ref)
WHERE fi.file_type = 'trains'
The code runs without errors, but the Test column is always blank.
Any suggestions?

According to the documentation, SQL Server currently does not let you return the result of the position() function directly:
In SQL Server, fn:position() can only be used in the context of a
context-dependent predicate. Specifically, it can only be used inside
brackets ([ ]).
However, there is a neat trick you can employ to get it. Namely, you can compare the position of the element with a known sequence and then return the matched value from that sequence. An example below illustrates that.
declare @x xml = N'<point distanceTotal="162" seqNo="189">
<outInfos>
<comment commentTypeId="4" priority="1" oneLiner="BOT">
<layerVPK seasonValue="S0" />
<vectors>
<vector dateFrom="2016-12-11" />
</vectors>
<frenchText>1x3 MH</frenchText>
</comment>
<comment commentTypeId="4" priority="1" oneLiner="bot">
<layerVPK seasonValue="S0" />
<frenchText>Réception voie occupée</frenchText>
<dutchText>Test</dutchText>
</comment>
</outInfos>
</point>';
with cte as (
select top (1000) row_number() over(order by ac.object_id) as [RN]
from sys.all_columns ac
)
select t.c.query('.') as [OutInfos], sq.RN as [TextPosition], x.c.query('.') as [DutchComment]
from @x.nodes('/point/outInfos') t(c)
cross join cte sq
cross apply t.c.nodes('./comment[position() = sql:column("sq.RN")]/dutchText') x(c);
Here the CTE produces an ordered set of integers (I usually keep a dedicated numbers table around, but you can always construct one on the fly), and the match condition is specified in the XQuery expression that defines the x(c) output.

I agree with Roger that the position() function cannot be called directly and must appear inside []. However, there is a solution that doesn't require any additional tables and supports any number of rows by using recursion:
declare @Xml xml = N'<?xml version="1.0" encoding="utf-16"?>
<root>
<n>1</n>
<n>10</n>
<n>5</n>
<n>3</n>
<n>11</n>
</root>';
with cte as
(
select t.c.value(N'n[1]', N'int') n, 1 RowNum
from @Xml.nodes(N'root[1]') t(c)
where t.c.exist(N'n[1]') = 1
union all
select t.c.value(N'n[position() = sql:column("cte.RowNum") + 1][1]', N'int') n, cte.RowNum + 1
from @Xml.nodes(N'root[1]') t(c)
cross join cte
where t.c.exist(N'n[position() = sql:column("cte.RowNum") + 1]') = 1
)
select *
from cte;
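One caveat worth keeping in mind with the recursive approach (a general SQL Server behaviour, not something stated in the answer above): a recursive CTE stops with an error after 100 recursion levels by default, so more than roughly 100 sibling nodes would need a MAXRECURSION hint. A minimal sketch, reusing the query above on a shortened sample document:

```sql
declare @Xml xml = N'<root><n>1</n><n>10</n><n>5</n></root>';

with cte as
(
    -- anchor member: the first <n> element, at position 1
    select t.c.value(N'n[1]', N'int') as n, 1 as RowNum
    from @Xml.nodes(N'root[1]') t(c)
    where t.c.exist(N'n[1]') = 1
    union all
    -- recursive member: fetch the element at the next position, if it exists
    select t.c.value(N'n[position() = sql:column("cte.RowNum") + 1][1]', N'int'), cte.RowNum + 1
    from @Xml.nodes(N'root[1]') t(c)
    cross join cte
    where t.c.exist(N'n[position() = sql:column("cte.RowNum") + 1]') = 1
)
select n, RowNum
from cte
option (maxrecursion 0); -- 0 removes the default limit of 100 levels
```

With the limit removed, the recursion ends only when exist() finds no element at the next position, so the hint is safe here as long as the document is finite.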

It might be simpler and better performing to use Node Order Comparison Operators in order to count the preceding //comment nodes in the XML tree.
I didn't test on huge XML documents, but it's definitely less I/O intensive, and was less CPU intensive on my contrived tests.
declare @x xml = N'<point distanceTotal="162" seqNo="189">
<outInfos>
<comment commentTypeId="4" priority="1" oneLiner="BOT">
<layerVPK seasonValue="S0" />
<vectors>
<vector dateFrom="2016-12-11" />
</vectors>
<frenchText>1x3 MH</frenchText>
</comment>
<comment commentTypeId="4" priority="1" oneLiner="bot">
<layerVPK seasonValue="S0" />
<frenchText>Réception voie occupée</frenchText>
<dutchText>Test</dutchText>
</comment>
</outInfos>
</point>';
select
[OutInfos] = t.c.query('../..'),
[TextPosition] = t.c.value('let $dutchText := . return count(../../comment[. << $dutchText])', 'int'),
[DutchComment] = t.c.query('.')
from @x.nodes('/point/outInfos/comment/dutchText') t(c)
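
The same node-order idea can number every comment, not only those that contain a dutchText, which seems closer to what the question asks for. The following is a sketch under that assumption, on a trimmed copy of the sample document; the column aliases are made up for illustration.

```sql
declare @x xml = N'<point distanceTotal="162" seqNo="189">
<outInfos>
<comment commentTypeId="4" priority="1" oneLiner="BOT">
<frenchText>1x3 MH</frenchText>
</comment>
<comment commentTypeId="4" priority="1" oneLiner="bot">
<frenchText>Réception voie occupée</frenchText>
<dutchText>Test</dutchText>
</comment>
</outInfos>
</point>';

select
    -- number of sibling comments that precede this one in document order, plus one
    [CommentPosition] = c.ref.value('let $c := . return count(../comment[. << $c])', 'int') + 1,
    [FrenchText] = c.ref.value('(frenchText/text())[1]', 'nvarchar(1000)'),
    -- NULL when the comment has no dutchText element
    [DutchText] = c.ref.value('(dutchText/text())[1]', 'nvarchar(1000)')
from @x.nodes('/point/outInfos/comment') c(ref);
```

Because the rows come from the comment nodes themselves, a missing dutchText yields NULL instead of dropping the row, which a CROSS APPLY on dutchText would do.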

Related

Find out deadlock reason

We have an NUnit project with tests and many cases for each test. All of them run in parallel.
I found two causes of the deadlock. One is a table, but the second has no name, just some ID.
How can I find this object?
For reference, the app uses a hybrid approach: an EF context plus Dapper for complex SQL.
Extract the XML of the deadlock graph and use the script below to pull out the SQL text and understand what is happening:
DECLARE @XML XML = N'??? my deadlock XML !!!';
WITH
TX AS
(
SELECT @XML AS TextData
),
TVM AS
(
SELECT v.value('(./inputbuf)[1]','nvarchar(max)') AS Query,
i.value('(./deadlock/@victim)[1]','varchar(32)') AS ProcessVictim,
v.value('(./@id)[1]','varchar(32)') AS ProcessID
-- TVV orders by StartTime, which the script never selects; reading it from the process's lasttranstarted attribute is an assumption
, v.value('(./@lasttranstarted)[1]','datetime2') AS StartTime
FROM TX
CROSS APPLY TextData.nodes('/deadlock-list') AS X(i)
CROSS APPLY TextData.nodes('/deadlock-list/deadlock/process-list/process') AS V(v)
),
TVV AS
(
SELECT DENSE_RANK() OVER (ORDER BY StartTime) AS ID,
Query,
CASE WHEN ProcessVictim = ProcessID THEN 'Victim!' ELSE 'Alive' END AS FinalState
FROM TVM
),
TQV AS
(
SELECT ID, Query
FROM TVV
WHERE FinalState = 'Victim!'
)
SELECT TVV.ID, TVV.Query,
CASE WHEN TQV.ID IS NOT NULL THEN 'Victim!' ELSE 'Alive' END AS FinalState
FROM TVV
LEFT OUTER JOIN TQV
ON TVV.ID = TQV.ID AND TVV.Query = TQV.Query
ORDER BY 1;
The result will be like this :

How do I query repeating XML child nodes inside parent repeating nodes?

I have the following XML stored in an nText column (not my design, older database). I need to pull the PolicyNumber and the CvgCode that is a property of Coverage child node.
<efs:Request
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:efs="http://www.slsot.org/efs"
xsi:schemaLocation="http://www.slsot.org/efs
http://efs.slsot.org/efs/xsd/SlsotEfsSchema2.xsd">
<EfsVersion>2.0</EfsVersion>
<Batch BatchType="N" AgLicNo="12345" ItemCnt="69">
<EFSPolicy>
<PolicyNumber>POL12345</PolicyNumber>
<Binder>0086592YZ</Binder>
<TransType>N</TransType>
<Insured>Dummy Co LLC</Insured>
<ZipCode>75225</ZipCode>
<ClassCd>99930</ClassCd>
<PolicyFee>35.00</PolicyFee>
<TotalTax>36.62</TotalTax>
<TotalStampFee>1.13</TotalStampFee>
<TotalGross>792.75</TotalGross>
<EffectiveDate>09/17/2018</EffectiveDate>
<ExpirationDate>09/17/2019</ExpirationDate>
<IssueDate>09/20/2018</IssueDate>
<ContUntilCancl>N</ContUntilCancl>
<FedCrUnion>N</FedCrUnion>
<AORFlag>N</AORFlag>
<CustomID>043684</CustomID>
<WindStormExclusion>N</WindStormExclusion>
<CorrectionReEntry>N</CorrectionReEntry>
<Coverages>
<Coverage CvgCode="9325">720.00</Coverage>
</Coverages>
<Securities>
<Company CoNumber="80101168">100.00</Company>
</Securities>
</EFSPolicy>
<EFSPolicy>
...
</EFSPolicy>
</Batch>
</efs:Request>
And here is the SQL code I am using to extract the PolicyNumber (so far).
with cte_table(BatchID, xmlData)
AS
(SELECT BatchID, CAST(CAST(xmlData AS VARCHAR(MAX)) AS XML) from
Batches)
select
s.BatchID
,t.c.value('PolicyNumber[1]', 'varchar(max)') as PolicyNumber
from cte_table as s
cross apply s.xmlData.nodes('/*:Request/Batch/EFSPolicy') as t(c)
where BatchID in (select batchID from Batches where CreateDate between '1/1/19' and getdate())
I tried a second CROSS APPLY on the Coverages node, but that returned all the Coverage values (not the CvgCode attribute) for every batch, so my result set had 100+ times too many rows. I assume that was caused by the second CROSS APPLY. Is there an INNER JOIN type of CROSS APPLY?
You need to declare your namespaces to retrieve the data:
DECLARE @XML xml = '<efs:Request
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:efs="http://www.slsot.org/efs"
xsi:schemaLocation="http://www.slsot.org/efs
http://efs.slsot.org/efs/xsd/SlsotEfsSchema2.xsd">
<EfsVersion>2.0</EfsVersion>
<Batch BatchType="N" AgLicNo="12345" ItemCnt="69">
<EFSPolicy>
<PolicyNumber>POL12345</PolicyNumber>
<Binder>0086592YZ</Binder>
<TransType>N</TransType>
<Insured>Dummy Co LLC</Insured>
<ZipCode>75225</ZipCode>
<ClassCd>99930</ClassCd>
<PolicyFee>35.00</PolicyFee>
<TotalTax>36.62</TotalTax>
<TotalStampFee>1.13</TotalStampFee>
<TotalGross>792.75</TotalGross>
<EffectiveDate>09/17/2018</EffectiveDate>
<ExpirationDate>09/17/2019</ExpirationDate>
<IssueDate>09/20/2018</IssueDate>
<ContUntilCancl>N</ContUntilCancl>
<FedCrUnion>N</FedCrUnion>
<AORFlag>N</AORFlag>
<CustomID>043684</CustomID>
<WindStormExclusion>N</WindStormExclusion>
<CorrectionReEntry>N</CorrectionReEntry>
<Coverages>
<Coverage CvgCode="9325">720.00</Coverage>
</Coverages>
<Securities>
<Company CoNumber="80101168">100.00</Company>
</Securities>
</EFSPolicy>
<EFSPolicy>
</EFSPolicy>
</Batch>
</efs:Request>';
WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema-instance' as xsi,
'http://www.slsot.org/efs' AS efs)
SELECT EFS.[Policy].value('(./PolicyNumber/text())[1]','varchar(25)') AS PolicyNumber,
EFS.[Policy].value('(./Coverages/Coverage/@CvgCode)[1]','int') AS CvgCode --Assumes only 1 CvgCode per policy
FROM (VALUES(@XML)) V(X)
CROSS APPLY V.X.nodes('efs:Request/Batch/EFSPolicy') EFS([Policy]);
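
If a policy can carry more than one Coverage element, a second APPLY run from the policy node (rather than from the document root) returns one row per coverage without multiplying rows across policies. This is a minimal sketch on a trimmed document; the second Coverage element and the CvgAmount alias are invented for illustration.

```sql
DECLARE @XML xml = N'<efs:Request xmlns:efs="http://www.slsot.org/efs">
  <Batch BatchType="N">
    <EFSPolicy>
      <PolicyNumber>POL12345</PolicyNumber>
      <Coverages>
        <Coverage CvgCode="9325">720.00</Coverage>
        <Coverage CvgCode="1000">50.00</Coverage> <!-- hypothetical second coverage -->
      </Coverages>
    </EFSPolicy>
  </Batch>
</efs:Request>';

WITH XMLNAMESPACES('http://www.slsot.org/efs' AS efs)
SELECT P.Pol.value('(PolicyNumber/text())[1]', 'varchar(25)') AS PolicyNumber,
       C.Cvg.value('@CvgCode', 'int')                        AS CvgCode,
       C.Cvg.value('text()[1]', 'decimal(18,2)')             AS CvgAmount
FROM @XML.nodes('/efs:Request/Batch/EFSPolicy') AS P(Pol)
-- relative path: only the coverages of the current policy are shredded
OUTER APPLY P.Pol.nodes('Coverages/Coverage') AS C(Cvg);
```

OUTER APPLY keeps policies that have no Coverage at all (with NULLs); switching to CROSS APPLY would drop them, which is the closest equivalent of an inner join here.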

How to query XML with namespace in T-SQL?

I'm having difficulty querying this XML with a namespace. I can query the xml without the namespace fine.
Below is my attempt. It results in 0 records.
;WITH XMLNAMESPACES ('http://www.google.com/kml/ext/2.2' as gx)
,CTE AS
( SELECT CONVERT(XML,'<?xml version=''1.0'' encoding=''UTF-8''?>
<kml xmlns=''http://www.opengis.net/kml/2.2'' xmlns:gx=''http://www.google.com/kml/ext/2.2''>
<Document>
<Placemark>
<open>1</open>
<gx:Track>
<altitudeMode>clampToGround</altitudeMode>
<when>2017-10-26T11:42:05Z</when>
<gx:coord>Lat Long Altitude</gx:coord>
<when>2017-10-26T11:41:40Z</when>
<gx:coord>Lat Long Altitude</gx:coord>
</gx:Track>
</Placemark>
</Document>
</kml>'
) AS BulkColumnXML
)
SELECT altitudeModetext.node.value('.','NVARCHAR(255)') AS altitudeMode,
gdcoordtext.node.value('.','NVARCHAR(255)') AS gdcoord,
whentext.node.value('.','NVARCHAR(255)') AS [when]
FROM CTE
CROSS APPLY BulkColumnXML.nodes('/kml/Document/Placemark/gx:Track') as kmlDocumentPlacemarkopengxtrack(node)
CROSS APPLY kmlDocumentPlacemarkopengxtrack.node.nodes('altitudeMode/text()') as altitudeModetext(node)
CROSS APPLY kmlDocumentPlacemarkopengxtrack.node.nodes('gx:coord/text()') as gdcoordtext(node)
CROSS APPLY kmlDocumentPlacemarkopengxtrack.node.nodes('when/text()') as whentext(node)
Corrected code, with the default namespace added to the WITH XMLNAMESPACES clause:
;WITH XMLNAMESPACES ('http://www.google.com/kml/ext/2.2' as gx,
DEFAULT 'http://www.opengis.net/kml/2.2')
,CTE AS
( SELECT CONVERT(XML,'<?xml version=''1.0'' encoding=''UTF-8''?>
<kml xmlns=''http://www.opengis.net/kml/2.2'' xmlns:gx=''http://www.google.com/kml/ext/2.2''>
<Document>
<Placemark>
<open>1</open>
<gx:Track>
<altitudeMode>clampToGround</altitudeMode>
<when>2017-10-26T11:42:05Z</when>
<gx:coord>Lat Long Altitude</gx:coord>
<when>2017-10-26T11:41:40Z</when>
<gx:coord>Lat Long Altitude</gx:coord>
</gx:Track>
</Placemark>
</Document>
</kml>'
) AS BulkColumnXML
)
SELECT altitudeModetext.node.value('.','NVARCHAR(255)') AS altitudeMode,
gdcoordtext.node.value('.','NVARCHAR(255)') AS gdcoord,
whentext.node.value('.','NVARCHAR(255)') AS [when]
FROM CTE
CROSS APPLY BulkColumnXML.nodes('/kml/Document/Placemark/gx:Track') as kmlDocumentPlacemarkopengxtrack(node)
CROSS APPLY kmlDocumentPlacemarkopengxtrack.node.nodes('altitudeMode/text()') as altitudeModetext(node)
CROSS APPLY kmlDocumentPlacemarkopengxtrack.node.nodes('gx:coord/text()') as gdcoordtext(node)
CROSS APPLY kmlDocumentPlacemarkopengxtrack.node.nodes('when/text()') as whentext(node)
My magic crystal ball tells me, that you might be looking for something like this:
;WITH XMLNAMESPACES ('http://www.google.com/kml/ext/2.2' as gx,
DEFAULT 'http://www.opengis.net/kml/2.2')
,CTE AS
( SELECT CONVERT(XML,'<?xml version=''1.0'' encoding=''UTF-8''?>
<kml xmlns=''http://www.opengis.net/kml/2.2'' xmlns:gx=''http://www.google.com/kml/ext/2.2''>
<Document>
<Placemark>
<open>1</open>
<gx:Track>
<altitudeMode>clampToGround</altitudeMode>
<when>2017-10-26T11:42:05Z</when>
<gx:coord>Lat Long Altitude</gx:coord>
<when>2017-10-26T11:41:40Z</when>
<gx:coord>Lat Long Altitude</gx:coord>
</gx:Track>
</Placemark>
</Document>
</kml>'
) AS BulkColumnXML
)
,intermediateCTE AS
(
SELECT CTE.BulkColumnXML.value('(/kml/Document/Placemark/open/text())[1]','NVARCHAR(255)') AS placemark_open,
CTE.BulkColumnXML.value('(/kml/Document/Placemark/gx:Track/altitudeMode/text())[1]','nvarchar(255)') AS AltitudeMode,
CTE.BulkColumnXML.query('/kml/Document/Placemark/gx:Track/*[local-name()!="altitudeMode"]') AS SubTree
FROM CTE
)
,AllWhens AS
(
SELECT intermediateCTE.placemark_open
,intermediateCTE.AltitudeMode
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS WhenIndex
,whn.value('text()[1]','datetime') AS WhenValue
FROM intermediateCTE
CROSS APPLY SubTree.nodes('/*:when') AS A(whn)
)
,AllCoords AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS CoordIndex
,crd.value('text()[1]','varchar(255)') AS CoordValue
FROM intermediateCTE
CROSS APPLY SubTree.nodes('/*:coord') AS A(crd)
)
SELECT AllWhens.*
,AllCoords.CoordValue
FROM AllWhens
INNER JOIN AllCoords ON WhenIndex=CoordIndex
The result
placemark_open AltitudeMode  WhenIndex WhenValue               CoordValue
--------------------------------------------------------------------------------
1              clampToGround 2         2017-10-26 11:41:40.000 Lat Long Altitude
1              clampToGround 1         2017-10-26 11:42:05.000 Lat Long Altitude

Right Index scan when used XML

Below is the T-SQL generated by the application:
DECLARE @xml_0 XML
SET @xml_0 = N'<Val>8e4cd3e3-de98-4f55-9c55-57881157a0f0</Val>
<Val>2f483275-7333-4786-aca8-454e5bf4823f</Val>
<Val>ce1ce763-1f68-48ec-bedf-f4641e40d8f8</Val>
<Val>6b471d5e-fd5c-4db8-aa31-abb910651e18</Val>
<Val>89064e42-0592-4845-b21e-38f788ab0d2e</Val>
<Val>d54793f0-cbfb-428e-ba08-db70cab1af07</Val>
<Val>8027e6bd-09e5-4a5b-aae7-54aff4a0e6c0</Val>
<Val>53f1a5e3-b2a8-49c3-935b-a5ac7fe0c1d8</Val>
<Val>faceabad-1d0c-4f3f-8d94-674bbf1c3428</Val>
<Val>f8e0a43d-cff7-45aa-b73f-6858b1d17cd1</Val>
<Val>94e9bc76-5bb3-4cf9-9b59-fc3163c904d7</Val>
<Val>e4be8c69-5166-40cc-b49a-18adec78e356</Val>
<Val>5c564b82-64e1-46c5-a41d-bc30104f14a5</Val>
<Val>dc246c2c-7edd-407a-b378-747789bd5a75</Val>
<Val>411ac1e9-3d4f-447c-808a-b82d388816dd</Val>'
SELECT COUNT(*) FROM
(
SELECT [t0].[ID]
FROM [dbo].[HM_Rows] AS [t0], [dbo].[HM_Cells] AS [t1]
WHERE
(
[t1].[Value] IN
(
SELECT node.value('.', 'NVARCHAR(200)') FROM @xml_0.nodes('/Val') xml_0(node)
)
)
) AS [r2487772634]
The execution plan for the T-SQL above shows that it scans the index.
It scans the correct indexes:
missing_index_FOR_Value_INC_RowID - on the HM_Cells table
and
PK_HM_Rows - on the HM_Rows table
Any ideas?
P.S. The tables are large.
Row counts:
HM_Rows - 17'736'181
HM_Cells - 1'048'693'775
And yes, I have rebuilt the indexes and updated the statistics.
HM_Cells.Value is NVARCHAR(200).
Without the XML and the HM_Rows table it works fine,
e.g.
SELECT ID FROM HM_Cells WHERE Value IN (.........)
works excellently.
Thanks a lot :)
Try using a JOIN instead of an IN, because this way you can force a loop strategy, which will probably use a seek instead of a scan:
SELECT COUNT(*) FROM
(
SELECT [t0].[ID]
FROM (
SELECT DISTINCT node.value('.', 'NVARCHAR(200)') AS Val
FROM @xml_0.nodes('/Val') xml_0(node)
) q1 INNER LOOP JOIN [dbo].[HM_Cells] AS [t1] ON q1.Val=t1.Value
CROSS JOIN [dbo].[HM_Rows] AS [t0]
) AS [r2487772634]

SQL Server and XML query

I cannot find the right approach or any hint to the solution anywhere, so I am trying here:
I have a select like this:
declare @XmlOutput xml
set @XmlOutput = (
SELECT
R.ID
,R.[PN] as 'Nummer'
,R.[TitlePrefixAbbreviation] as Title
,R.[FirstName]
,R.LastName
,RA.[DescriptiveNumber]
,RA.[OrientationalNumber]
,RC.Contact
FROM [tbl1] as R
left join tbl2 as RA on R.ID = RA.[RID]
left join [tbl3] as RC on R.ID = RC.[RID]
for xml auto, ROOT('mbox'), ELEMENTS
)
select @XmlOutput
and the result looks like this:
<mbox>
<R>
<ID>66284</ID>
<Nummer>999999</Nummer>
<Title />
<FirstName>test</FirstName>
<LastName>test</LastName>
<RA>
<HouseNr>9999</HouseNr>
<SequenceNr />
<City>London</City>
<ZIP>99999</ZIP>
<RC>
<Contact>letitroll#gmail.com</Contact>
</RC>
</RA>
</R>
</mbox>
As you can see, there are hierarchy elements R, RA and RC that come from the joined tables.
I cannot figure out how to produce the XML with only the root element <mbox> and without nested elements such as R, RA and RC.
Something like this:
<mbox>
<ID>66284</ID>
<Nummer>999999</Nummer>
<Title />
<FirstName>Štěpánka</FirstName>
<LastName>Solomková</LastName>
<HouseNr>2015</HouseNr>
<SequenceNr />
<City>London</City>
<ZIP>99999</ZIP>
<Contact>letitroll#gmail.com</Contact>
</mbox>
I am considering a workaround with a temporary table: first insert all the selected data, then generate the XML from the temp table. But I hope there is a more elegant way.
Can somebody help?
Best,
gelo
declare @XmlOutput xml
set @XmlOutput = (
Select * from
(
SELECT
R.ID
,R.[PN] as 'Nummer'
,R.[TitlePrefixAbbreviation] as Title
,R.[FirstName]
,R.LastName
,RA.[DescriptiveNumber]
,RA.[OrientationalNumber]
,RC.Contact
FROM [tbl1] as R
left join tbl2 as RA on R.ID = RA.[RID]
left join [tbl3] as RC on R.ID = RC.[RID]
) mbox
for xml auto, ELEMENTS
)
select @XmlOutput
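
Another option, offered as a sketch rather than as part of the answer above, is FOR XML PATH, which emits each selected column as a direct child of the row element regardless of which table it came from. The table variables, their columns and the sample values below are hypothetical stand-ins for tbl1, tbl2 and tbl3.

```sql
-- Hypothetical stand-ins for tbl1, tbl2 and tbl3
DECLARE @R  TABLE (ID int, FirstName nvarchar(50), LastName nvarchar(50));
DECLARE @RA TABLE (RID int, City nvarchar(50), ZIP nvarchar(10));
DECLARE @RC TABLE (RID int, Contact nvarchar(100));

INSERT INTO @R  VALUES (66284, N'test', N'test');
INSERT INTO @RA VALUES (66284, N'London', N'99999');
INSERT INTO @RC VALUES (66284, N'someone@example.com');

DECLARE @XmlOutput xml = (
    SELECT R.ID, R.FirstName, R.LastName, RA.City, RA.ZIP, RC.Contact
    FROM @R AS R
    LEFT JOIN @RA AS RA ON R.ID = RA.RID
    LEFT JOIN @RC AS RC ON R.ID = RC.RID
    -- PATH('mbox') names the per-row element; columns become flat child elements
    FOR XML PATH('mbox'), TYPE
);

SELECT @XmlOutput;
```

Columns that are NULL are omitted from the output by default; adding ELEMENTS XSINIL to the FOR XML clause would emit them as empty elements marked xsi:nil instead.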
