I'm trying to extract the data from SQL Server execution plans in a generic way.
As an example the execution plan for
SELECT *
FROM sys.all_objects o1
as shown in SSMS is below
The UI shows nodes along with costs for each node and percentages. How can I extract this from the underlying XML into a table structure?
I've tried to query the XML myself, but the XML structure seems to change from query to query.
This should get you started (DB Fiddle example).
DECLARE @X XML = N'<?xml version="1.0" encoding="utf-16"?><ShowPlanXML ...';
DECLARE @Nodes TABLE
(
PlanId INT,
NodeId INT,
PhysicalOp VARCHAR(200),
EstimatedTotalSubtreeCost FLOAT,
EstimatedOperatorCost FLOAT,
ParentNodeId INT NULL,
PRIMARY KEY(PlanId, NodeId)
);
WITH XMLNAMESPACES (default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
plans AS
(
SELECT ROW_NUMBER() over (order by qp) as PlanId, qp.query('.') as plan_xml
FROM @X.nodes('//QueryPlan') n(qp)
)
INSERT @Nodes(PlanId, NodeId, PhysicalOp, EstimatedTotalSubtreeCost, ParentNodeId)
SELECT PlanId,
NodeId = relop.value('@NodeId', 'int'),
PhysicalOp = relop.value('@PhysicalOp', 'varchar(200)'),
EstimatedTotalSubtreeCost = relop.value('@EstimatedTotalSubtreeCost', 'float'),
/*XPath ancestor axis not supported so just go up a few levels and look for the closest ancestor Relop*/
ParentNodeId = COALESCE(
relop.value('..[local-name() = "RelOp"]/@NodeId', 'int'),
relop.value('../..[local-name() = "RelOp"]/@NodeId', 'int'),
relop.value('../../..[local-name() = "RelOp"]/@NodeId', 'int'),
relop.value('../../../..[local-name() = "RelOp"]/@NodeId', 'int')
)
FROM plans
CROSS APPLY plan_xml.nodes('//RelOp') n(relop);
UPDATE N1
SET EstimatedOperatorCost = EstimatedTotalSubtreeCost - ISNULL((SELECT SUM(EstimatedTotalSubtreeCost) FROM @Nodes N2 WHERE N1.PlanId = N2.PlanId AND N2.ParentNodeId = N1.NodeId),0)
FROM @Nodes N1;
SELECT *,
EstPctOperatorCost = FORMAT(EstimatedOperatorCost/MAX(EstimatedTotalSubtreeCost) OVER (PARTITION BY PlanId), 'P0')
FROM @Nodes;
The execution plan is a tree - there are likely more elegant ways of getting the parent operator than my attempt!
The above is not battle tested across a sample size of more than two execution plans so you may well encounter issues with it that you will need to fix.
You can visit the URI http://schemas.microsoft.com/sqlserver/2004/07/showplan to see information about the various schemas. For some reason that I've never got to the bottom of, it displays "The request is blocked." for me unless I use incognito mode.
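If you want a real plan to experiment with instead of pasting the XML by hand, one option is the plan cache. The following is only a minimal sketch, assuming you have VIEW SERVER STATE permission; it picks an arbitrary cached plan and lists a few RelOp attributes. LogicalOp and EstimateRows are further showplan attributes that the query above does not use.

```sql
-- Sketch: load one arbitrary cached plan into an xml variable.
-- Assumes VIEW SERVER STATE permission on the instance.
DECLARE @X XML;

SELECT TOP (1) @X = qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE qp.query_plan IS NOT NULL;

-- Shred every RelOp element of that plan into rows.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT NodeId       = relop.value('@NodeId', 'int'),
       PhysicalOp   = relop.value('@PhysicalOp', 'varchar(200)'),
       LogicalOp    = relop.value('@LogicalOp', 'varchar(200)'),
       EstimateRows = relop.value('@EstimateRows', 'float')
FROM @X.nodes('//RelOp') AS n(relop);
```

Which plan TOP (1) returns is not deterministic, so for anything beyond experimenting you would filter on the statement text or plan handle you are interested in.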
Related
I've got 34 rows in a database, each row has a column containing xml - the xml is actually in an NVARCHAR(MAX) column not an XML column.
For each row I am selecting values in the xml elements as a single resultset. The performance is pretty poor. I've tried two different queries. The first takes roughly 22 seconds to execute and the second takes 7.
Even at 7 seconds, this is far slower than optimal, I'm hoping for 1-2 seconds at most.
So then I read a rumor online that if you first convert the NVARCHAR data to XML using a temp table or table variable, you will achieve a performance gain, which at least in my case was true... It now executes in under a second. What I'm looking for now is an explanation of why these two approaches affect performance.
22 seconds:
SELECT
c.ID,
c.ChannelName,
[Name] = d.c.value('name[1]','varchar(100)'),
[Type] = d.c.value('transportName[1]','varchar(100)'),
[Enabled] = d.c.value('enabled[1]','BIT'),
[Queued] = d.c.value('properties[1]/destinationConnectorProperties[1]/queueEnabled[1]','varchar(100)'),
[RetryInterval] = d.c.value('properties[1]/destinationConnectorProperties[1]/retryIntervalMillis[1]','INT'),
[MaxRetries] = d.c.value('properties[1]/destinationConnectorProperties[1]/retryCount[1]','INT'),
[RotateQueue] = d.c.value('properties[1]/destinationConnectorProperties[1]/rotate[1]','BIT'),
[ThreadCount] = d.c.value('properties[1]/destinationConnectorProperties[1]/threadCount[1]','INT'),
[WaitForPrevious] = d.c.value('waitForPrevious[1]','BIT'),
[Destination] = COALESCE(
d.c.value('properties[1]/channelId[1]','varchar(100)'),
d.c.value('properties[1]/remoteAddress[1]','varchar(100)'),
d.c.value('properties[1]/wsdlUrl[1]','varchar(1024)')),
[DestinationPort] = COALESCE(
d.c.value('properties[1]/remotePort[1]','varchar(100)'),
d.c.value('properties[1]/port[1]','varchar(1024)')),
[Service] = d.c.value('properties[1]/service[1]','varchar(1024)'),
[Operation] = d.c.value('properties[1]/operation[1]','varchar(1024)')
FROM
(
SELECT
[ID],
[ChannelName] = [Name],
[CFG] = Convert(XML, Channel)
FROM
dbo.CHANNEL
) c
CROSS APPLY c.CFG.nodes('/channel/destinationConnectors/connector') d(c)
7 seconds, due to use of text(). I have no idea why text speeds things up.
SELECT
c.ID,
c.ChannelName,
[Name] = d.c.value('(name/text())[1]','varchar(100)'),
[Type] = d.c.value('(transportName/text())[1]','varchar(100)'),
[Enabled] = d.c.value('(enabled/text())[1]','BIT'),
[Queued] = d.c.value('(properties/destinationConnectorProperties/queueEnabled/text())[1]','varchar(100)'),
[RetryInterval] = d.c.value('(properties/destinationConnectorProperties/retryIntervalMillis/text())[1]','INT'),
[MaxRetries] = d.c.value('(properties/destinationConnectorProperties/retryCount/text())[1]','INT'),
[RotateQueue] = d.c.value('(properties/destinationConnectorProperties/rotate/text())[1]','BIT'),
[ThreadCount] = d.c.value('(properties/destinationConnectorProperties/threadCount/text())[1]','INT'),
[WaitForPrevious] = d.c.value('(waitForPrevious/text())[1]','BIT'),
[Destination] = COALESCE(
d.c.value('(properties/channelId/text())[1]','varchar(100)'),
d.c.value('(properties/remoteAddress/text())[1]','varchar(100)'),
d.c.value('(properties/wsdlUrl/text())[1]','varchar(1024)')),
[DestinationPort] = COALESCE(
d.c.value('(properties/remotePort/text())[1]','varchar(100)'),
d.c.value('(properties/port/text())[1]','varchar(1024)')),
[Service] = d.c.value('(properties/service/text())[1]','varchar(1024)'),
[Operation] = d.c.value('(properties/operation/text())[1]','varchar(1024)')
FROM
(
SELECT
[ID],
[ChannelName] = [Name],
[CFG] = Convert(XML, Channel)
FROM
dbo.CHANNEL
) c
CROSS APPLY c.CFG.nodes('/channel/destinationConnectors/connector') d(c)
This query uses the text() approach but first converts the NVARCHAR column to an XML column in a table variable. It executes in less than a second...
DECLARE @Xml AS TABLE (
[ID] NVARCHAR(36) NOT NULL Primary Key,
[Name] NVARCHAR(100) NOT NULL,
[CFG] XML NOT NULL
);
INSERT INTO @Xml (ID, Name, CFG)
SELECT
c.ID,
c.Name,
Convert(XML, c.Channel)
FROM
[dbo].[CHANNEL] c;
SELECT
c.ID,
c.ChannelName,
[Name] = d.c.value('(name/text())[1]','varchar(100)'),
[Type] = d.c.value('(transportName/text())[1]','varchar(100)'),
[Enabled] = d.c.value('(enabled/text())[1]','BIT'),
[Queued] = d.c.value('(properties/destinationConnectorProperties/queueEnabled/text())[1]','varchar(100)'),
[RetryInterval] = d.c.value('(properties/destinationConnectorProperties/retryIntervalMillis/text())[1]','INT'),
[MaxRetries] = d.c.value('(properties/destinationConnectorProperties/retryCount/text())[1]','INT'),
[RotateQueue] = d.c.value('(properties/destinationConnectorProperties/rotate/text())[1]','BIT'),
[ThreadCount] = d.c.value('(properties/destinationConnectorProperties/threadCount/text())[1]','INT'),
[WaitForPrevious] = d.c.value('(waitForPrevious/text())[1]','BIT'),
[Destination] = COALESCE(
d.c.value('(properties/channelId/text())[1]','varchar(100)'),
d.c.value('(properties/remoteAddress/text())[1]','varchar(100)'),
d.c.value('(properties/wsdlUrl/text())[1]','varchar(1024)')),
[DestinationPort] = COALESCE(
d.c.value('(properties/remotePort/text())[1]','varchar(100)'),
d.c.value('(properties/port/text())[1]','varchar(1024)')),
[Service] = d.c.value('(properties/service/text())[1]','varchar(1024)'),
[Operation] = d.c.value('(properties/operation/text())[1]','varchar(1024)')
FROM
(
SELECT
[ID],
[ChannelName] = [Name],
[CFG]
FROM
@Xml
) c
CROSS APPLY c.CFG.nodes('/channel/destinationConnectors/connector') d(c)
I can give you one answer and one guess:
First I use a declared table variable to mock up your scenario:
DECLARE @tbl TABLE(s NVARCHAR(MAX));
INSERT INTO @tbl VALUES
(N'<root>
<SomeElement>This is first text of element1
<InnerElement>This is text of inner element1</InnerElement>
This is second text of element1
</SomeElement>
<SomeElement>This is first text of element2
<InnerElement>This is text of inner element2</InnerElement>
This is second text of element2
</SomeElement>
</root>')
,(N'<root>
<SomeElement>This is first text of elementA
<InnerElement>This is text of inner elementA</InnerElement>
This is second text of elementA
</SomeElement>
<SomeElement>This is first text of elementB
<InnerElement>This is text of inner elementB</InnerElement>
This is second text of elementB
</SomeElement>
</root>');
--This query will read the XML with a cast out of a sub-select. You might use a CTE instead, but this should be syntactical sugar only...
SELECT se.value(N'(.)[1]','nvarchar(max)') SomeElementsContent
,se.value(N'(InnerElement)[1]','nvarchar(max)') InnerElementsContent
,se.value(N'(./text())[1]','nvarchar(max)') ElementsFirstText
,se.value(N'(./text())[2]','nvarchar(max)') ElementsSecondText
FROM (SELECT CAST(s AS XML) FROM @tbl) AS tbl(TheXml)
CROSS APPLY TheXml.nodes(N'/root/SomeElement') AS A(se);
--The second part uses a table to write in the typed XML and read from there:
DECLARE @tbl2 TABLE(x XML)
INSERT INTO @tbl2
SELECT CAST(s AS XML) FROM @tbl;
SELECT se.value(N'(.)[1]','nvarchar(max)') SomeElementsContent
,se.value(N'(InnerElement)[1]','nvarchar(max)') InnerElementsContent
,se.value(N'(./text())[1]','nvarchar(max)') ElementsFirstText
,se.value(N'(./text())[2]','nvarchar(max)') ElementsSecondText
FROM @tbl2 t2
CROSS APPLY t2.x.nodes(N'/root/SomeElement') AS A(se);
Why is /text() faster than without /text()?
If you look at my example, the content of an element is everything from the opening tag down to the closing tag. The text() of an element is the floating text between these tags. You can see this in the results of the select above. The text() is actually one separately stored portion in a tree structure (see the next section). Fetching it is a one-step action. Otherwise a complex structure has to be analysed to find everything between the opening tag and its corresponding closing tag, even if there is nothing else than the text().
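The difference can be seen in isolation on a tiny document. This is just an illustrative sketch; the variable name and the element content are made up for the example.

```sql
-- Made-up one-element document with mixed content.
DECLARE @x XML = N'<a>first<b>inner</b>second</a>';

SELECT WholeValue = @x.value('(/a)[1]', 'nvarchar(100)'),         -- firstinnersecond
       FirstText  = @x.value('(/a/text())[1]', 'nvarchar(100)'),  -- first
       SecondText = @x.value('(/a/text())[2]', 'nvarchar(100)');  -- second
```

Reading the element itself has to assemble the text of all nested nodes, whereas text() addresses a single text node directly.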
Why should I store XML in the appropriate type?
XML is not just text with some silly extra characters! It is a document with a complex structure. The XML is not stored as the text you see. XML is stored in a tree structure. Whenever you cast a string, which represents an XML, into a real XML, this very expensive work must be done. When the XML is presented to you (or any other output) the representing string is (re)built from scratch.
Why is the pre-casted approach faster?
This is guessing...
In my example both approaches are quite equal and lead to (almost) the same execution plan.
SQL Server will not necessarily process everything in the order you might expect. This is not a procedural system where you state "do this, then do this, and after that do this". You tell the engine what you want, and the engine decides how best to do it. And the engine is pretty good at this!
Before execution starts, the engine tries to estimate the costs of approaches. CONVERT (or CAST) is a rather cheap operation. It could be, that the engine decides to work down the list of your calls and do the cast for each single need over and over, because it thinks, that this is cheaper than the expensive creation of a derived table...
I have the following xml which is being parsed via a MSSQL database using OPENXML with an xquery filter to grab the right rows. Unfortunately, it doesn't seem to grab the appropriate rows, which has me scratching my head.
Using the following XML, I only want to insert the single email address where the Method="Insert", and ignore the remaining two addresses where Method is not present or has another value (which were previously inserted).
<Entities xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" ActiveEntityID="0">
<Entity_Businesses>
<Entity_Business EntityTypeID="5" EntityRoleTypeID="9" Method="Update" Name="test business 76" EIN="" EmployeeCount="75" TotalAssets="750000.00">
<Entity_Emails>
<Entity_Email ID="85" EmailAddress="jones@company.com" />
<Entity_Email ID="0" EmailAddress="smith@company.com" Method="Insert"/>
</Entity_Emails>
<Entity_Contacts>
<Entity_Contact ID="162" EntityTypeID="4" EntityRoleTypeID="9" FName="Joe" MName="k" LName="Smith" SSN="444-44-444" JobTitleID="0" DOB="2007-02-27T00:00:00">
<Entity_Emails>
<Entity_Email ID="86" EmailAddress="individual@test.com"/>
</Entity_Emails>
</Entity_Contact>
</Entity_Contacts>
</Entity_Business>
</Entity_Businesses>
</Entities>
I am using this sql statement:
INSERT into Entity_Email(bsCol, EmailAddress, xmlID, xmlPID)
SELECT DENSE_RANK() OVER( ORDER BY y.parentid ) AS elementid, z.EmailAddress, y.parentid, z.ID
FROM OPENXML( @hDoc, '//Entity_Emails', 1 )
WITH (parentid int '@mp:parentid', id int '@mp:id' ) y
INNER JOIN OPENXML(@hDoc, N'//Entity_Emails/Entity_Email',1) WITH (EmailAddress nvarchar(100), xmlID int '@mp:id', parentid int '@mp:parentid') as z
ON y.id = z.parentid
WHERE @pRI.value('(//Entity_Emails/Entity_Email/@Method)[1]','nvarchar(50)') = 'Insert';
As-is, all three email addresses will be inserted, even though the first and last email nodes do not have a "Method" attribute. However, if I add Method="DontAdd" to the other two email addresses, nothing gets inserted.
I have also tried using the predicate:
WHERE @pRI.exist('//Entity_Emails/Entity_Email[@Method="Insert"]') = 1;
The result is similar: it inserts all rows and seems to ignore the fact that two of the Entity_Email elements do not have an attribute Method="Insert", regardless of whether the Method attribute exists.
The goal is to filter the xml as it is shredded and only add the email address with the attribute Method="Insert". Right now what I believe I have is actually "If you find Method = 'Insert' in the dataset, insert all rows" vs. "if you find method = 'insert', insert only those rows which have that attribute."
Thank you in advance.
Please note the following answer for those that might be helped in the future. After retrieving the column 'Method' in the z aliased query, I was able to use standard t-sql to filter the results correctly and then insert the correct rows.
INSERT into Entity_Email(bsCol, EmailAddress, xmlID, xmlPID)
SELECT DENSE_RANK() OVER( ORDER BY y.parentid ) AS elementid, z.EmailAddress, z.xmlID, y.parentid
FROM OPENXML( @hDoc, '//Entity_Emails', 1 )
WITH (parentid int '@mp:parentid', id int '@mp:id' ) y
INNER JOIN OPENXML(@hDoc, N'//Entity_Emails/Entity_Email',1) WITH (EmailAddress nvarchar(100), xmlID int '@mp:id', parentid int '@mp:parentid', Method nvarchar(50) '@Method') as z
ON y.id = z.parentid
WHERE z.Method = 'Insert'
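If the document is available as an xml variable (as the @pRI in the question appears to be), the filter can also be applied while shredding, using nodes() with an XPath predicate instead of OPENXML. This is a minimal sketch on a cut-down copy of the sample document, not a drop-in replacement for the statement above:

```sql
-- Cut-down sample: only the second email carries Method="Insert".
DECLARE @pRI XML = N'<Entities>
  <Entity_Emails>
    <Entity_Email ID="85" EmailAddress="jones@company.com" />
    <Entity_Email ID="0" EmailAddress="smith@company.com" Method="Insert" />
  </Entity_Emails>
</Entities>';

-- The predicate keeps only elements whose Method attribute equals "Insert";
-- elements without the attribute are skipped.
SELECT EmailAddress = e.value('@EmailAddress', 'nvarchar(100)'),
       ID           = e.value('@ID', 'int')
FROM @pRI.nodes('//Entity_Emails/Entity_Email[@Method="Insert"]') AS t(e);
```

The predicate is evaluated per Entity_Email element, which is the row-level behaviour the question was after, rather than a document-wide check in the WHERE clause.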
In the T-SQL sample code below I am trying to query related pieces of data that are in different nodes in the xml, but I can't figure out how to do that. For example, the LX01_AssignedNumber and the C00302_ProcedureCode values need to be pulled together for the same record. The result should look like this.
CLAIM_SOURCE_ID ITEM_NUMBER HCPCS_LINE_CODE
16202E123456 1 99203
16202E123456 2 96372
Can someone help me?
USE [tempdb];
GO
DECLARE @XML XML =
N'<ns1:X12EnrichedMessage xmlns:ns1="http://schemas.microsoft.com/BizTalk/EDI/EDIFACT/2006/EnrichedMessageXML">
<TransactionSet>
<!-- ProcessLogID=PLG0007182226 ;ProcessLogDetailID=PLG0007182968 ;EnvID=1;RetryCount=1 -->
<ns0:X12_00501_837_P xmlns:ns0="http://schemas.microsoft.com/BizTalk/EDI/X12/2006">
<ns0:TS837_2000A_Loop xmlns:ns0="http://schemas.microsoft.com/BizTalk/EDI/X12/2006">
<ns0:TS837_2000B_Loop xmlns:ns0="http://schemas.microsoft.com/BizTalk/EDI/X12/2006">
<ns0:TS837_2300_Loop xmlns:ns0="http://schemas.microsoft.com/BizTalk/EDI/X12/2006">
<ns0:TS837_2400_Loop>
<ns0:LX_ServiceLineNumber>
<LX01_AssignedNumber>1</LX01_AssignedNumber>
</ns0:LX_ServiceLineNumber>
<ns0:SV1_ProfessionalService>
<ns0:C003_CompositeMedicalProcedureIdentifier>
<C00301_ProductorServiceIDQualifier>HC</C00301_ProductorServiceIDQualifier>
<C00302_ProcedureCode>99203</C00302_ProcedureCode>
<C00303_ProcedureModifier>25</C00303_ProcedureModifier>
<C00307_Description>NO DESCRIPTION</C00307_Description>
</ns0:C003_CompositeMedicalProcedureIdentifier>
<SV102_LineItemChargeAmount>167.82</SV102_LineItemChargeAmount>
<SV103_UnitorBasisforMeasurementCode>UN</SV103_UnitorBasisforMeasurementCode>
<SV104_ServiceUnitCount>1</SV104_ServiceUnitCount>
<ns0:C004_CompositeDiagnosisCodePointer>
<C00401_DiagnosisCodePointer>1</C00401_DiagnosisCodePointer>
</ns0:C004_CompositeDiagnosisCodePointer>
</ns0:SV1_ProfessionalService>
<ns0:DTP_SubLoop_2>
<ns0:DTP_Date_ServiceDate>
<DTP01_DateTimeQualifier>472</DTP01_DateTimeQualifier>
<DTP02_DateTimePeriodFormatQualifier>RD8</DTP02_DateTimePeriodFormatQualifier>
<DTP03_ServiceDate>20160627-20160627</DTP03_ServiceDate>
</ns0:DTP_Date_ServiceDate>
</ns0:DTP_SubLoop_2>
</ns0:TS837_2400_Loop>
<ns0:TS837_2400_Loop>
<ns0:LX_ServiceLineNumber>
<LX01_AssignedNumber>2</LX01_AssignedNumber>
</ns0:LX_ServiceLineNumber>
<ns0:SV1_ProfessionalService>
<ns0:C003_CompositeMedicalProcedureIdentifier>
<C00301_ProductorServiceIDQualifier>HC</C00301_ProductorServiceIDQualifier>
<C00302_ProcedureCode>96372</C00302_ProcedureCode>
<C00307_Description>NO DESCRIPTION</C00307_Description>
</ns0:C003_CompositeMedicalProcedureIdentifier>
<SV102_LineItemChargeAmount>82.56</SV102_LineItemChargeAmount>
<SV103_UnitorBasisforMeasurementCode>UN</SV103_UnitorBasisforMeasurementCode>
<SV104_ServiceUnitCount>2</SV104_ServiceUnitCount>
<ns0:C004_CompositeDiagnosisCodePointer>
<C00401_DiagnosisCodePointer>2</C00401_DiagnosisCodePointer>
</ns0:C004_CompositeDiagnosisCodePointer>
</ns0:SV1_ProfessionalService>
<ns0:DTP_SubLoop_2>
<ns0:DTP_Date_ServiceDate>
<DTP01_DateTimeQualifier>472</DTP01_DateTimeQualifier>
<DTP02_DateTimePeriodFormatQualifier>RD8</DTP02_DateTimePeriodFormatQualifier>
<DTP03_ServiceDate>20160627-20160627</DTP03_ServiceDate>
</ns0:DTP_Date_ServiceDate>
</ns0:DTP_SubLoop_2>
</ns0:TS837_2400_Loop>
</ns0:TS837_2300_Loop>
</ns0:TS837_2000B_Loop>
</ns0:TS837_2000A_Loop>
</ns0:X12_00501_837_P>
</TransactionSet>
</ns1:X12EnrichedMessage>'
IF OBJECT_ID(N'tempdb..#CLAIM_XML', N'U') IS NOT NULL
DROP TABLE #CLAIM_XML;
CREATE TABLE #CLAIM_XML (
CLAIM_SOURCE_ID VARCHAR(20) NOT NULL
,RAW_XML XML NOT NULL
,CLAIM_FORM_TYPE CHAR(1) NOT NULL
,CREATED_DATE DATE NOT NULL
,CONSTRAINT CLAIM_XML_PK PRIMARY KEY (CLAIM_SOURCE_ID)
);
CREATE PRIMARY XML INDEX CLAIM_XML_RAW_XML_IDX
ON #CLAIM_XML (RAW_XML);
INSERT INTO #CLAIM_XML
([CLAIM_SOURCE_ID]
,[RAW_XML]
,[CLAIM_FORM_TYPE]
,[CREATED_DATE])
VALUES('16202E123456'
,@XML
,'H'
,CONVERT(DATE, DATEADD(DAY, -1, GETDATE())));
WITH XMLNAMESPACES ('http://schemas.microsoft.com/BizTalk/EDI/EDIFACT/2006/EnrichedMessageXML' AS ns1
,'http://schemas.microsoft.com/BizTalk/EDI/X12/2006' AS ns0)
SELECT [CX].[CLAIM_SOURCE_ID]
,[ITEM_NUMBER] = LineNumber.ref.value('text()[1]', 'int')
,[HCPCS_LINE_CODE] = [CX].[RAW_XML].value('(/ns1:X12EnrichedMessage/TransactionSet/ns0:X12_00501_837_P/ns0:TS837_2000A_Loop/ns0:TS837_2000B_Loop/ns0:TS837_2300_Loop/ns0:TS837_2400_Loop/ns0:SV1_ProfessionalService/ns0:C003_CompositeMedicalProcedureIdentifier/C00302_ProcedureCode)[1]','varchar(100)')
FROM #CLAIM_XML AS [CX]
CROSS APPLY [CX].[RAW_XML].nodes('/ns1:X12EnrichedMessage/TransactionSet/ns0:X12_00501_837_P/ns0:TS837_2000A_Loop/ns0:TS837_2000B_Loop/ns0:TS837_2300_Loop/ns0:TS837_2400_Loop/ns0:LX_ServiceLineNumber/*') LineNumber(ref)
WHERE [CX].[CLAIM_FORM_TYPE] = 'H'
AND [CX].[CREATED_DATE] = CONVERT(DATE, DATEADD(DAY, -1, GETDATE()));
Use multiple CROSS APPLYs to access the different parts of the XML, like this:
;WITH XMLNAMESPACES ('http://schemas.microsoft.com/BizTalk/EDI/EDIFACT/2006/EnrichedMessageXML' AS ns1
,'http://schemas.microsoft.com/BizTalk/EDI/X12/2006' AS ns0)
SELECT
c.[CLAIM_SOURCE_ID],
sln.c.value('(LX01_AssignedNumber/text())[1]', 'INT') AS [ITEM_NUMBER],
ps.c.value('(ns0:C003_CompositeMedicalProcedureIdentifier/C00302_ProcedureCode/text())[1]', 'INT') AS [HCPCS_LINE_CODE]
FROM #CLAIM_XML c
CROSS APPLY c.[RAW_XML].nodes('/ns1:X12EnrichedMessage/TransactionSet/ns0:X12_00501_837_P/ns0:TS837_2000A_Loop/ns0:TS837_2000B_Loop/ns0:TS837_2300_Loop/ns0:TS837_2400_Loop') l(c)
CROSS APPLY l.c.nodes('ns0:LX_ServiceLineNumber') sln(c)
CROSS APPLY l.c.nodes('ns0:SV1_ProfessionalService') ps(c)
No need for multiple CROSS APPLY with .nodes(). As each value you want to read occurs only once within its subtree, you can address it directly:
;WITH XMLNAMESPACES ('http://schemas.microsoft.com/BizTalk/EDI/EDIFACT/2006/EnrichedMessageXML' AS ns1
,'http://schemas.microsoft.com/BizTalk/EDI/X12/2006' AS ns0)
SELECT
loop2400.value('(ns0:LX_ServiceLineNumber/LX01_AssignedNumber)[1]', 'INT') AS [ITEM_NUMBER],
loop2400.value('(ns0:SV1_ProfessionalService/ns0:C003_CompositeMedicalProcedureIdentifier/C00302_ProcedureCode)[1]', 'INT') AS [HCPCS_LINE_CODE]
FROM @xml.nodes('/ns1:X12EnrichedMessage/TransactionSet/ns0:X12_00501_837_P/ns0:TS837_2000A_Loop/ns0:TS837_2000B_Loop/ns0:TS837_2300_Loop/ns0:TS837_2400_Loop') A(loop2400)
The lazy approach works too, but in general it's good advice to be as specific as possible...
SELECT
loop2400.value('(*//LX01_AssignedNumber)[1]', 'INT') AS [ITEM_NUMBER],
loop2400.value('(*//C00302_ProcedureCode)[1]', 'INT') AS [HCPCS_LINE_CODE]
FROM @xml.nodes('//*:TS837_2400_Loop') A(loop2400)
On SQL Server 2008 R2, I am trying to read XML value as table.
So far, I am here :
DECLARE @XMLValue AS XML;
SET @XMLValue = '<SearchQuery>
<ResortID>1453</ResortID>
<CheckInDate>2011-10-27</CheckInDate>
<CheckOutDate>2011-11-04</CheckOutDate>
<Room>
<NumberOfADT>2</NumberOfADT>
<CHD>
<Age>10</Age>
</CHD>
<CHD>
<Age>12</Age>
</CHD>
</Room>
<Room>
<NumberOfADT>1</NumberOfADT>
</Room>
<Room>
<NumberOfADT>1</NumberOfADT>
<CHD>
<Age>7</Age>
</CHD>
</Room>
</SearchQuery>';
SELECT
Room.value('(NumberOfADT)[1]', 'INT') AS NumberOfADT
FROM @XMLValue.nodes('/SearchQuery/Room') AS SearchQuery(Room);
As you can see, the Room node sometimes has CHD child nodes and sometimes doesn't.
Assume that I am getting this XML value as a stored procedure parameter, so I need to work with the values in order to query my database tables. What would be the best way to read this XML parameter in its entirety?
EDIT
I think I need to express what I am expecting in return here. The script below declares the table I need:
DECLARE @table AS TABLE(
ResortID INT,
CheckInDate DATE,
CheckOutDate DATE,
NumberOfADT INT,
CHDCount INT,
CHDAges NVARCHAR(100)
);
For the XML value I have provided above, the following INSERT statements are suitable:
INSERT INTO @table VALUES(1453, '2011-10-27', '2011-11-04', 2, 2, '10;12');
INSERT INTO @table VALUES(1453, '2011-10-27', '2011-11-04', 1, 0, NULL);
INSERT INTO @table VALUES(1453, '2011-10-27', '2011-11-04', 1, 1, '7');
CHDCount is the number of CHD nodes under a Room node. The table has one row per Room node.
As for how it should look, see the below picture :
Actually, this code is for hotel reservation search query. So, I need
to work with these values I got from XML parameter to query my tables
and return available rooms. I am telling this because maybe it helps
you guys to see it through. I am not looking for a complete code for
room reservation system. That would be so selfish.
select S.X.value('ResortID[1]', 'int') as ResortID,
S.X.value('CheckInDate[1]', 'date') as CheckInDate,
S.X.value('CheckOutDate[1]', 'date') as CheckOutDate,
R.X.value('NumberOfADT[1]', 'int') as NumberOfADT,
R.X.value('count(CHD)', 'int') as CHDCount,
stuff((select ';'+C.X.value('.', 'varchar(3)')
from R.X.nodes('CHD/Age') as C(X)
for xml path('')), 1, 1, '') as CHDAges
from @XMLValue.nodes('/SearchQuery') as S(X)
cross apply S.X.nodes('Room') as R(X)
This should get you close:
SELECT ResortID = @xmlvalue.value('(//ResortID)[1]', 'int')
, CheckInDate = @xmlvalue.value('(//CheckInDate)[1]', 'date')
, CheckOutDate = @xmlvalue.value('(//CheckOutDate)[1]', 'date')
, NumberOfAdt = Room.value('(NumberOfADT)[1]', 'INT')
, CHDCount = Room.value('count(./CHD)', 'int')
, CHDAges = Room.query('for $c in ./CHD
return concat(($c/Age)[1], ";")').value('(.)[1]',
'varchar(100)')
FROM @XMLValue.nodes('/SearchQuery/Room') AS SearchQuery ( Room ) ;
This is my table
BasketId(int) BasketName(varchar) BasketFruits(xml)
1 Gold <FRUITS><FID>1</FID><FID>2</FID><FID>3</FID><FID>4</FID><FID>5</FID><FID>6</FID></FRUITS>
2 Silver <FRUITS><FID>1</FID><FID>2</FID><FID>3</FID><FID>4</FID></FRUITS>
3 Bronze <FRUITS><FID>3</FID><FID>4</FID><FID>5</FID></FRUITS>
I need to search for the baskets that have FID values 1 and 3,
so that in this case I would get Gold and Silver.
So far I can search for a SINGLE FID value like 1
using this code:
declare @fruitId varchar(10);
set @fruitId=1;
select * from Baskets
WHERE BasketFruits.exist('//FID/text()[contains(.,sql:variable("@fruitId"))]') = 1
Had it been plain T-SQL, I would have used the IN clause like this:
SELECT * FROM Baskets where FID in (1,3)
Any help/workaround appreciated...
The first option would be to add another exist() to the WHERE clause.
declare @fruitId1 int;
set @fruitId1=1;
declare @fruitId2 int;
set @fruitId2=3;
select *
from @Test
where
BasketFruits.exist('/FRUITS/FID[.=sql:variable("@fruitId1")]')=1 and
BasketFruits.exist('/FRUITS/FID[.=sql:variable("@fruitId2")]')=1
Another version would be to use both variables in the xquery statement, counting the hits.
select *
from @Test
where BasketFruits.value(
'count(distinct-values(/FRUITS/FID[.=(sql:variable("@fruitId1"),sql:variable("@fruitId2"))]))', 'int') = 2
The two queries above will work just fine if you know how many FID parameters you are going to use when you write the query. If you are in a situation where the number of FID's vary you could use something like this instead.
declare @FIDs xml = '<FID>1</FID><FID>3</FID>'
;with cteParam(FID) as
(
select T.N.value('.', 'int')
from @FIDs.nodes('FID') as T(N)
)
select T.BasketName
from @Test as T
cross apply T.BasketFruits.nodes('/FRUITS/FID') as F(FID)
inner join cteParam as p
on F.FID.value('.', 'int') = P.FID
group by T.BasketName
having count(T.BasketName) = (select count(*) from cteParam)
Build the @FIDs variable as an XML to hold the values you want to use in the query.
You can test the last query here: https://data.stackexchange.com/stackoverflow/q/101600/relational-division-with-xquery
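For a fixed set of values, yet another possibility is a single exist() whose predicate combines both comparisons. The sketch below is self-contained and uses a made-up two-row table variable; it relies on XQuery general comparison, where FID = 1 holds if any FID child equals 1.

```sql
-- Hypothetical two-row sample; only Gold contains both FID 1 and FID 3.
DECLARE @Test TABLE (BasketID INT, BasketName VARCHAR(20), BasketFruits XML);
INSERT INTO @Test VALUES
 (1, 'Gold',   '<FRUITS><FID>1</FID><FID>2</FID><FID>3</FID></FRUITS>'),
 (2, 'Bronze', '<FRUITS><FID>3</FID><FID>4</FID><FID>5</FID></FRUITS>');

-- Both conditions are tested against the same FRUITS element.
SELECT BasketName
FROM @Test
WHERE BasketFruits.exist('/FRUITS[FID = 1 and FID = 3]') = 1;
```

The literals could be swapped for sql:variable() references, as in the first query of this answer.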
It is a bit more involved than I hoped it would be - but this solution works.
Basically, I'm using a CTE (Common Table Expression) which breaks up the table and cross joins all values from the <FID> nodes to the basket names.
From that CTE, I select those baskets that contain both a value of 1 and 3.
DECLARE @Test TABLE (BasketID INT, BasketName VARCHAR(20), BasketFruits XML)
INSERT INTO @TEST
VALUES(1, 'Gold', '<FRUITS><FID>1</FID><FID>2</FID><FID>3</FID><FID>4</FID><FID>5</FID><FID>6</FID></FRUITS>'),
(2, 'Silver', '<FRUITS><FID>1</FID><FID>2</FID><FID>3</FID><FID>4</FID></FRUITS>'),
(3, 'Bronze', '<FRUITS><FID>3</FID><FID>4</FID><FID>5</FID></FRUITS>')
;WITH IDandFID AS
(
SELECT
t.BasketID,
t.BasketName,
FR.FID.value('(.)[1]', 'int') AS 'FID'
FROM @Test t
CROSS APPLY basketfruits.nodes('/FRUITS/FID') AS FR(FID)
)
SELECT DISTINCT
BasketName
FROM
IDandFID i1
WHERE
EXISTS(SELECT * FROM IDandFID i2 WHERE i1.BasketID = i2.BasketID AND i2.FID = 1)
AND EXISTS(SELECT * FROM IDandFID i3 WHERE i1.BasketID = i3.BasketID AND i3.FID = 3)
Running this query, I do get the expected output of:
BasketName
----------
Gold
Silver
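Is this too trivial? (LIKE does not accept the xml type directly, so the column is cast to a string first.)
SELECT * FROM Baskets WHERE CAST(BasketFruits AS NVARCHAR(MAX)) LIKE '%<FID>1</FID>%' AND CAST(BasketFruits AS NVARCHAR(MAX)) LIKE '%<FID>3</FID>%'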