I am trying to get a particular attribute value from an XML column, but I'm getting an error:
XML parsing: line 1, character 345, duplicate attribute
My code:
select
ship_to_cust_num,
tank_num,
tank_capacity_qty,
tank_pkg_type_code,
COALESCE(REPLACE(CAST(CAST(b.tank_inspection AS NTEXT) AS XML).value('(/TankInspection/Questions/Question[@AASAQno="9"]/@QAns)[1]', 'VARCHAR(50)'), '#', ''), 0)
from
bulk_site_tank (nolock)b
where
convert(varchar, b.tank_inspection) != 'NULL'
The simple answer is that the error is telling you the problem, but to explain further, take this simple statement:
DECLARE @xml varchar(MAX);
SET @xml = '
<root>
<child>
<element attribute="1">value</element>
<element attribute="2" attribute="2">Another Value</element>
</child>
</root>';
SELECT *
FROM (VALUES(CONVERT(xml, @xml)))V(X);
If you run that, you'll get the error:
Msg 9437, Level 16, State 1, Line 11 XML parsing: line 5, character 46, duplicate attribute
Unsurprising: if you look, the second element node has the attribute named attribute declared twice.
So, how do you fix this?
Firstly, this means that you're storing your XML data in a data type other than xml. XML should be stored using the xml data type (that's exactly what it's for), and only valid XML can be stored in it; had you used it, you wouldn't have been able to insert invalid XML into the row and wouldn't be in this position. As it stands, there's only one thing you can do: find all the "bad" rows:
SELECT tank_inspection
FROM bulk_site_tank
WHERE TRY_CONVERT(xml,tank_inspection) IS NULL
AND tank_inspection IS NOT NULL;
Then inspect every single row returned in the above dataset and fix the data. Make it valid XML. After that, fix your data type:
ALTER TABLE bulk_site_tank ALTER COLUMN tank_inspection xml;
Now that everything is valid XML, you can fix that query of yours:
SELECT ship_to_cust_num,
tank_num,
tank_capacity_qty,
tank_pkg_type_code,
REPLACE(b.tank_inspection.value('(/TankInspection/Questions/Question[@AASAQno="9"]/@QAns)[1]', 'varchar(50)'), '#', '') --AS ?
FROM bulk_site_tank b
WHERE b.tank_inspection IS NOT NULL;
Note that I changed to ANSI NULL syntax (IS NOT NULL) and got rid of the NOLOCK hint (as I assume you don't know what it actually does here). The CAST/CONVERT expressions are gone too, and I've removed the COALESCE: your value expression returns a varchar(50) and the COALESCE has a 0 for its second parameter, which would implicitly cast the value returned from the XML to an int and likely result in a conversion error.
I'm afraid it's up to you to clean up your data, though; no one else can help you here. This is just one reason why poor data type choices are a problem: if the correct data type had been used then, as I said before, the invalid XML could never have been inserted.
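The COALESCE point can be seen in isolation. The following is a minimal sketch, where @answer is a hypothetical variable standing in for the varchar(50) returned from the XML:

```sql
-- @answer is a hypothetical stand-in for the value returned by .value(..., 'varchar(50)')
DECLARE @answer varchar(50) = 'Yes';

-- int has a higher data type precedence than varchar, so COALESCE would try to
-- convert 'Yes' to int and the statement would fail with a conversion error:
-- SELECT COALESCE(@answer, 0);

-- Keeping both arguments as strings avoids the implicit conversion.
SELECT COALESCE(@answer, '0') AS answer;
```

If a default is really needed, using a string literal such as '0' keeps the expression a varchar.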
Good luck!
I have handled the & and " symbols, but I am stuck on the greater-than and less-than symbols.
I want to store data in a table in the correct form, for example:
Auto < Test
When I pass the '<' symbol in my XML an error occurs, and the same happens with '>'.
XML that contains < and >:
'<root>
<name>AutoTest!#$^&(<>|;%#~Org.</name>
</root>'
My Try
DECLARE @OrgNames NVARCHAR(MAX)= '<root>
<name>Auto < Test</name>
</root>'
DECLARE @columnStr_XML XML
SET @columnStr_XML = CAST('<root1>'+
REPLACE( REPLACE(@OrgNames, '&', 'ATampZ'), '"' , 'ATdouble') +
'</root1>' AS xml )
SELECT f.x.value('.', 'NVARCHAR(MAX)') AS name,
CAST(NULL AS BIGINT) AS org_id
into #temp_name
FROM @columnStr_XML.nodes('/root1/root/name') f(x)
WHERE f.x.value('.', 'NVARCHAR(max)')<>''
UPDATE #temp_name
SET name = REPLACE (REPLACE(name, 'ATampZ', '&'), 'ATdouble' , '"')
SELECT * FROM #temp_name
DROP TABLE #temp_name
It's unclear where your parsing error comes from. Does it occur when executing the SQL?
You should decide whether you store only valid XML in your database or whether storing invalid XML is also allowed.
To store only valid XML, you should escape in advance all characters that could break the XML, using something like htmlentities() in PHP.
https://www.php.net/manual/en/function.htmlentities.php
When retrieving the data from the parsed XML, you can use the opposite function to get the original data back, e.g. html_entity_decode().
If storing invalid XML should be allowed, you can set up your parser to end parsing in a defined state that signals the XML is not valid.
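On the SQL Server side, the same escape-then-decode idea can be sketched without hand-written replacements, assuming the raw text is available as a plain value (here the hypothetical variable @name) before it is wrapped in markup. A string that already mixes markup with unescaped text, as in the question, cannot be escaped reliably after the fact.

```sql
-- @name is a hypothetical variable holding the raw, unescaped text.
DECLARE @name nvarchar(100) = N'Auto < Test & "Co" > Org';

-- FOR XML escapes the reserved characters (&, <, >) in element content,
-- so the text never has to be spliced into an XML string by hand.
DECLARE @doc xml = (SELECT @name AS [name] FOR XML PATH(''), ROOT('root'), TYPE);

-- Shows the stored document with the entities in place.
SELECT @doc AS escaped_document;

-- .value() decodes the entities again and returns the original text.
SELECT @doc.value('(/root/name/text())[1]', 'nvarchar(100)') AS original_text;
```

With this approach no placeholder tokens such as ATampZ are needed, because the escaping and the decoding are both done by the XML engine.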
I have to replicate an XML file with SQL Server, and I am stumbling over the following structure inside the XML file, which I don't know how to replicate.
The structure looks like this at the moment for certain tags:
<ART_TAG1>
<UNLIMITED/>
</ART_TAG1>
<ART_TAG2>
<ART_TAG3>
<Data_Entry/>
</ART_TAG3>
</ART_TAG2>
I am wondering whether this is proper XML, given that the data inside (UNLIMITED and Data_Entry) is written as a self-closing XML tag. The XML validator https://www.w3schools.com/xml/xml_validator.asp is telling me this is correct. But now I am struggling with replicating that with Transact-SQL.
If I try to replicate that I can only come up with the following TSQL script, which obviously does not fully look like the original.
SELECT 'UNLIMITED' as 'ART_TAG1'
, 'Data_Entry' as 'ART_TAG2/ART_TAG3'
FOR XML PATH(''), ROOT('root')
<root>
<ART_TAG1>UNLIMITED</ART_TAG1>
<ART_TAG2>
<ART_TAG3>Data_Entry</ART_TAG3>
</ART_TAG2>
</root>
If I get this correctly, your question is:
How can I put my query to create those <SomeElement /> tags?
Look at this:
--This will create filled nodes
SELECT 'outer' AS [OuterNode/@attr]
,'inner' AS [OuterNode/InnerNode]
FOR XML PATH('row');
--The empty string is some kind of content
SELECT 'outer' AS [OuterNode/@attr]
,'' AS [OuterNode/InnerNode]
FOR XML PATH('row');
--the missing value (NULL) is omitted by default
SELECT 'outer' AS [OuterNode/@attr]
,NULL AS [OuterNode/InnerNode]
FOR XML PATH('row');
--Now check what happens here:
--First XML has an empty element, while the second uses the self-closing element
DECLARE @xml1 XML=
N'<row>
<OuterNode attr="outer">
<InnerNode></InnerNode>
</OuterNode>
</row>';
DECLARE @xml2 XML=
N'<row>
<OuterNode attr="outer">
<InnerNode/>
</OuterNode>
</row>';
SELECT @xml1,@xml2;
The result is the same for both...
Some background: Semantically the empty element <element></element> is exactly the same as the self-closing element <element />. It should not make any difference, whether you use the one or the other. If your consumer cannot deal with this, it is a problem in the reading part.
Yes, you can force any content into XML on string level, but - as the example shows above - this is just a (dangerous) hack.
XML within T-SQL returns - by default - a missing node as NULL and an empty element as empty (depending on the datatype, and beware of the difference between an element and its text() node).
In short: This is nothing you should have to think about...
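Applied to the structure from the question, a minimal sketch could use the empty string as the value and put the element name into the column alias. The element names are taken from the question; using TYPE to get an xml-typed result is an assumption about how the output will be consumed.

```sql
-- An empty string (not NULL) creates the element but gives it no content.
-- The alias path decides the nesting: ART_TAG1 contains UNLIMITED, and so on.
SELECT '' AS [ART_TAG1/UNLIMITED],
       '' AS [ART_TAG2/ART_TAG3/Data_Entry]
FOR XML PATH(''), ROOT('root'), TYPE;
```

Whether the empty elements are displayed as <UNLIMITED /> or as <UNLIMITED></UNLIMITED> depends on how the result is serialized; as explained above, both forms mean the same thing.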
Context: I'm scraping some XML form descriptions from a Web Services table in hopes of using that name to identify what the user has inputted as response. Since this description changes for each step (row) of the process and each product I want something that can evaluate dynamically.
What I tried: The following was quite useful, but it returns each dynamic attribute query result in its own field, and using a COALESCE to reduce the results to one field would lead to its own complications: Get values from XML tags with dynamically specified data fields
Current Attempt:
I'm using the following code to generate the attribute name that I will use in the next step to query the attribute's value:
case when left([Return], 5) = '<?xml'
then lower(cast([Return] as xml).value('(/response/form/*/@name)[1]','varchar(30)'))
else ''
end as [FormRequest]
And as part of step 2 I have used the STUFF function to try and make the row-level query possible
case when len(FormRequest)>0
then stuff( ',' + 'cast([tmpFormResponse] as xml).value(''(/wrapper/@' + [FormRequest] + ')[1]'',''varchar(max)'')', 1, 1, '')
else ''
end as [FormResponse]
Instead of seeing 1 returned as my FormResponse field value for the submit attribute, it's returning the query text -- cast([tmpFormResponse] as xml).value('(/wrapper/@submit)[1]','varchar(max)') -- (that which should be queried).
How should I action the value method so that I can dynamically strip out the response per row of XML data in tmpFormResponse based on the field value in the FormRequest field?
Thanx
You can check this out:
DECLARE @xml XML=
N'<root>
<SomeAttributes a="a" b="b" c="c"/>
<SomeAttributes a="aa" b="bb" c="cc"/>
</root>';
DECLARE @localName NVARCHAR(100)='b';
SELECT sa.value(N'(./@*[local-name()=sql:variable("@localName")])[1]','nvarchar(max)')
FROM @xml.nodes(N'/root/SomeAttributes') AS A(sa)
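For the per-row case in the question, the same idea can be sketched with sql:column() instead of sql:variable(). The table variable @forms and its sample rows are hypothetical; the column names follow the question.

```sql
-- Hypothetical sample rows; only the column names come from the question.
DECLARE @forms TABLE (FormRequest varchar(30), tmpFormResponse xml);
INSERT INTO @forms (FormRequest, tmpFormResponse)
VALUES ('submit', N'<wrapper submit="1" cancel="0" />'),
       ('cancel', N'<wrapper submit="1" cancel="0" />');

-- sql:column() feeds the attribute name of the current row into the XQuery,
-- so each row reads the attribute named in its own FormRequest value.
SELECT f.FormRequest,
       f.tmpFormResponse.value(
           '(/wrapper/@*[local-name() = sql:column("f.FormRequest")])[1]',
           'varchar(max)') AS FormResponse
FROM @forms AS f;
```

sql:column() needs a real column that is in scope, so a computed value such as FormRequest would first have to be produced in a derived table or CTE, and a string column would have to be cast to xml before .value() is called.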
Ended up hacking up a solution to the problem by using PATINDEX and CHARINDEX to look for the value of the [FormRequest] field in the tmpFormResponse field.
The following code tries to get the attribute a of the first node y in SQL Server.
declare @x xml = '<x><y a="1" /><y a="2" /></x>'
select @x.query('/x/y[1]/@a')
select @x.query('(/x/y/@a)[1]')
However, it got the error of
Msg 6307, Level 16, State 1, Line 5
XML well-formedness check: Attribute cannot appear outside of element declaration. Rewrite your XQuery so it returns well-formed XML.
If I understand correctly, you want the attribute's value rather than the attribute node itself.
Example
declare @x xml = '<x><y a="1" /><y a="2" /></x>'
select @x.value('x[1]/y[1]/@a','varchar(max)')
Returns
1
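This works, because data() returns the attribute's atomic value instead of the attribute node, which query() cannot serialize on its own:
select @x.query('data(/x/y[1]/@a)')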
I have a table with an ntext type column that holds xml. I have tried to apply many examples of how to pull the value for the company's name from the xml for a particular node, but continue to get a syntax error. Below is what I've done, except substituted my select statement for the actual xml output
DECLARE @companyxml xml
SET @companyxml =
'<Home>
<slideshowImage1>1105</slideshowImage1>
<slideshowImage2>1106</slideshowImage2>
<slideshowImage3>1107</slideshowImage3>
<slideshowImage4>1108</slideshowImage4>
<slideshowImage5>1109</slideshowImage5>
<bottomNavImg1>1155</bottomNavImg1>
<bottomNavImg2>1156</bottomNavImg2>
<bottomNavImg3>1157</bottomNavImg3>
<pageTitle>Acme Capital Management |Homepage</pageTitle>
<metaKeywords><![CDATA[]]></metaKeywords>
<metaDescription><![CDATA[]]></metaDescription>
<companyName>Acme Capital Management</companyName>
<logoImg>1110</logoImg>
<pageHeader></pageHeader>
</Home>'
SELECT c.value ('companyName','varchar(1000)') AS companyname
FROM @companyxml.nodes('/Home') AS c
For some reason, the select c.value statement has a syntax problem that I can't figure out. On hover in SSMS, it says 'cannot find either column "c" or the user-defined function or aggregate "c.value", or the name is ambiguous.'
Any help on the syntax would be greatly appreciated.
Try this:
DECLARE @companyxml xml
SET @companyxml =
'<Home>
<slideshowImage1>1105</slideshowImage1>
<slideshowImage2>1106</slideshowImage2>
<slideshowImage3>1107</slideshowImage3>
<slideshowImage4>1108</slideshowImage4>
<slideshowImage5>1109</slideshowImage5>
<bottomNavImg1>1155</bottomNavImg1>
<bottomNavImg2>1156</bottomNavImg2>
<bottomNavImg3>1157</bottomNavImg3>
<pageTitle>Acme Capital Management Homepage</pageTitle>
<metaKeywords>CDATA</metaKeywords>
<metaDescription>CDATA</metaDescription>
<companyName>Acme Capital Management</companyName>
<logoImg>1110</logoImg>
<pageHeader></pageHeader>
</Home>'
DECLARE @result AS varchar(50)
SET @result = @companyxml.value('(/Home/companyName/text())[1]','varchar(50)')
SELECT @result
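If the nodes() form from the question is preferred, for example because several rows are expected, a minimal sketch of the corrected syntax follows. The shortened XML and the alias names h and c are only illustrative.

```sql
-- Shortened, illustrative version of the document from the question.
DECLARE @companyxml xml =
    N'<Home><companyName>Acme Capital Management</companyName></Home>';

-- nodes() must be aliased as table(column); value() is then called on the
-- column alias, and its path needs [1] so that it returns a single item.
SELECT h.c.value('(companyName/text())[1]', 'varchar(1000)') AS companyname
FROM @companyxml.nodes('/Home') AS h(c);
```

The error in the question comes from aliasing nodes() with a table name only (AS c) and from the path 'companyName' lacking the [1] singleton.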