I have handled the & and " symbols, but I am stuck on the greater-than and less-than symbols.
I want to store the data in a table in its correct form, for example:
Auto < Test
When I pass a '<' symbol in my XML, an error occurs; the same happens with '>'.
XML that contains < and >:
'<root>
<name>AutoTest!#$^&(<>|;%#~Org.</name>
</root>'
My attempt:
DECLARE @OrgNames NVARCHAR(MAX) = '<root>
<name>Auto < Test</name>
</root>'
DECLARE @columnStr_XML XML
SET @columnStr_XML = CAST('<root1>' +
    REPLACE(REPLACE(@OrgNames, '&', 'ATampZ'), '"', 'ATdouble') +
    '</root1>' AS xml)
SELECT f.x.value('.', 'NVARCHAR(MAX)') AS name,
       CAST(NULL AS BIGINT) AS org_id
INTO #temp_name
FROM @columnStr_XML.nodes('/root1/root/name') f(x)
WHERE f.x.value('.', 'NVARCHAR(max)') <> ''
UPDATE #temp_name
SET name = REPLACE(REPLACE(name, 'ATampZ', '&'), 'ATdouble', '"')
SELECT * FROM #temp_name
DROP TABLE #temp_name
It's unclear where your parsing error comes from. Does it occur when executing the SQL?
You should decide whether you store only valid XML in your database or whether storing invalid XML is also allowed.
To store only valid XML, you should replace in advance all characters that could break the XML, using something like htmlentities() in PHP.
https://www.php.net/manual/en/function.htmlentities.php
When retrieving the data from the parsed XML, you can use the opposite function, e.g. html_entity_decode(), to get the original data back.
If storing invalid XML should be allowed, you can set up your parser to stop parsing in a defined state that signals the XML is not valid.
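In T-SQL, the same idea might look like the sketch below. It assumes the raw value is available on its own before it is wrapped in tags (once an unescaped < sits inside the markup, a plain REPLACE can no longer tell data from tags). The variable names and the sample value are illustrative only.

```sql
-- Hypothetical raw value containing characters that are special in XML.
DECLARE @raw NVARCHAR(MAX) = N'Auto < Test & "Org" > done';

-- Option 1: escape manually. & must be replaced first; otherwise the
-- ampersands introduced by &lt; and &gt; would be escaped a second time.
DECLARE @escaped NVARCHAR(MAX) =
    REPLACE(REPLACE(REPLACE(@raw, N'&', N'&amp;'), N'<', N'&lt;'), N'>', N'&gt;');

DECLARE @x XML = CAST(N'<root><name>' + @escaped + N'</name></root>' AS XML);

-- value() hands back the unescaped text.
SELECT @x.value('(/root/name/text())[1]', 'NVARCHAR(MAX)') AS name;

-- Option 2: let FOR XML build the element; it escapes the value itself.
SELECT @raw AS name FOR XML PATH('root'), TYPE;
```

With either option no placeholder tokens such as ATampZ are needed, because the escaping is reversed automatically when the value is read back.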
I have an nvarchar column that I would like to return embedded in my JSON results if its content is valid JSON, or as a string otherwise.
Here is what I've tried:
select
(
case when IsJson(Arguments) = 1 then
Json_Query(Arguments)
else
Arguments
end
) Results
from Unit
for json path
This always puts Results into a string.
The following works, but only if the attribute contains valid JSON:
select
(
Json_Query(
case when IsJson(Arguments) = 1 then
Arguments
else
'"' + String_escape(IsNull(Arguments, ''), 'json') + '"' end
)
) Results
from Unit
for json path
If Arguments does not contain a JSON object a runtime error occurs.
Update: Sample data:
Arguments
---------
{ "a": "b" }
Some text
Update: any version of SQL Server will do. I'd even be happy to know that it's coming in a beta or something.
I did not find a good solution and would be happy if someone came up with a better one than this hack:
DECLARE @tbl TABLE(ID INT IDENTITY,Arguments NVARCHAR(MAX));
INSERT INTO @tbl VALUES
 (NULL)
,('plain text')
,('[{"id":"1"},{"id":"2"}]');
SELECT t1.ID
      ,(SELECT Arguments FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=0) Arguments
      ,(SELECT JSON_QUERY(Arguments) FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=1) ArgumentsJSON
FROM @tbl t1
FOR JSON PATH;
As NULL values are omitted, you will always find either Arguments or ArgumentsJSON in your final result. Treating this JSON as NVARCHAR(MAX), you can use REPLACE to rename both to the same Arguments.
The problem seems to be that you cannot include two columns with the same name in your SELECT, and each column must have a predictable type. The type depends on the order you use in CASE (or COALESCE). If the engine thinks "Okay, here's text", everything is treated as text and your JSON is escaped. But if the engine thinks "Okay, some JSON", everything is handled as JSON and will break if this JSON is not valid.
With FOR XML PATH there are some tricks with column naming (such as [*], [node()] or even the same name twice within one query), but FOR JSON PATH is not that powerful...
When you say that your statement "... always puts Results into a string.", you probably mean that when JSON is stored in a text column, FOR JSON escapes this text. Of course, if you want to return an unescaped JSON text, you need to use JSON_QUERY function only for your valid JSON text.
Next is a small workaround (based on FOR JSON and string manipulation), that may help to solve your problem.
Table:
CREATE TABLE #Data (
Arguments nvarchar(max)
)
INSERT INTO #Data
(Arguments)
VALUES
('{"a": "b"}'),
('Some text'),
('{"c": "d"}'),
('{"e": "f"}'),
('More[]text')
Statement:
SELECT CONCAT(N'[', j1.JsonOutput, N',', j2.JsonOutput, N']')
FROM
(
SELECT JSON_QUERY(Arguments) AS Results
FROM #Data
WHERE ISJSON(Arguments) = 1
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) j1 (JsonOutput),
(
SELECT STRING_ESCAPE(ISNULL(Arguments, ''), 'json') AS Results
FROM #Data
WHERE ISJSON(Arguments) = 0
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) j2 (JsonOutput)
Output:
[{"Results":{"a": "b"}},{"Results":{"c": "d"}},{"Results":{"e": "f"}},{"Results":"Some text"},{"Results":"More[]text"}]
Notes:
One disadvantage here is that the order of the items in the generated output is not the same as in the table.
I am trying to get the particular attribute value from XML column, but I'm getting an error
XML parsing: line 1, character 345, duplicate attribute
My code:
select
ship_to_cust_num,
tank_num,
tank_capacity_qty,
tank_pkg_type_code,
COALESCE(REPLACE(CAST(CAST(b.tank_inspection AS NTEXT) AS XML).value('(/TankInspection/Questions/Question[@AASAQno="9"]/@QAns)[1]', 'VARCHAR(50)'), '#', ''), 0)
from
bulk_site_tank (nolock)b
where
convert(varchar, b.tank_inspection) != 'NULL'
The simple answer is that the error is telling you the problem. To explain further, take this simple statement:
DECLARE @xml varchar(MAX);
SET @XML = '
<root>
<child>
<element attribute="1">value</element>
<element attribute="2" attribute="2">Another Value</element>
</child>
</root>';
SELECT *
FROM (VALUES(CONVERT(xml, @XML)))V(X);
If you run that, you'll get the error:
Msg 9437, Level 16, State 1, Line 11 XML parsing: line 5, character 46, duplicate attribute
That's unsurprising: if you look, the second element node declares attribute twice.
So, how do you fix this?
Firstly, this means that you're storing your XML data in a data type other than xml. XML should be stored using the xml data type (that's exactly what it's for), and only valid XML can be stored in it; had you used it, you wouldn't have been able to insert invalid XML into the row and wouldn't be in this position. As things stand, there's only one thing you can do: find all the "bad" rows:
SELECT tank_inspection
FROM bulk_site_tank
WHERE TRY_CONVERT(xml,tank_inspection) IS NULL
AND tank_inspection IS NOT NULL;
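A self-contained sketch of the same check, for trying it out without touching the real table. The table variable and its sample rows are hypothetical.

```sql
-- Hypothetical sample data: one valid row, one with a duplicate attribute, one NULL.
DECLARE @demo TABLE (id INT IDENTITY, payload VARCHAR(MAX));
INSERT INTO @demo (payload) VALUES
    ('<Question AASAQno="9" QAns="Yes"/>'),
    ('<Question AASAQno="9" AASAQno="9"/>'),
    (NULL);

-- TRY_CONVERT returns NULL instead of raising an error when the text is not
-- well-formed XML, so this lists only rows that hold text but fail to parse.
SELECT id, payload
FROM @demo
WHERE payload IS NOT NULL
  AND TRY_CONVERT(xml, payload) IS NULL;
```

The explicit IS NOT NULL test matters: without it, rows that are simply empty would be reported alongside the malformed ones.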
Then inspect every single row returned in the above dataset and fix the data. Make it valid XML. After that, fix your data type:
ALTER TABLE bulk_site_tank ALTER COLUMN tank_inspection xml;
Now everything is valid XML, you can fix that query of yours:
SELECT ship_to_cust_num,
tank_num,
tank_capacity_qty,
tank_pkg_type_code,
REPLACE(b.tank_inspection.value('(/TankInspection/Questions/Question[@AASAQno="9"]/@QAns)[1]', 'varchar(50)'), '#', '') --AS ?
FROM bulk_site_tank b
WHERE b.tank_inspection IS NOT NULL;
Note that I changed to ANSI NULL syntax (IS NOT NULL) and got rid of the NOLOCK (as I assume you don't know what it actually does here). The CAST/CONVERT expressions are gone too, and I've removed the COALESCE: your value expression returns a varchar(50) and the COALESCE has a 0 as its second parameter, so the value returned from the XML would be implicitly cast to an int, likely resulting in a conversion error.
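The COALESCE point can be reproduced in isolation. The following is a minimal sketch with made-up values (the variable names are not from the question); it relies on COALESCE returning the data type with the highest precedence among its arguments, and int outranks varchar.

```sql
-- Hypothetical answers as they might come out of the XML.
DECLARE @numeric_answer varchar(50) = '12';
DECLARE @text_answer    varchar(50) = 'N/A';
DECLARE @missing_answer varchar(50) = NULL;

-- The varchar is implicitly converted to int; this works for numeric text.
SELECT COALESCE(@numeric_answer, 0) AS converted_to_int;

BEGIN TRY
    -- Same expression, but the text cannot be converted to int.
    SELECT COALESCE(@text_answer, 0) AS fails;
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS conversion_error;
END CATCH;

-- A string default keeps the whole expression a varchar.
SELECT COALESCE(@missing_answer, '0') AS string_default;
```

If a default is still wanted in the real query, a string literal such as '0' avoids the implicit conversion.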
It's up to you to clean up your data though; no one else can help you here, I'm afraid. This is just one reason why poor data type choices are a problem: if the correct data type had been used then, as I said before, the invalid XML could never have been inserted.
Good luck!
I'm trying to update some XML data in SQL Server. The XML contains data that looks like this:
<root>
<id>1</id>
<timestamp>16-10-2017 19:24:55</timestamp>
</root>
Let's say this XML exists in a column called Data in a table called TestTable.
I would like to be able to change the hyphens in the timestamp to forward slashes.
I was hoping I might be able to do something like:
update TestTable
set Data.modify('replace value of
(/root/timestamp/text())[1] with REPLACE((/root/timestamp/text())[1], "-", "/")')
I get the following error:
XQuery [TestTable]: There is no function '{http://www.w3.org/2004/07/xpath-functions}:REPLACE()'
When I think about it, this makes sense. But I wonder, is there a way to do this in a single update statement? Or do I first need to query the timestamp value and save it as a variable, and then update the XML with the variable?
You can also do this with a join to an inline view and use the SQL REPLACE function:
CREATE TABLE TestTable
(
Id INT IDENTITY(1,1) NOT NULL,
Data XML NOT NULL
)
INSERT TestTable (Data) VALUES ('<root>
<id>1</id>
<timestamp>16-10-2017 19:24:55</timestamp>
</root>')
UPDATE TestTable
SET Data.modify('replace value of
(/root/timestamp/text())[1] with sql:column("T2.NewData")')
FROM TestTable T1
INNER JOIN (
SELECT Id
, REPLACE( Data.value('(/root/timestamp/text())[1]', 'nvarchar(max)'), '-', '/') AS NewData
FROM TestTable
) T2
ON T1.Id = T2.Id
SELECT * FROM TestTable
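For a single XML value held in a variable, the "save it as a variable first" route mentioned in the question might look like this sketch. It uses sql:variable() instead of sql:column(); the variable names are illustrative.

```sql
DECLARE @x XML = N'<root><id>1</id><timestamp>16-10-2017 19:24:55</timestamp></root>';

-- Read the current text, do the replacement with the ordinary SQL REPLACE ...
DECLARE @new NVARCHAR(50) =
    REPLACE(@x.value('(/root/timestamp/text())[1]', 'NVARCHAR(50)'), N'-', N'/');

-- ... and write it back through XML DML.
SET @x.modify('replace value of (/root/timestamp/text())[1] with sql:variable("@new")');

SELECT @x AS updated_xml;
```

The join-based UPDATE above remains the better fit for whole tables, since a variable can only carry one value at a time.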
Note: this answer assumes you want to have this formatted for the purpose of displaying this as a string, and not parsing the content as a xs:dateTime. If you want the latter, Shungo's answer will format it as such.
It seems that replace is not a supported XQuery function in SQL Server at the time of this writing. You can use the substring function along with the concat function in a "replace value of (XML DML)" though.
CREATE TABLE #t(x XML);
INSERT INTO #t(x)VALUES(N'<root><id>1</id><timestamp>16-10-2017 19:24:55</timestamp></root>');
UPDATE
#t
SET
x.modify('replace value of (/root/timestamp/text())[1]
with concat(substring((/root/timestamp/text())[1],1,2),
"/",
substring((/root/timestamp/text())[1],4,2),
"/",
substring((/root/timestamp/text())[1],7)
) ')
SELECT*FROM #t;
Giving as a result:
<root><id>1</id><timestamp>16/10/2017 19:24:55</timestamp></root>
If there's no external requirement you have to fulfill, you should use ISO 8601 date/time strings within XML.
Your dateTime string is culture-dependent. Reading it on different systems with differing language or date format settings will lead to errors or, even worse, to wrong results.
A date like "08-10-2017" can be the 8th of October or the 10th of August...
The worst part is that this might pass all your tests successfully, but then break on a customer's machine with strange error messages or bad results, up to real data damage!
Switching the hyphens to slashes is just cosmetic! XML is a strictly defined data container. Any non-string data must be represented as a string that can be converted back safely.
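The ambiguity is easy to demonstrate in T-SQL itself. This is a small sketch; note that SET DATEFORMAT changes the setting for the current session.

```sql
-- The same string is read differently depending on the session's date format.
SET DATEFORMAT dmy;
SELECT CAST('08-10-2017' AS datetime) AS parsed_as_dmy;   -- 8 October 2017

SET DATEFORMAT mdy;
SELECT CAST('08-10-2017' AS datetime) AS parsed_as_mdy;   -- 10 August 2017

-- The ISO 8601 form with the T separator is read the same under either setting.
SELECT CAST('2017-10-08T19:24:55' AS datetime) AS parsed_as_iso;
```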
This is what you should do:
DECLARE @tbl TABLE(ID INT IDENTITY,YourXML XML);
INSERT INTO @tbl VALUES
(N'<root>
<id>1</id>
<timestamp>16-10-2017 19:24:55</timestamp>
</root>');
UPDATE @tbl SET YourXml.modify(N'replace value of (/root/timestamp/text())[1]
with concat( substring((/root/timestamp/text())[1],7,4), "-"
            ,substring((/root/timestamp/text())[1],4,2), "-"
            ,substring((/root/timestamp/text())[1],1,2), "T"
            ,substring((/root/timestamp/text())[1],12,8)
           ) cast as xs:dateTime?');
SELECT * FROM @tbl;
The result
<root>
<id>1</id>
<timestamp>2017-10-16T19:24:55</timestamp>
</root>
You can try string replacement, as shown below:
update testtable
set data= cast(
concat(
left(cast(data as varchar(max)),charindex('<timestamp>',cast(data as varchar(max)))+len('<timestamp>')-1),
replace(
substring(
cast(data as varchar(max)),
len('<timestamp>') +
charindex( '<timestamp>', cast(data as varchar(max))) ,
charindex('</timestamp>',cast(data as varchar(max)))
-charindex('<timestamp>',cast(data as varchar(max)))
-len('<timestamp>')
),
'-','/'),
right(cast(data as varchar(max)),len(cast(data as varchar(max)))-charindex('</timestamp>',cast(data as varchar(max)))+1)
) as xml)
select *
from testtable
I want to set a processing instruction to include a stylesheet on top of an XML.
The same issue arises with the XML declaration (e.g. <?xml version="1.0" encoding="utf-8"?>).
Desired result:
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<TestPath>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>
My research brought me to node test syntax and processing-instruction().
This
SELECT 'type="text/xsl" href="stylesheet.xsl"' AS [processing-instruction(xml-stylesheet)]
,'Test' AS Test
,'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
produces this:
<TestPath>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>
All hints I found tell me to convert the XML to VARCHAR, concatenate it "manually" and convert it back to XML. But this is - how to say - ugly?
This works obviously:
SELECT CAST(
'<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<TestPath>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>' AS XML);
Is there a chance to solve this?
There is another way, which needs two steps but doesn't require you to treat the XML as a string anywhere in the process:
declare @result XML =
(
    SELECT
        'Test' AS Test,
        'SomeMore' AS SomeMore
    FOR XML PATH('TestPath')
)
set @result.modify('
    insert <?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
    before /*[1]
')
The XQuery expression passed to the modify() function tells SQL Server to insert the processing instruction node before the root element of the XML.
UPDATE:
I found another alternative based on the following thread: Merge the two xml fragments into one? I personally prefer this way:
SELECT CONVERT(XML, '<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>'),
(
SELECT
'Test' AS Test,
'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
)
FOR XML PATH('')
As it turned out, har07's great answer does not work with an XML declaration. The only way I could find was this:
DECLARE @ExistingXML XML=
(
    SELECT
        'Test' AS Test,
        'SomeMore' AS SomeMore
    FOR XML PATH('TestPath'),TYPE
);
DECLARE @XmlWithDeclaration NVARCHAR(MAX)=
(
    SELECT N'<?xml version="1.0" encoding="UTF-8"?>'
           +
           CAST(@ExistingXml AS NVARCHAR(MAX))
);
SELECT @XmlWithDeclaration;
You must stay with the string after this step; any conversion to real XML will either raise an error (when the encoding is other than UTF-16) or omit this XML declaration.
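Both outcomes can be observed with a short sketch. It uses TRY_CAST so that the failing case yields NULL instead of aborting the batch; the variable names are illustrative.

```sql
DECLARE @utf16 NVARCHAR(MAX) = N'<?xml version="1.0" encoding="UTF-16"?><TestPath/>';
DECLARE @utf8  NVARCHAR(MAX) = N'<?xml version="1.0" encoding="UTF-8"?><TestPath/>';

-- The conversion succeeds, but the declaration is not kept in the xml value.
SELECT TRY_CAST(@utf16 AS XML) AS declaration_dropped;

-- An NVARCHAR string is UTF-16 internally, which contradicts the declared
-- UTF-8 encoding, so the conversion should fail and TRY_CAST return NULL.
SELECT TRY_CAST(@utf8 AS XML) AS encoding_mismatch;
```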
I have the following existing code in one of our stored procedures; it adds the delimiter / between the error messages encountered during validation:
;with delimiting_errors
(Id,
Delimited_Error_List)
as
(
select
e2.Id,
'/'
+ (select ' ' + Fn
from Customer e
where e.Id = e2.Id
for xml path(''), type
).value('substring(text()[1],2)', 'varchar(max)') as Delimited_Error
from Customer e2
group by e2.Id
)
SELECT * FROM delimiting_errors
Could you please help me understand this command:
value('substring(text()[1], 2)', 'varchar(max)')
I tried to search for text(), but couldn't find exact documentation for it.
Similarly, how does the substring function work with only 2 parameters in substring(text()[1], 2), when it actually requires 3?
Please help me with the concept behind this command, and point me to some resources on text().
What is going on here:
.value('substring(text()[1],2)', 'varchar(max)')
The value() function extracts a specific value from the XML and converts it to a SQL Server data type, in your case varchar(max).
substring is the XQuery substring function, not the SQL one; with two arguments it returns the rest of the string starting at position 2.
text() here retrieves the inner text from within the XML.
The [1] suffix acts as an indexer and fetches the first match.
For more info read XQuery Language Reference, it's like "another language" inside SQL.
.value('substring(text()[1],2)', 'varchar(max)') as Delimited_Error
You use this XML-trick to concatenate values. In the beginning you add a double space select ' ' + Fn, this must be taken away for the beginning of the return string.
So, the .value returns the "substring" (XPath-Function!) of the inner text() starting at the index 2.
Find more information here: http://wiki.selfhtml.org/wiki/XML/XSL/XPath/Funktionen
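To see the pieces work together, here is a small self-contained sketch of the concatenation trick. The table variable, its rows and the choice of '/' as the leading separator are assumptions made for illustration; they are not the asker's actual data.

```sql
-- Hypothetical stand-in for the Customer table.
DECLARE @Customer TABLE (Id INT, Fn VARCHAR(50));
INSERT INTO @Customer (Id, Fn) VALUES (1, 'A'), (1, 'B'), (2, 'C');

SELECT c2.Id,
       (SELECT '/' + c.Fn              -- every value gets a leading separator
        FROM @Customer c
        WHERE c.Id = c2.Id
        ORDER BY c.Fn
        FOR XML PATH(''), TYPE         -- unnamed column + empty path: one text node, e.g. /A/B
       ).value('substring(text()[1], 2)', 'varchar(max)') AS Delimited_List
       -- XQuery substring with two arguments: everything from position 2 on,
       -- which drops the first separator.
FROM @Customer c2
GROUP BY c2.Id;
```

For Id 1 the inner query builds /A/B, and the two-argument substring trims it to A/B; no third (length) argument is needed because the XQuery function treats it as optional.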