How to handle apostrophe ( ' ) in XML and SQL Server - sql-server

I have an XML file with employee details. The problem I am trying to solve is that the XML values contain apostrophes ( ' ) in random places throughout the content, which makes it hard to insert them into SQL Server tables.
I send the entire XML content from MVC C# to a SQL Server stored procedure that inserts the data into various tables, but whenever there is an apostrophe ( ' ) in the XML content, an error occurs. So these apostrophes need to be handled, replaced, or removed. How can I do this?
This is some sample XML content:
<xml>
<Channel>
<Program id="1" category="A">
<name>Pra'Matino</name>
<Bin>
<Date>1/1/2020</Date>
<Date>1/1/2020</Date>
</Bin>
<Player>
<Pla>S'Rajesh</Pla>
<Pla>Su'man</Pla>
</Player>
<Television>
<HostDeails>2/9/2020</HostDeails>
<HostDeails>MALE</HostDeails>
<HostDeails>Colour</HostDeails>
</Television>
<addresses>
<address>
<address1>No 10</address1>
<city>Chennai</city>
<country>IN's</country>
<ProductName>Lavender's</ProductName>
</address>
<address>
<address1>N0 72</address1>
<city>Sanagoor Road</city>
<postalCode>641006</postalCode>
</address>
<address>
<address1>Old No 10/ New No 3</address1>
<city>Madurai</city>
<country>IN</country>
<ProductName>Lavender</ProductName>
</address>
<address>
<address1>N0 98</address1>
<city>BridhSanagoor Road</city>
<country>SriLanka</country>
<postalCode>641006</postalCode>
</address>
</addresses>
</Program>
<Program id="25" category="B">
<name>Rahman'G</name>
<Bin>
<Date>10/1/2020</Date>
<Date>1/12/1989</Date>
</Bin>
<Player>
<Pla>Paul'D</Pla>
<Pla>Right'F</Pla>
</Player>
<Television>
<HostDeails>5/7/2021</HostDeails>
<HostDeails>MALE</HostDeails>
<HostDeails>C'olour</HostDeails>
</Television>
<addresses>
<address>
<address1>S7</address1>
<city>Coimbatire</city>
<country>IN</country>
<ProductName>Lavender</ProductName>
</address>
<address>
<address1>Sai Akshya Appartment</address1>
<city>Sanagoor Road</city>
<postalCode>631009</postalCode>
</address>
<address>
<address1> No 3</address1>
<city>Thenkaasi</city>
<ProductName>Lavender</ProductName>
</address>
<address>
<address1>N0 98</address1>
<city>Bridh'Sanagoor Road</city>
<country>SriLanka</country>
<postalCode>641006</postalCode>
</address>
</addresses>
</Program>
</Channel>
</xml>
Thank you all.
Copied from comment:
I am doing it like this:
SqlCommand.Parameters.Add("@XMLValue", SqlDbType.Xml).Value = xmlDetails.ToString();

If you use parameterized queries (take a look at SqlParameter), you won't have these issues.
SqlCommand cmd = new SqlCommand(query, connection);
SqlParameter param = new SqlParameter();
param.ParameterName = "@ParamName";
param.Value = valueVariable;
cmd.Parameters.Add(param); // attach the parameter to the command
Try not to build the SQL by string concatenation, and be aware of the risks and implications of doing so.
EDIT:
Don't convert xmlDetails to a string with xmlDetails.ToString(). Instead, pass it as a SqlXml value:
SqlXml newXml = new SqlXml(new XmlTextReader("MyTestStoreData.xml"));
// or, alternatively (xmlDetails must be a Stream or an XmlReader):
SqlXml newXml = new SqlXml(xmlDetails);
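To illustrate why a typed parameter avoids the problem, here is a minimal T-SQL sketch. The variable name @XMLValue mirrors the parameter from the question; the trimmed sample document and the column aliases are assumptions made for illustration. The apostrophes are doubled only because the XML is written here as a string literal. A value that arrives through a SqlParameter needs no escaping at all.

```sql
-- Assumption: a trimmed copy of the question's XML, written as a literal.
-- Inside a T-SQL string literal an apostrophe is written as two apostrophes;
-- the stored value contains a single one.
DECLARE @XMLValue xml = N'
<Channel>
  <Program id="1" category="A">
    <name>Pra''Matino</name>
    <Player>
      <Pla>S''Rajesh</Pla>
      <Pla>Su''man</Pla>
    </Player>
  </Program>
</Channel>';

-- One row per player; the apostrophes come back unchanged.
SELECT A.prg.value('@id', 'int')                        AS ProgramId,
       A.prg.value('(name/text())[1]', 'nvarchar(100)') AS ProgramName,
       B.pla.value('text()[1]', 'nvarchar(100)')        AS Player
FROM @XMLValue.nodes('/Channel/Program') AS A(prg)
OUTER APPLY A.prg.nodes('Player/Pla') AS B(pla);
```

An apostrophe is an ordinary character in XML element content, so the error in the question most likely comes from how the SQL text is assembled rather than from the XML itself.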

Related

SQL Server reduce recurring XML nodes to JSON array

I have some XML in which every entry can contain some recurring elements. I'm trying to query it with the OPENXML function and I want to reduce those elements to JSON arrays.
My SQL looks like this:
declare @idoc int,
@xml xml = '
<?xml version="1.0" encoding="UTF-8"?>
<collection>
<individual>
<id>1</id>
<address>
<coutry>Country1</coutry>
<zip>ZIP1</zip>
<city>City1</city>
</address>
<address>
<coutry>Country2</coutry>
<zip>ZIP2</zip>
<city>City2</city>
</address>
<document>
<num>101</num>
<issued>2020-01-01</issued>
<description>desc1</description>
</document>
<document>
<num>102</num>
<issued>2020-01-01</issued>
<description>desc2</description>
</document>
</individual>
<individual>
<id>2</id>
<address>
<coutry>Country3</coutry>
<zip>ZIP3</zip>
<city>City3</city>
</address>
<address>
<coutry>Country4</coutry>
<zip>ZIP4</zip>
<city>City4</city>
</address>
<document>
<num>103</num>
<issued>2020-01-03</issued>
<description>desc3</description>
</document>
<document>
<num>104</num>
<issued>2020-01-04</issued>
<description>desc4</description>
</document>
</individual>
</collection>';
exec sp_xml_preparedocument @idoc out, @xml;
select
id as ID
, address as AddressesJson
, document as DocumentsJson
from openxml(@idoc, '//individual', 2) with (
id int
, address nvarchar(max)
, document nvarchar(max)
);
exec sp_xml_removedocument @idoc;
The result I'm getting is
|ID |AddressesJson |DocumentsJson |
|---|-------------------|-------------------|
|1 |Country1ZIP1City1 |1012020-01-01desc1 |
|2 |Country3ZIP3City3 |1032020-01-03desc3 |
What I would like to get is
|ID |AddressesJson |DocumentsJson |
|---|-------------------|-------------------|
|1 |[{"coutry":"Country1","zip":"ZIP1","city":"City1"},{"coutry":"Country2","zip":"ZIP2","city":"City2"}] |[{"num":"101","issued":"2020-01-01","description":"desc1"},{"num":"102","issued":"2020-01-02","description":"desc2"}] |
|2 |[{"coutry":"Country3","zip":"ZIP3","city":"City3"},{"coutry":"Country4","zip":"ZIP4","city":"City4"}] |[{"num":"103","issued":"2020-01-03","description":"desc3"},{"num":"104","issued":"2020-01-04","description":"desc4"}] |
How can I achieve this?
P.S. I'm using OpenXML because it seems to work faster. I would also appreciate a solution with xml.nodes()/xquery
It seems a couple of subqueries with FOR JSON PATH are what you want here. Note, as well, that I had to amend your XML to remove the leading line break, as that actually makes the value invalid XML:
DECLARE @idoc int,
@xml xml = '<?xml version="1.0" encoding="UTF-8"?>
<collection>
<individual>
<id>1</id>
<address>
<coutry>Country1</coutry>
<zip>ZIP1</zip>
<city>City1</city>
</address>
<address>
<coutry>Country2</coutry>
<zip>ZIP2</zip>
<city>City2</city>
</address>
<document>
<num>101</num>
<issued>2020-01-01</issued>
<description>desc1</description>
</document>
<document>
<num>102</num>
<issued>2020-01-01</issued>
<description>desc2</description>
</document>
</individual>
<individual>
<id>2</id>
<address>
<coutry>Country3</coutry>
<zip>ZIP3</zip>
<city>City3</city>
</address>
<address>
<coutry>Country4</coutry>
<zip>ZIP4</zip>
<city>City4</city>
</address>
<document>
<num>103</num>
<issued>2020-01-03</issued>
<description>desc3</description>
</document>
<document>
<num>104</num>
<issued>2020-01-04</issued>
<description>desc4</description>
</document>
</individual>
</collection>';
SELECT c.i.value('(id/text())[1]','int') AS id,
(SELECT i.a.value('(coutry/text())[1]','varchar(30)') AS country, --It's spelt country; I suggest fixing this at your source, as fundamental typographical errors like this can be a real problem later down the line
i.a.value('(zip/text())[1]','varchar(30)') AS zip,
i.a.value('(city/text())[1]','varchar(30)') AS city
FROM c.i.nodes('address')i(a)
FOR JSON PATH) AS AddressJson,
(SELECT i.d.value('(num/text())[1]','int') AS num,
i.d.value('(issued/text())[1]','date') AS issued,
i.d.value('(description/text())[1]','varchar(30)') AS description
FROM c.i.nodes('document')i(d)
FOR JSON PATH) AS DocumentJson
FROM @xml.nodes('collection/individual') c(i);

Best ETL tool for converting XML to a table

I need to convert more than 500 XML files to tables that I can query. I have the XSD that I use to verify the structure. I was considering using Notepad++ to structure the files. Is that a good idea and, if not, what is better? The end goal is either flat files with the same columns or loading directly into SQL.
Example #1 XML
(...)
<Customer>
<CustomerID>1</CustomerID>
<Address>
<Street>John Street</Street>
<Number>6</Number>
<Apartment>68</Apartment>
<City>New York</City>
<Zip>10068</Zip>
</Address>
<Firstname>John</Firstname>
<LastName>Doe</LastName>
</Customer>
(...)
Example #2 XML
(...)
<Customer>
<CustomerID>2</CustomerID>
<Address>
<Street>Wall Street</Street>
<City>New York</City>
</Address>
<Firstname>James Smith</Firstname>
</Customer>
(...)
Example #3 XML
(...)
<n1:Customer>
<n1:CustomerID>3</n1:CustomerID>
<n1:Address>
<n1:Apartment>32</n1:Apartment>
<n1:City>Chicago</n1:City>
</n1:Address>
</n1:Customer>
(...)

Insert XML child node to SQL table

I've got an XML file like this and I'm working with SQL 2014 SP2
<?xml version='1.0' encoding='UTF-8'?>
<gwl>
<version>123456789</version>
<entities>
<entity id="1" version="123456789">
<name>xxxxx</name>
<listId>0</listId>
<listCode>Oxxx</listCode>
<entityType>08</entityType>
<createdDate>03/03/1993</createdDate>
<lastUpdateDate>05/06/2011</lastUpdateDate>
<source>src</source>
<OriginalSource>o_src</OriginalSource>
<aliases>
<alias category="STRONG" type="Alias">USCJSC</alias>
<alias category="WEAK" type="Alias">'OSKOAO'</alias>
</aliases>
<programs>
<program type="21">prog</program>
</programs>
<sdfs>
<sdf name="OriginalID">9876</sdf>
</sdfs>
<addresses>
<address>
<address1>1141, SYA-KAYA STR.</address1>
<country>RU</country>
<postalCode>1234</postalCode>
</address>
<address>
<address1>90, MARATA UL.</address1>
<country>RU</country>
<postalCode>1919</postalCode>
</address>
</addresses>
<otherIds>
<childId>737606</childId>
<childId>737607</childId>
</otherIds>
</entity>
</entities>
</gwl>
I made a script to insert data from the XML into a SQL table. How can I insert the child nodes into a table? I think I should replicate the row for each new child node, but I don't know the best way to proceed.
Here is my SQL code
DECLARE @InputXML XML
SELECT @InputXML = CAST(x AS XML)
FROM OPENROWSET(BULK 'C:\MyFiles\sample.XML', SINGLE_BLOB) AS T(x)
SELECT
product.value('(@id)[1]', 'NVARCHAR(10)') id,
product.value('(@version)[1]', 'NVARCHAR(14)') version,
product.value('(name[1])', 'NVARCHAR(255)') name,
product.value('(listId[1])', 'NVARCHAR(9)') listId,
product.value('(listCode[1])', 'NVARCHAR(10)') listCode,
product.value('(entityType[1])', 'NVARCHAR(2)') entityType,
product.value('(createdDate[1])', 'NVARCHAR(10)') createdDate,
product.value('(lastUpdateDate[1])', 'NVARCHAR(10)') lastUpdateDate,
product.value('(source[1])', 'NVARCHAR(15)') source,
product.value('(OriginalSource[1])', 'NVARCHAR(50)') OriginalSource,
product.value('(aliases[1])', 'NVARCHAR(50)') aliases,
product.value('(programs[1])', 'NVARCHAR(50)') programs,
product.value('(sdfs[1])', 'NVARCHAR(500)') sdfs,
product.value('(addresses[1])', 'NVARCHAR(50)') addresses,
product.value('(otherIds[1])', 'NVARCHAR(50)') otherIDs
FROM @InputXML.nodes('gwl/entities/entity') AS X(product)
You have a lot of different children here...
Just to show the principles:
DECLARE @xml XML=
N'<gwl>
<version>123456789</version>
<entities>
<entity id="1" version="123456789">
<name>xxxxx</name>
<listId>0</listId>
<listCode>Oxxx</listCode>
<entityType>08</entityType>
<createdDate>03/03/1993</createdDate>
<lastUpdateDate>05/06/2011</lastUpdateDate>
<source>src</source>
<OriginalSource>o_src</OriginalSource>
<aliases>
<alias category="STRONG" type="Alias">USCJSC</alias>
<alias category="WEAK" type="Alias">''OSKOAO''</alias>
</aliases>
<programs>
<program type="21">prog</program>
</programs>
<sdfs>
<sdf name="OriginalID">9876</sdf>
</sdfs>
<addresses>
<address>
<address1>1141, SYA-KAYA STR.</address1>
<country>RU</country>
<postalCode>1234</postalCode>
</address>
<address>
<address1>90, MARATA UL.</address1>
<country>RU</country>
<postalCode>1919</postalCode>
</address>
</addresses>
<otherIds>
<childId>737606</childId>
<childId>737607</childId>
</otherIds>
</entity>
</entities>
</gwl>';
--The query will fetch some values from several places.
--It should be easy to get the rest yourself...
SELECT @xml.value('(/gwl/version/text())[1]','bigint') AS [version]
,A.ent.value('(name/text())[1]','nvarchar(max)') AS [Entity_Name]
,A.ent.value('(listId/text())[1]','int') AS Entity_ListId
--more columns taken from A.ent
,B.als.value('@category','nvarchar(max)') AS Alias_Category
,B.als.value('text()[1]','nvarchar(max)') AS Alias_Content
--similar for programs and sdfs
,E.addr.value('(address1/text())[1]','nvarchar(max)') AS Address_Address1
,E.addr.value('(country/text())[1]','nvarchar(max)') AS Address_Country
--and so on
FROM @xml.nodes('/gwl/entities/entity') A(ent)
OUTER APPLY A.ent.nodes('aliases/alias') B(als)
OUTER APPLY A.ent.nodes('programs/program') C(prg)
OUTER APPLY A.ent.nodes('sdfs/sdf') D(sdfs)
OUTER APPLY A.ent.nodes('addresses/address') E(addr)
OUTER APPLY A.ent.nodes('otherIds/childId') F(ids);
The idea in short:
We read non-repeating values (e.g. version) from the XML variable directly.
We use .nodes() to return repeating elements as derived sets.
We can use a cascade of .nodes() to dive deeper into repeating child elements by using a relative XPath (no / at the beginning).
You have two approaches:
Read the XML as above into a staging table (simply by adding INTO #tmpTable before FROM) and proceed from there (this will need one SELECT ... GROUP BY for each type of child).
Create one SELECT per type of child, using only one of the APPLY lines, and shift the data into specific child tables.
I would tend towards the first one.
It allows you to do some cleaning, generate IDs, and check business rules before you shift the data into the target tables.
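The first approach can be sketched roughly as follows. This is a minimal illustration, not a complete solution: the XML is a cut-down version of the sample above, and the staging column names (EntityId, AliasContent, Address1) are assumptions chosen for the example.

```sql
-- Assumption: a cut-down version of the sample XML (aliases and addresses only).
DECLARE @xml xml = N'<gwl><entities><entity id="1">
  <name>xxxxx</name>
  <aliases>
    <alias category="STRONG">USCJSC</alias>
    <alias category="WEAK">OSKOAO</alias>
  </aliases>
  <addresses>
    <address><address1>1141, SYA-KAYA STR.</address1><country>RU</country></address>
    <address><address1>90, MARATA UL.</address1><country>RU</country></address>
  </addresses>
</entity></entities></gwl>';

-- Step 1: shred everything into a staging table.
-- Each entity yields (aliases x addresses) rows because of the two APPLYs.
SELECT A.ent.value('@id', 'int')                             AS EntityId,
       A.ent.value('(name/text())[1]', 'nvarchar(255)')      AS EntityName,
       B.als.value('text()[1]', 'nvarchar(255)')             AS AliasContent,
       E.addr.value('(address1/text())[1]', 'nvarchar(255)') AS Address1
INTO #tmpTable
FROM @xml.nodes('/gwl/entities/entity') AS A(ent)
OUTER APPLY A.ent.nodes('aliases/alias') AS B(als)
OUTER APPLY A.ent.nodes('addresses/address') AS E(addr);

-- Step 2: one SELECT ... GROUP BY per child type collapses the multiplied rows.
SELECT EntityId, AliasContent FROM #tmpTable GROUP BY EntityId, AliasContent;
SELECT EntityId, Address1     FROM #tmpTable GROUP BY EntityId, Address1;

DROP TABLE #tmpTable;
```

Note that GROUP BY also collapses child values that are genuinely repeated in the source. If such duplicates matter, the second approach (one SELECT per child type with a single APPLY) avoids the issue.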

Call a procedure or function in Oracle DB with return = array of UDT from WSO2 DSS

I followed this post [1] as a guide to build an example that queries an array of UDTs in WSO2 DSS. The post queries a single UDT; my configuration tries to query an array of UDTs.
I created the following type and dummy procedure in my DB to try this:
create or replace
TYPE "LIST_CUSTOMERS" IS TABLE OF customer_t
CREATE OR REPLACE
PROCEDURE getCustomer2(listcust OUT list_customers) IS
cust customer_t;
cust2 customer_t;
BEGIN
listcust := list_customers();
cust := customer_t(1, 'prabath');
cust2 := customer_t(2, 'jorge');
listcust.extend;
listcust(1) := cust;
listcust.extend;
listcust(2) := cust2;
END;
My DS is this:
<?xml version="1.0" encoding="UTF-8"?>
<data name="UDTSample2">
<config id="default">
<property name="org.wso2.ws.dataservice.driver">oracle.jdbc.driver.OracleDriver</property>
<property name="org.wso2.ws.dataservice.protocol">jdbc:oracle:thin:@localhost:1521:DBMB</property>
<property name="org.wso2.ws.dataservice.user">****</property>
<property name="org.wso2.ws.dataservice.password">****</property>
</config>
<query id="q3" useConfig="default">
<sql>call getCustomer2(?)</sql>
<result element="customers">
<element name="customer" arrayName="custArray" column="cust" optional="true"/>
</result>
<param name="cust" paramType="ARRAY" sqlType="ARRAY" type="OUT" structType="LIST_CUSTOMERS" />
</query>
<operation name="op3">
<call-query href="q3" />
</operation>
</data>
and it returns:
<customers xmlns="http://ws.wso2.org/dataservice">
<customer>{1,prabath}</customer>
<customer>{2,jorge}</customer>
</customers>
but I want something like this:
<customers xmlns="http://ws.wso2.org/dataservice">
<customer>
<id>1</id>
<name>prabath</name>
</customer>
<customer>
<id>2</id>
<name>Jorge</name>
</customer>
</customers>
How can I accomplish this?
[1] http://prabathabey.blogspot.com/2012/05/query-udtsuser-defined-types-with-wso2.html
I'm not sure whether this kind of transformation can be done at the DSS level, because DSS gives back what it receives from the database. It is better to use WSO2 ESB for this kind of transformation.
At the moment it is not possible to accomplish this scenario using DSS alone. The DS response must be sent to WSO2 ESB, which performs the transformation before the response is sent to the client. A JIRA was created to support this in the future: https://wso2.org/jira/browse/DS-1104
As a workaround, you can use a procedure returning a sys_refcursor.
It would look like this:
PROCEDURE getCustomer_CUR(cur_cust OUT SYS_REFCURSOR)
IS
l_cust LIST_CUSTOMERS;
BEGIN
-- Retrieve cust list:
getCustomer2(l_cust);
OPEN cur_cust FOR
select cast(multiset(select * from TABLE(l_cust)) as customer_t) from dual;
...
END;
Then you can do your DSS mapping something like:
<sql>call getCustomer_CUR(?)</sql>
<result element="customers">
<element arrayName="custArray" name="Customers">
<element column="custArray[0]" name="col0" xsdType=.../>
...
</element>
</result>
<param name="cust" sqlType="ORACLE_REF_CURSOR" type="OUT"/>
It is tedious but it works.

How to get multiple nodes under 1 single node with T-SQL

My xml file looks something like this:
<PackageRuntimeContext xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<UserToken>
<Id>449694</Id>
</UserToken>
<Addresses>
<Address>
<LastSeen xsi:nil="true" />
<UniqueID>9afd29f6-f4fe-4a91-aade-da8a3fcdc358</UniqueID>
<IsPrimary>true</IsPrimary>
<Id>0</Id>
<OrderID>0</OrderID>
<SubjectId>0</SubjectId>
<AddressLine1>123 Main St.</AddressLine1>
<City>louisville</City>
<State>KY</State>
<ZipCode>40206</ZipCode>
</Address>
<Address>
<LastSeen xsi:nil="true" />
<UniqueID>0ae8014e-a950-48f3-8ee6-3526a7f3a50d</UniqueID>
<IsPrimary>true</IsPrimary>
<Id>0</Id>
<OrderID>0</OrderID>
<SubjectId>0</SubjectId>
<AddressLine1>789 Elm St.</AddressLine1>
<City>louisville</City>
<State>KY</State>
<ZipCode>40206</ZipCode>
</Address>
<Address>
<LastSeen xsi:nil="true" />
<UniqueID>b1bcc271-bec8-432f-b968-25430ba63b95</UniqueID>
<IsPrimary>false</IsPrimary>
<Id>0</Id>
<OrderID>0</OrderID>
<SubjectId>0</SubjectId>
<AddressLine1>456 Oak St.</AddressLine1>
<City>louisville</City>
<State>KY</State>
<ZipCode>40206</ZipCode>
</Address>
</Addresses>
I want to get the <Id> number 449694, and with it, the 3 (or whatever) subsequent <UniqueID> numbers under Addresses/Address so it looks something like this:
IDNumber UniqueID
======== ========
449694 9afd29f6-f4fe-4a91-aade-da8a3fcdc358
449694 0ae8014e-a950-48f3-8ee6-3526a7f3a50d
449694 b1bcc271-bec8-432f-b968-25430ba63b95
The code I found here (How to query values from xml nodes?) directed me to write something like this:
SELECT
t.p.value('(./UserToken/Id)[1]', 'int') [IdNumber],
t.p.value('(./Addresses/Address/UniqueID)[1]', 'varchar(max)') [Context]
FROM product.PackageRuntimeState prs WITH(NOLOCK)
CROSS APPLY prs.Context.nodes('/PackageRuntimeContext') t(p)
My results were:
IDNumber UniqueID
======== ========
449694 9afd29f6-f4fe-4a91-aade-da8a3fcdc358
449694 b8439471-d4b9-46db-9321-b6175e1b8fb4 (this is from ANOTHER record)
449694 b8439471-d4b9-46db-9321-b6175e1b8fb4 (this too is from another record)
What do I need to do to my code to get the subsequent UniqueID nodes from my xml file?
Thanks!
Drop down one more level. You need to list the direct descendants of <Addresses>, not <PackageRuntimeContext>:
SELECT
t.p.value('(../../UserToken/Id)[1]', 'int') [IdNumber],
t.p.value('(./UniqueID)[1]', 'varchar(max)') [Context]
FROM product.PackageRuntimeState prs WITH(NOLOCK)
CROSS APPLY prs.Context.nodes('/PackageRuntimeContext/Addresses/Address') t(p)
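As an alternative sketch, the same result can be obtained with two chained .nodes() calls, which avoids navigating back up with ../.. (the parent axis is often reported to perform poorly in SQL Server XQuery). The table variable below is an assumption standing in for product.PackageRuntimeState, and the XML is trimmed to the elements used.

```sql
-- Assumption: a table variable standing in for product.PackageRuntimeState.
DECLARE @PackageRuntimeState TABLE (Context xml);

INSERT INTO @PackageRuntimeState (Context) VALUES (N'
<PackageRuntimeContext>
  <UserToken><Id>449694</Id></UserToken>
  <Addresses>
    <Address><UniqueID>9afd29f6-f4fe-4a91-aade-da8a3fcdc358</UniqueID></Address>
    <Address><UniqueID>0ae8014e-a950-48f3-8ee6-3526a7f3a50d</UniqueID></Address>
    <Address><UniqueID>b1bcc271-bec8-432f-b968-25430ba63b95</UniqueID></Address>
  </Addresses>
</PackageRuntimeContext>');

-- First APPLY: one row per document root (gives access to UserToken/Id).
-- Second APPLY: one row per Address under that root.
SELECT c.ctx.value('(UserToken/Id/text())[1]', 'int')       AS IdNumber,
       a.addr.value('(UniqueID/text())[1]', 'varchar(36)')  AS UniqueID
FROM @PackageRuntimeState AS prs
CROSS APPLY prs.Context.nodes('/PackageRuntimeContext') AS c(ctx)
CROSS APPLY c.ctx.nodes('Addresses/Address') AS a(addr);
```

Each Address row keeps the Id of its own document, so values from other records cannot leak into the result.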

Resources