Getting an XML element from a variant column of a Snowflake table - snowflake-cloud-data-platform

This is my sample data (one row, shown as column: value):
ID: 6tgbcrq9pfhj1ezsdo82mcrzz
VERSION: o
ACT_TYPE: SCREENED_CASE
EVE_TYPE: WORLDCHECK
CLI_ID: o
DETAILS: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><testPayload><testId>565656-21cf-4c7e-8071-574a1ef78981</testId><testCode>COMPLETED</testCode><testState>TEST</testState><testResults>1</testResults><testRequiredResults>0</testRequiredResults><testExcludedResults>0</testExcludedResults><testAutoResolvedResults>1</testAutoResolvedResults><testproviderTypes>WATCHLIST</testproviderTypes></testPayload>
OBJ_TYPE: CASE
DATE_TIME: 9/16/2020 9:45
AAPP_EVENT_TO_UTC_DT: 9/16/2020 9:45
GRO_ID: erutrt7-d726-4672-8599-83d21927bec5
OBJECT_NAME: o
OBJ_ID: 5786765dfgdfgdfg
USER_NAME: System User
USER_ID: USER_SYSTEM
EVENT_ID: o
FINDINGS: o
SUMMARY: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><testCaseEventSummary><testTypes>WATCHLIST</testTypes><testResults>1</testResults></testCaseEventSummary>
I need to get ID and testId from this table.
Please note that testId is inside the DETAILS column, which contains XML.
I was trying something like the query below, but it's not working:
select DETAILS:"$" from audit_event;

First you need to parse it as XML with parse_xml, and then use the xmlget function to get the testId field, like this:
select xmlget(parse_xml(details), 'testId'):"$"::varchar from audit_event
I assumed you wanted it back as a varchar so I put the ::varchar at the end to do that.
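Since the question asked for ID together with testId, the same expression can sit next to the ID column. This is only a sketch, assuming the table is audit_event with columns ID and DETAILS as in the question; the aliases test_id and test_code are made up for illustration. If DETAILS is stored as a VARIANT rather than a string, a cast to varchar before parse_xml may be needed.

```sql
-- Assumption: audit_event(ID, DETAILS) as described in the question.
-- test_id and test_code are hypothetical aliases.
select
    id,
    xmlget(parse_xml(details), 'testId'):"$"::varchar   as test_id,
    xmlget(parse_xml(details), 'testCode'):"$"::varchar as test_code
from audit_event;
```

Each xmlget call pulls one child element of the root, so further elements from the sample (testState, testResults and so on) could be added the same way.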

Related

Querying selection of values in XML in SQL Server, on table of mixed values

I have a table with people who have bought tickets for a charity evening event, and the table contains details of registration event, and the XML will show guests they are bringing with them, but also details of any dietary requirements, and the occasional person who might be disabled. This is supposed to be pushed to our CRM system but this is not currently working.
I'm trying to extract some values out of some XML which is in a column in our import table.
I've seen plenty of examples of querying ordinary chunks of XML, but not when the XML is inside a table with other normal INT and VARCHAR values.
We are using SQL Server 2014. I've spent hours googling but haven't the faintest idea on making a query that combined the two together. Or even if I'm supposed to push the XML stuff into a temp table which I could then do a join with.
Declare @xmlstring xml = '<field_import_admin_event_tickets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<und is_array="true">
<item>
<value>8463</value>
<revision_id>4763</revision_id>
</item>
</und>
</field_import_admin_event_tickets>'
select
MainDataCenter.Col.value('(value)[1]', 'varchar(max)') as Name,
MainDataCenter.Col.value('(revision_id)[1]', 'varchar(max)') as Value
from
@xmlstring.nodes('/field_import_admin_event_tickets/und/item') as MainDataCenter(Col)
^ this will work
but I need to query it along with this:-
SELECT *
FROM [importtickets].[bcc].[entityform]
WHERE type LIKE '%show%'
AND createdDATETIME > '2019-03-14'
AND LEN(CAST(field_import_admin_event_tickets AS VARCHAR(MAX)) ) >1
-- bodging a way of seeing if XML code exists or not, doesn't seem to work with IS NOT NULL
AND Jobstatus = 'completed'
The only way I can crudely get values out of the XML is CAST it to a VARCHAR and use lots of REPLACE commands to strip out the XML tags to get it down to the values. There may be 2 to 18 numeric values in each lump of XML
This is my first post on StackOverflow and I've spent days searching on this, so please be gentle with me. Thanks.
2019-07-10 Hey, so I didn't make this fully clear.
Each column of XML (a few are NULL) contains 2 to 34 separate numbers. I did some crude manipulation of the data by CASTing it to VARCHAR and running lots of REPLACE commands to understand it better.
This is the largest example here of some XML: 34 integer values, of which 17 are 'value' and 17 are 'revision_id'.
So I then pushed this all into a new table using lots of SUBSTRING. This is crude but effective, but it assumes each value is five digits long (it is so far). My boss is not keen on this solution though.
crudely shredded XML using CAST to VARCHAR and tags manually stripped out
I just need each sets of values extracted in each row so I can then do a JOIN or subquery to them, with a row or something identifiable. The numbers will refer to a guest who is coming to some charity events which will have some attributes such as dietary requirements or disability.
I don't know, if this is the very best approach for your issue, but I hope that I got your question correctly, that you want to combine the working query against an isolated XML with the tabular query, where the XML is the content of a column:
First of all I create a mockup with two rows
DECLARE @mockupTable TABLE(ID INT IDENTITY,SomeOtherValue VARCHAR(100),YourXml XML);
INSERT INTO @mockupTable(SomeOtherValue,YourXml) VALUES
('This is some value in row 1'
,'<field_import_admin_event_tickets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<und is_array="true">
<item>
<value>8463</value>
<revision_id>4763</revision_id>
</item>
</und>
</field_import_admin_event_tickets>')
,('This is some value in row 2'
,'<field_import_admin_event_tickets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<und is_array="true">
<item>
<value>999</value>
<revision_id>888</revision_id>
</item>
</und>
</field_import_admin_event_tickets>');
--The query
SELECT t.ID
,t.SomeOtherValue
,MainDataCenter.Col.value('(value)[1]', 'varchar(max)') as Name
,MainDataCenter.Col.value('(revision_id)[1]', 'varchar(max)') as Value
FROM @mockupTable t
CROSS APPLY t.YourXml.nodes('/field_import_admin_event_tickets/und/item') as MainDataCenter(Col);
The result:
ID  SomeOtherValue               Name  Value
1   This is some value in row 1  8463  4763
2   This is some value in row 2  999   888
The idea in short:
APPLY allows you to call a table-valued function row-wise. In this case we hand the content of a column (in your case the XML) to the built-in method .nodes().
Similar to a JOIN we get a joined set, which adds columns (and rows) to the final set. We can use the .value() method to retrieve the actual values from the XML.
If this is the best approach? I don't know...
Your sample above shows just one single <item>. .nodes() would be needed to return several <item> elements in a derived set. With just one <item> this could be done more easily using .value() directly...
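To illustrate that last remark, here is a rough sketch of the .value()-only variant for the case where every row holds exactly one <item>. The table variable @singleItem and the column alias RevisionId are made up for this example, and the int target type is a guess based on the sample values.

```sql
-- Hypothetical mockup: one <item> per row
DECLARE @singleItem TABLE (ID INT IDENTITY, SomeOtherValue VARCHAR(100), YourXml XML);
INSERT INTO @singleItem (SomeOtherValue, YourXml) VALUES
('This is some value in row 1',
 '<field_import_admin_event_tickets><und is_array="true"><item><value>8463</value><revision_id>4763</revision_id></item></und></field_import_admin_event_tickets>');

-- No .nodes() and no APPLY: the full path goes straight into .value()
SELECT t.ID
      ,t.SomeOtherValue
      ,t.YourXml.value('(/field_import_admin_event_tickets/und/item/value/text())[1]', 'int') AS Value
      ,t.YourXml.value('(/field_import_admin_event_tickets/und/item/revision_id/text())[1]', 'int') AS RevisionId
FROM @singleItem t;
```

With several <item> elements per row this would only ever return the first one, which is why the .nodes() approach is needed in the general case.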

Find if a list contains several values - SQL Server Xquery

I am storing XML data in a table called BikeTable. The XML data comes from an object that is serialized using the .NET serializer.
BikeTable would look like this :
Id - UniqueIdentifier
XmlData - XML
The XML stored in the XmlData column looks like this :
Record 1 :
<Bike>
<Material>
<Cage>EIECH</Cage>
<Mpn>B258-C436-B001</
</Material>
<Roles>
<string>Race</string>
<string>Mountain</string>
<string>City</string>
</Roles>
</Bike>
Record 2 :
<Bike>
<Material>
<Cage>ABCDE</Cage>
<Mpn>B258-C436-B001</Mpn>
</Material>
<Roles>
<string>Race</string>
</Roles>
</Bike>
I want to be able to find the records in my table that will contain for example Race and Mountain.
For example, if I want the Ids of the records that contain 'Road' and 'Mountain', the only way I found is like this:
select Id
from BikeTable
where XmlData.exist('/Bike/Roles/string[contains(., "Road")]') = 1
or XmlData.exist('/Bike/Roles/string[contains(., "Mountain")]') = 1
I don't like this option because it forces me to generate the query if I want to find records that would match one or several roles.
Roles can contain an unlimited number of values, and I need to be able to find the records that match one or more values.
Ex: records containing Race, records containing Race or Mountain, records containing City, records containing City and Mountain, etc.
Is there any way to know if a list contains several values?
Yes, you can. This is a bit of a guess though, as you say you want to do a SELECT *; something that is impossible to provide any data for without the DDL of the table. Thus, instead, I've returned the Cage and Mpn of the Bike:
CREATE TABLE BikeTable (xmlData xml);
--The closing tag for Mpn was missing in your sample data; I assume it wasn't meant to be
INSERT INTO BikeTable
VALUES('<Bike>
<Material>
<Cage>EIECH</Cage>
<Mpn>B258-C436-B001</Mpn>
</Material>
<Roles>
<string>Race</string>
<string>Mountain</string>
<string>City</string>
</Roles>
</Bike>')
GO
WITH Bikes AS (
SELECT B.Material.value('(Cage/text())[1]','varchar(15)') AS Cage, --Data Type guessed
B.Material.value('(Mpn/text())[1]','varchar(15)') AS Mpn, --Data Type guessed
BR.String.value('(./text())[1]','varchar(15)') AS String --Data Type guessed
FROM BikeTable BT
CROSS APPLY BT.xmlData.nodes('/Bike/Material') B(Material)
CROSS APPLY BT.xmlData.nodes('/Bike/Roles/string') BR(String))
SELECT Cage, Mpn
FROM Bikes
GROUP BY Cage, Mpn
HAVING COUNT(String) > 1;
GO
DROP TABLE BikeTable;
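If the goal is to match a specific, variable list of roles without generating the query text, one possible variation is to put the wanted roles into a table and join the shredded roles against it. The following is only a sketch under assumptions: the table variables @BikeTable and @Wanted and the column RoleName are made up for illustration, and the Id column is taken from the table description in the question.

```sql
-- Hypothetical mockup of the table from the question
DECLARE @BikeTable TABLE (Id uniqueidentifier DEFAULT NEWID(), XmlData xml);
INSERT INTO @BikeTable (XmlData) VALUES
('<Bike><Material><Cage>EIECH</Cage><Mpn>B258-C436-B001</Mpn></Material>
  <Roles><string>Race</string><string>Mountain</string><string>City</string></Roles></Bike>'),
('<Bike><Material><Cage>ABCDE</Cage><Mpn>B258-C436-B001</Mpn></Material>
  <Roles><string>Race</string></Roles></Bike>');

-- The roles to look for; any number of rows can go in here
DECLARE @Wanted TABLE (RoleName varchar(15) PRIMARY KEY);
INSERT INTO @Wanted (RoleName) VALUES ('Race'), ('Mountain');

-- Records that contain ALL wanted roles
SELECT BT.Id
FROM @BikeTable BT
CROSS APPLY BT.XmlData.nodes('/Bike/Roles/string') BR(RoleNode)
JOIN @Wanted W
  ON W.RoleName = BR.RoleNode.value('(./text())[1]', 'varchar(15)')
GROUP BY BT.Id
HAVING COUNT(DISTINCT W.RoleName) = (SELECT COUNT(*) FROM @Wanted);
```

Dropping the HAVING clause turns this into "records that contain at least one of the wanted roles". Note that this compares whole role values with equality, not with contains() as in the question.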

Avoid double XML INSERT to SQL

I need to import XML data into SQL Server 2012. The import works correctly, but I would want to avoid double import. I already tried with WHERE NOT EXISTS but it didn't work.
The import:
INSERT INTO dbo.tXMLImport(cText)
SELECT cast(CONVERT(XML,x.BulkColumn,2) AS varchar(max))
FROM OPENROWSET (BULK 'D:\XML\Data.xml', SINGLE_BLOB) AS x
XML file content:
<?xml version="1.0" encoding="UTF-8"?>
<tOrder>
<cName>Name1</cName>
<cID>100</cID>
</tOrder>
Now I need to check whether the cID value 100 from the XML file already exists in the cOrderNumber column of dbo.tOrder:
cOrderNumber
1 100
2 101
3 102
The following extension does not work:
WHERE NOT EXISTS(SELECT *
FROM dbo.tOrder
WHERE x.value('(/tOrder/cID)') = dbo.tOrder.CorderNumber)
If it does exist, no import should be done. Maybe someone can help me with this?
Thanks in advance.
I'm not sure if I really get this... If the same cOrderNumber exists already, wouldn't you try to update the existing row? Something like you'd do with MERGE?
But It might be something like this what you are looking for:
WHERE NOT EXISTS(SELECT 1 FROM dbo.tOrder
WHERE x.exist(N'/tOrder[cID/text()=sql:column("cOrderNumber")]')=1)
(Untested air code)
This checks whether there is any record in tOrder for which the XML in x has any occurrence of a <tOrder><cID> node with a value equal to the current cOrderNumber value.
T-SQL adds the sql:column() function to XQuery, which allows you to use the value of a column of the current row within the query. There's sql:variable() too.
The XML method .exist() checks the XML for the existence of a given condition and returns 0 or 1.
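As a small stand-alone illustration of .exist() together with sql:variable(), here is a sketch using an XML variable instead of a table column. The variable names @xml and @orderNumber and the alias OrderFound are made up for this example, and like the code above it is untested.

```sql
-- Hypothetical variables for illustration
DECLARE @xml XML = N'<tOrder><cName>Name1</cName><cID>100</cID></tOrder>';
DECLARE @orderNumber INT = 100;

-- .exist() yields 1 if the predicate matches a node, otherwise 0
SELECT @xml.exist(N'/tOrder[cID/text() = sql:variable("@orderNumber")]') AS OrderFound;
```

The comparison sits inside a predicate ([...]) on purpose: .exist() reports whether the path returns any node, so the condition has to filter nodes rather than be the final result of the expression.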
UPDATE
After reading your question once again, I'm not sure if I got this correctly... Please check the following. If this doesn't help, please use my code to set up a stand-alone sample to reproduce your issue:
A dummy table with some orders
DECLARE @YourTable TABLE(cOrderNumber INT, OrderName VARCHAR(100));
INSERT INTO @YourTable VALUES
(100,'Order 100')
,(200,'Order 200')
,(300,'Order 300');
--Try to insert an XML with the existing OrderNumber=100
DECLARE @xml100 XML=
'<tOrder>
<cName>Name1</cName>
<cID>100</cID>
</tOrder>';
INSERT INTO @YourTable(cOrderNumber,OrderName)
SELECT @xml100.value('(/tOrder/cID/text())[1]','int')
,@xml100.value('(/tOrder/cName/text())[1]','varchar(100)')
WHERE NOT EXISTS(SELECT 1 FROM @YourTable AS t2
WHERE t2.cOrderNumber=@xml100.value('(/tOrder/cID/text())[1]','int'));
--Same code as above, but the order number is now a not existing number
DECLARE @xml101 XML=
'<tOrder>
<cName>Name1</cName>
<cID>101</cID>
</tOrder>';
INSERT INTO @YourTable(cOrderNumber,OrderName)
SELECT @xml101.value('(/tOrder/cID/text())[1]','int')
,@xml101.value('(/tOrder/cName/text())[1]','varchar(100)')
WHERE NOT EXISTS(SELECT 1 FROM @YourTable AS t2
WHERE t2.cOrderNumber=@xml101.value('(/tOrder/cID/text())[1]','int'));
--check the result
SELECT *
FROM @YourTable;
nr name
-------------
100 Order 100
200 Order 200
300 Order 300
101 Name1

Diff on SQL Server XML Data Type?

I have an automated process that inserts an XML document into SQL Server 2008 table, the column is of Type XML. There is a lot of duplicated data, I wonder if anyone can recommend a good way to delete non-distinct values based on the XML column? The table has thousands of rows and each XML document is about 70k.
Each XML document looks the same except for one element value, for example:
Row 1 , Column C:
<?xml version="1.0"?><a><b/><c>2010.09.28T10:10:00</c></a>
Row 2, Column C:
<?xml version="1.0"?><a><b/><c>2010.09.29T10:10:00</c></a>
I want the value of <c> to be ignored when it comes to the diff. If everything else is equal, then I want to consider the documents to be the same. If any other element is different, then the documents would be considered different.
Thanks for all ideas.
Can you qualify what 'distinct XML' means for you? For example what is the difference between:
<a><b/></a>
<?xml version="1.0"?><a><b/></a>
<a xmlns:xhtml="http://www.w3.org/1999/xhtml"><b/></a>
<a><b xsi:nil="true" /></a>
<a><b></b></a>
<?xml version="1.0" encoding="UTF-8"?><a><b/></a>
<?xml version="1.0" encoding="UTF-16"?><a><b></b></a>
In your opinion, how many 'distinct' XMLs are there?
Updated
If your XML looks like: <?xml version="1.0"?><a><b/><c>2010.09.29T10:10:00</c></a> then you can project the element that distinguishes the rows and query on this projection:
with cte_x as (
select xmlcolumn.value(N'(//a/c)[1]', N'DATETIME') as xml_date_a_c,
...
from table
),
cte_rank as (
select row_number() over (partition by xml_date_a_c order by ...) as rn
from cte_x)
delete from cte_rank
where rn > 1;

SQL Server 2005 - searching for value in XML field

I'm trying to query a particular value in an XML field. I've seen lots of examples, but they don't seem to be what I'm looking for
Supposing my xml field is called XMLAttributes and table TableName, and the complete xml value is like the below:
<Attribute name="First2Digits" value="12" />
<Attribute name="PurchaseXXXUniqueID" value="U4RV123456762MBE79" />
(although the xml field will frequently have other attributes, not just PurchaseXXXUniqueID)
If I'm looking for a specific value in the PurchaseXXXUniqueID attribute name - say U4RV123456762MBE79 - how would I write the query? I believe it would be something like:
select *
from TableName
where XMLAttributes.value('(/path/to/tag)[1]', 'varchar(100)') = '5FTZP2QT8Z3E2MAV2D'
... but it's the path/to/tag that I need to figure out.
Or probably there's other ways of getting the values I want.
To summarize - I need to get all the records in a table where the value of a particular attribute in the xml field matches a value I'll pass to the query.
thanks for the help!
Sylvia
edit: I was trying to make this simpler, but in case it makes a difference - ultimately I'll have a temporary table of 50 or so potential values for the PurchaseXXXUniqueID field. For these, I want to get all the matching records from the table with the XML field.
This ought to work:
SELECT
(fields from base table),
Nodes.Attr.value('(@name)[1]', 'varchar(100)'),
Nodes.Attr.value('(@value)[1]', 'varchar(100)')
FROM
dbo.TableName
CROSS APPLY
XMLAttributes.nodes('/Attribute') AS Nodes(Attr)
WHERE
Nodes.Attr.value('(@name)[1]', 'varchar(100)') = 'PurchaseXXXUniqueID'
AND Nodes.Attr.value('(@value)[1]', 'varchar(100)') = 'U4RV123456762MBE79'
You basically need to join the base table's row against one "pseudo-row" for each of the <Attribute> nodes inside the XML column, and then pick out the individual attribute values from the <Attribute> node to select what you're looking for.
Something like that?
declare @PurchaseXXXUniqueID varchar(max)
set @PurchaseXXXUniqueID = 'U4RV123456762MBE79';
select * from TableName t
where XMLAttributes.exist('//Attribute[@value = sql:variable("@PurchaseXXXUniqueID")]') = 1
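For the follow-up case from the edit (a temporary table of 50 or so potential values), the CROSS APPLY approach can be joined against that table. This is a sketch only: the temp table #Candidates and its column PurchaseId are hypothetical names, dbo.TableName and XMLAttributes are taken from the question, and the syntax is kept to what SQL Server 2005 accepts.

```sql
-- Hypothetical temp table holding the candidate values
CREATE TABLE #Candidates (PurchaseId varchar(100) NOT NULL PRIMARY KEY);
INSERT INTO #Candidates (PurchaseId) SELECT 'U4RV123456762MBE79';
INSERT INTO #Candidates (PurchaseId) SELECT '5FTZP2QT8Z3E2MAV2D';

-- One pseudo-row per matching <Attribute>, joined to the candidates
SELECT t.*, c.PurchaseId
FROM dbo.TableName t
CROSS APPLY t.XMLAttributes.nodes('/Attribute[@name="PurchaseXXXUniqueID"]') AS Nodes(Attr)
INNER JOIN #Candidates c
    ON c.PurchaseId = Nodes.Attr.value('(@value)[1]', 'varchar(100)');

DROP TABLE #Candidates;
```

Filtering on @name inside the .nodes() path keeps the other attributes in the XML from producing pseudo-rows that would never match anyway.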

Resources