Not able to parse JSON in synapse SQL with OPENJSON

This question follows on from my earlier SO thread.
In a nutshell, I am trying to parse a JSON input in Synapse SQL.
DECLARE @json nvarchar(max)
SET @json= '{"value": "{\"value\":[{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},
{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},"type": "String"}';
DECLARE @ReplacedDetails nvarchar(max), @ReplacedStringDetails nvarchar(max)
SET @ReplacedDetails = REPLACE(LTRIM(RTRIM(@json)),'\','');
SET @ReplacedStringDetails = REPLACE(@ReplacedDetails,',"type": "String"','');
SELECT @ReplacedStringDetails
CREATE TABLE #ValueTable_15
(
ColumnName varchar(200),
LastUpdatedValue varchar(200)
);
INSERT INTO #ValueTable_15 (ColumnName,LastUpdatedValue)
SELECT TOP(1) j2.[key],TRY_PARSE(j2.[value] as bigint) AS LastUpdatedValue
FROM OPENJSON(@ReplacedStringDetails, '$.value.value') j1
CROSS APPLY OPENJSON(j1.value) j2
ORDER BY LastUpdatedValue DESC;
When I run the above query, I get this error:
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server]JSON text is not properly formatted. Unexpected character 'v' is found at position 13
When I simply SELECT @ReplacedStringDetails, it gives the expected results.
What am I missing here?
P.S. I replaced $.value.value with a simple $.value, but that yields no result.

As noted in the comments it appears that the characters ]}" got removed before ,"type": "String"}. If you correct that edit you should be able to parse the JSON using the normal OPENJSON() table-valued function, e.g.:
DECLARE @json nvarchar(max)
SET @json= '{"value": "{\"value\":[{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"}]}","type": "String"}';
SELECT [embedded].*
FROM OPENJSON(@json) WITH (
[value] nvarchar(max) --nope: AS JSON
) [wrapper]
CROSS APPLY OPENJSON(wrapper.value, '$.value') WITH (
[ERDAT] nvarchar(10)
) [embedded];
Which returns the results...
ERDAT
20210511
20210511
20210511
20210511
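The query needs two OPENJSON passes because the inner document is stored as a JSON-encoded string, not as a nested object, so it has to be decoded twice. As a language-neutral check outside of Synapse, here is a minimal Python sketch of the same two-pass decoding. The helper name `extract_erdat` is hypothetical, the payload is shortened to two rows, and only the standard `json` module is assumed:

```python
import json

def extract_erdat(payload):
    """Decode the wrapper, then decode the JSON string stored in its "value" field."""
    outer = json.loads(payload)          # first pass: the wrapper object
    inner = json.loads(outer["value"])   # second pass: "value" is itself JSON text
    return [row["ERDAT"] for row in inner["value"]]

# Shortened copy of the corrected payload from the answer (raw string keeps the backslashes).
payload = r'{"value": "{\"value\":[{\"ERDAT\":\"20210511\"},{\"ERDAT\":\"20210511\"}]}","type": "String"}'

print(extract_erdat(payload))
```

If the closing `]}"` is missing, as in the question, the first decoding pass already fails, which is consistent with the "not properly formatted" error.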

Related

SQL Server 2016 extract info from XML

I've been through various posts on the same subject but I can't seem to be able to get to the data elements in my XML file.
Here is a snippet of my XML :
<ed:Certificate xmlns="http://sancrt.mpi.govt.nz/ecert/2013/ed-multiple-submission-schema.xsd" xmlns:ed="http://sancrt.mpi.govt.nz/ecert/2013/ed-submission-schema.xsd"> <ed:Status Code="39">Approved</ed:Status> <ed:LastUpdatedDate>2021-03-10T14:20:55+13:00</ed:LastUpdatedDate> <ed:Identifiers>
<ed:CertificateID>NZL2021/MEABC/26913T</ed:CertificateID>
<ed:TemplateID>ED1.6</ed:TemplateID> </ed:Identifiers> <ed:Exhausted>true</ed:Exhausted> <ed:AutoApproval>false</ed:AutoApproval> <ed:DepartureDate>2021-03-10</ed:DepartureDate> <ed:Parties>
<ed:ConsignorID>MEABC</ed:ConsignorID>
<ed:ConsigneeID>FLIGHT1</ed:ConsigneeID> </ed:Parties> <ed:Transport>
<ed:Ports>
<ed:LoadingPortID>NZTRG</ed:LoadingPortID>
</ed:Ports>
<ed:FinalDestination>OAKLAND, United States</ed:FinalDestination>
<ed:TransportMode>1</ed:TransportMode>
<ed:LocalCarrier>MDH2</ed:LocalCarrier>
<ed:CarrierName> Ever Given</ed:CarrierName>
<ed:ConveyanceReference>V1234</ed:ConveyanceReference> </ed:Transport> <ed:Remarks>
<ed:Remark>
<ed:RemarkType>Unofficial Information</ed:RemarkType>
<ed:RemarkValue>Vessel ETD - 19/03/21\nTARE WEIGHT - 2880 KGS</ed:RemarkValue>
</ed:Remark> </ed:Remarks> <ed:Products>
<ed:Product>
<ed:ProductItem>1</ed:ProductItem>
<ed:Exhausted>true</ed:Exhausted>
<ed:Origin>AO</ed:Origin>
<ed:Description>BONELESS BEEF RUMP CAP</ed:Description>
<ed:CommonName>Bovine</ed:CommonName>
<ed:EligibilityCountries>
<ed:EligibilityCountryID>US</ed:EligibilityCountryID>
</ed:EligibilityCountries>
<ed:IntendedUse>consumption</ed:IntendedUse>
<ed:GrossWeight unitCode="KGM">296.4</ed:GrossWeight>
<ed:NetWeight unitCode="KGM">271.6</ed:NetWeight>
<ed:Remarks>
<ed:Remark>
<ed:RemarkType>Product Statement</ed:RemarkType>
<ed:RemarkValue>Item No. 81625\nLabel Approval 2659305 &amp; 91060858</ed:RemarkValue>
</ed:Remark>
</ed:Remarks>
<ed:Classifications>
<ed:Classification>
<ed:ClassificationType>Temperature</ed:ClassificationType>
<ed:ClassificationValue>chilled</ed:ClassificationValue>
</ed:Classification>
<ed:Classification>
<ed:ClassificationType>New Zealand Harmonised System Code</ed:ClassificationType>
<ed:ClassificationValue>020130</ed:ClassificationValue>
</ed:Classification>
<ed:Classification>
<ed:ClassificationType>Halal Product</ed:ClassificationType>
<ed:ClassificationValue>1</ed:ClassificationValue>
</ed:Classification>
</ed:Classifications>
<ed:Containers>
<ed:Container>
<ed:ID>CGMU3099999</ed:ID>
<ed:Seals>
<ed:ID>NZMPIXXXXX</ed:ID>
</ed:Seals>
</ed:Container>
</ed:Containers>
<ed:Packaging>
<ed:Package>
<ed:Quantity>29</ed:Quantity>
<ed:Type>CT</ed:Type>
<ed:Level>1</ed:Level>
<ed:ShippingMarks>
<ed:Name>MABC\n26913</ed:Name>
</ed:ShippingMarks>
</ed:Package>
</ed:Packaging>
<ed:Processes>
<ed:Process>
<ed:ProcessTypeCode>SLT</ed:ProcessTypeCode>
<ed:StartDate>2021-03-01</ed:StartDate>
<ed:EndDate>2021-03-01</ed:EndDate>
<ed:DateOverride>false</ed:DateOverride>
<ed:Premise>
<ed:ID>MEABC</ed:ID>
</ed:Premise>
</ed:Process>
<ed:Process>
<ed:ProcessTypeCode>PRO</ed:ProcessTypeCode>
<ed:StartDate>2021-03-02</ed:StartDate>
<ed:EndDate>2021-03-02</ed:EndDate>
<ed:DateOverride>false</ed:DateOverride>
<ed:Premise>
<ed:ID>MEABC</ed:ID>
</ed:Premise>
</ed:Process>
<ed:Process>
<ed:ProcessTypeCode>CST</ed:ProcessTypeCode>
<ed:StartDate>2021-03-02</ed:StartDate>
<ed:EndDate>2021-03-10</ed:EndDate>
<ed:DateOverride>false</ed:DateOverride>
<ed:Premise>
<ed:ID>MEABC</ed:ID>
</ed:Premise>
</ed:Process>
</ed:Processes>
</ed:Product>
</ed:Products>
</ed:Certificate>
This is what I have tried so far - Figured if I can access one element, I can slowly work on the rest
if OBJECT_ID('tempdb..#XmlImportTest') is not null
drop table #XmlImportTest
CREATE TABLE #XmlImportTest(
xmlFileName VARCHAR(300) NOT NULL,
xml_data XML NOT NULL
);
DECLARE @xmlFileName VARCHAR(200) ='K:\Upload\CSNXML\WaybillXml.xml'
EXEC('INSERT INTO #XmlImportTest(xmlFileName, xml_data)
SELECT ''' + @xmlFileName + ''', xmlData
FROM(
SELECT *
FROM OPENROWSET (BULK ''' + @xmlFileName + ''', SINGLE_BLOB) AS XMLDATA
) AS FileImport (XMLDATA)
')
DECLARE @XML AS XML, @hDoc AS INT, @SQL NVARCHAR (MAX)
SELECT @xml = (SELECT xml_data from #XmlImportTest)
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML
;WITH XMLNAMESPACES ('http://sancrt.mpi.govt.nz/ecert/2013/ed-multiple-submission-schema.xsd' AS ed)
SELECT
p.value(N'@ProductItem',N'nvarchar(10)') AS ProductItem
FROM
@xml.nodes('/Certificate')
AS A(p)
CROSS APPLY a.p.nodes(N'Products/Product') AS B(m);
I don't get any results returned.
I get the same result using OPENROWSET as well.
Can someone please tell me how I can access this data element.

You seem to be getting confused about XML Namespaces. The example document defines two namespace URIs:
http://sancrt.mpi.govt.nz/ecert/2013/ed-multiple-submission-schema.xsd, which has no prefix so is considered to be the "default" namespace of the document.
http://sancrt.mpi.govt.nz/ecert/2013/ed-submission-schema.xsd, which uses the ed namespace prefix that, by eyeballing it, seems to be used on every element in the document so might as well be the default namespace.
Your simplest example is trying to extract the value of the /Certificate/Products/Product/ProductItem elements which could be done as simply as:
with xmlnamespaces (
default 'http://sancrt.mpi.govt.nz/ecert/2013/ed-submission-schema.xsd'
)
select productItem.value(N'text()[1]', N'int') as ProductItem
from @xml.nodes('/Certificate/Products/Product/ProductItem') as p(productItem);
Expanding on this to select a few more values, you can see the @ being used here to access the unitCode attribute of an element:
with xmlnamespaces (
default 'http://sancrt.mpi.govt.nz/ecert/2013/ed-submission-schema.xsd'
)
select
product.value(N'(ProductItem/text())[1]', N'int') as ProductItem,
product.value(N'(Exhausted/text())[1]', N'bit') as Exhausted,
product.value(N'(Origin/text())[1]', N'nvarchar(2)') as Origin,
product.value(N'(GrossWeight/text())[1]', N'decimal(19,1)') as GrossWeight,
product.value(N'(GrossWeight/@unitCode)[1]', N'nvarchar(3)') as GrossWeightUnitCode
from @xml.nodes('/Certificate/Products/Product') as p(product);
It should be clear from the above two queries that the namespace prefixes used in the XPath query don't have to be the same as the ones used in the XML document; it's the namespace URIs themselves that matter. The prefixes in the document link the elements (and sometimes attributes) to their namespace URIs, and the prefixes used in XPath can be completely different so long as they reference the correct namespace URIs. E.g. this query returns the same values as the second example above (plus an extra item_remark column), despite there being no submission prefixes in the source XML:
with xmlnamespaces (
'http://sancrt.mpi.govt.nz/ecert/2013/ed-multiple-submission-schema.xsd' as multiple,
'http://sancrt.mpi.govt.nz/ecert/2013/ed-submission-schema.xsd' as submission
)
select
product.value(N'(submission:ProductItem/text())[1]', N'int') as ProductItem,
product.value(N'(submission:Exhausted/text())[1]', N'bit') as Exhausted,
product.value(N'(submission:Origin/text())[1]', N'nvarchar(2)') as Origin,
product.value(N'(submission:GrossWeight/text())[1]', N'decimal(19,1)') as GrossWeight,
product.value(N'(submission:GrossWeight/@unitCode)[1]', N'nvarchar(3)') as GrossWeightUnitCode,
product.value(N'(submission:Remarks/submission:Remark/submission:RemarkType/text())[1]', N'nvarchar(50)') as item_remark
from @xml.nodes('/submission:Certificate/submission:Products/submission:Product') as p(product);
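The same rule, that a query matches on the namespace URI and not on the prefix, can be checked outside SQL Server. The following is only an illustrative Python sketch using the standard `xml.etree.ElementTree` module on a trimmed-down copy of the certificate; the variable names and the cut-down document are assumptions made for the example:

```python
import xml.etree.ElementTree as ET

MULTIPLE = "http://sancrt.mpi.govt.nz/ecert/2013/ed-multiple-submission-schema.xsd"
SUBMISSION = "http://sancrt.mpi.govt.nz/ecert/2013/ed-submission-schema.xsd"

# Trimmed-down version of the certificate: same namespaces, one product.
doc = f"""<ed:Certificate xmlns="{MULTIPLE}" xmlns:ed="{SUBMISSION}">
  <ed:Products>
    <ed:Product>
      <ed:ProductItem>1</ed:ProductItem>
      <ed:GrossWeight unitCode="KGM">296.4</ed:GrossWeight>
    </ed:Product>
  </ed:Products>
</ed:Certificate>"""

root = ET.fromstring(doc)

# The prefix "submission" never appears in the document; only the URI has to match.
ns = {"submission": SUBMISSION}
rows = []
for product in root.findall("submission:Products/submission:Product", ns):
    weight = product.find("submission:GrossWeight", ns)
    rows.append((
        int(product.findtext("submission:ProductItem", namespaces=ns)),
        weight.text,
        weight.get("unitCode"),  # unprefixed attributes are in no namespace
    ))

# Binding the familiar "ed" prefix to the wrong URI (as in the question) finds nothing.
missed = root.findall("ed:Products/ed:Product", {"ed": MULTIPLE})

print(rows, missed)
```

The second lookup mirrors the question's query: the prefix looks right, but it is bound to the multiple-submission URI, so no elements match.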

Transact SQL Pivot a Split String into appropriate columns

My company uses a generic logging database shared among many products. To avoid a lot of cross-database queries, some info is stored in delimited fields within the generic data columns used for logging.
I want to write queries on the data, but I'm unsure how to use PIVOT/UNPIVOT to get the data into appropriate columns.
Below is a generic example using static data for what I want to do. We unfortunately don't have the built-in split string function in SQL Server 2016, so dbo.fnSplitString is my own equivalent, which works fine.
DECLARE @Columns TABLE (
CustomerNumber VARCHAR(MAX) NOT NULL,
FirstName VARCHAR(MAX) NOT NULL,
LastName VARCHAR(MAX) NOT NULL);
/* This isn't valid SQL ... unsure how to get this to work */
INSERT INTO @Columns PIVOT SELECT * FROM dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',');
SELECT * FROM @Columns;
Edit:
Using the examples here and https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx I was able to come up with a solution. The split string function also needs to output a position. This was inspired by one of the solutions below; the only change needed was adding 'OrdinalPosition' to the function's output. The resulting query works.
DECLARE @Columns TABLE (CustomerNumber VARCHAR(MAX) NOT NULL, FirstName VARCHAR(MAX) NOT NULL, LastName VARCHAR(MAX) NOT NULL);
INSERT INTO @Columns
select [0], [1], [2]
from (SELECT position, splitdata FROM dbo.fnSplitString('STUFF1,STUFF4,STUFF3',',')) split
pivot (MAX(splitdata) FOR position in ([0],[1],[2])) piv;
SELECT * FROM @Columns;
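The position column matters because PIVOT needs a stable key to decide which split value lands in which output column. A minimal Python sketch of that mapping, purely as an illustration (the helper `pivot_by_position` is hypothetical and stands in for the split-plus-PIVOT step; it is not part of the original solution):

```python
def pivot_by_position(csv, columns, delimiter=","):
    """Map each split value to a named column using its 0-based position."""
    # Equivalent of the split function returning (position, splitdata) rows.
    by_position = dict(enumerate(csv.split(delimiter)))
    # Equivalent of PIVOT ... FOR position IN ([0],[1],[2]); missing positions become None.
    return {name: by_position.get(i) for i, name in enumerate(columns)}

row = pivot_by_position("STUFF1,STUFF4,STUFF3", ["CustomerNumber", "FirstName", "LastName"])
print(row)
```

Without the position, the values could only be pivoted on their own content, which ties the column names to the data instead of to the field order.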

Is this what you are looking for?
I just used "item" as the column name. Replace it with the column name returned by your fnSplitString function.
DECLARE @Columns TABLE (CustomerNumber VARCHAR(MAX) NOT NULL, FirstName
VARCHAR(MAX) NOT NULL, LastName VARCHAR(MAX) NOT NULL);
INSERT INTO @Columns
select STUFF1,STUFF2,STUFF3 from (SELECT item FROM
dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',')) d
pivot ( max(item) for item in (STUFF1,STUFF2,STUFF3) ) piv;
SELECT * FROM @Columns;

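If your function fnSplitString returns a table with OrdinalPosition and ParsedValue columns, then simply:
INSERT INTO @Columns
SELECT * FROM dbo.fnSplitString('STUFF1,STUFF2,STUFF3',',')
PIVOT
(
MAX(ParsedValue)
-- one pivot column per target column, so the INSERT column counts match
FOR OrdinalPosition in ([0], [1], [2])
) x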

Converting nvarchar to numeric

I have a variable called @prmclientcode which is nvarchar. The input to this variable can be a single client code or multiple client codes separated by commas. E.g.
@prmclientcode='1'
or
@prmclientcode='1,2,3'.
I am comparing this variable to a client code column in one of the tables. The data type of this column is numeric(6,0). I tried converting the variable's data type like below:
SNCA_CLIENT_CODE IN ('''+convert(numeric(6,0),@prmclientcode+''')) (The query is inside dynamic SQL.)
But when I try executing this I get the error
Arithmetic overflow error converting nvarchar to data type numeric.
Can anyone please help me here!
Thanks!

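You need to convert the numeric(6,0) column to the nvarchar data type instead of converting the variable. You can use the script below to convert it to nvarchar before comparing:
CAST(SNCA_CLIENT_CODE AS nvarchar(max)) IN (''' + @prmclientcode + ''')
Note that a comma-separated value such as '1,2,3' is then compared as one whole string, so this only matches when a single client code is passed.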

Please try the code snippet below.
DECLARE @ProductTotals TABLE
(
ProductID int
)
INSERT INTO @ProductTotals VALUES(1)
INSERT INTO @ProductTotals VALUES(11)
INSERT INTO @ProductTotals VALUES(3)
DECLARE @prmclientcode VARCHAR(MAX)='1'
SELECT * FROM @ProductTotals
SELECT * FROM @ProductTotals WHERE CHARINDEX(',' + CAST(ProductID AS VARCHAR(MAX)) + ',' , ',' + ISNULL(@prmclientcode,ProductID) + ',') > 0
Let me know if you have any concerns.
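The CHARINDEX filter above works because both the column value and the list are wrapped in commas before searching, so code 1 cannot match inside 11. A small Python sketch of the same string logic (the function name `code_in_list` is hypothetical, and this only illustrates the comparison, not SQL Server's behaviour):

```python
def code_in_list(code, csv_codes):
    """Return True if code appears as a whole item in the comma-separated list."""
    # Wrapping both sides in commas forces a match on complete items only.
    return f",{code}," in f",{csv_codes},"

print(code_in_list(1, "1,2,3"))   # whole-item match
print(code_in_list(1, "11,3"))    # "1" is only a substring of "11"
print("1" in "11,3")              # the naive substring test gives a false positive
```

This approach assumes the list contains no spaces around the commas; a value such as '1, 2, 3' would need trimming first.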

Use the following code to split your variable:
DECLARE
@T VARCHAR(100) = '1,2,3,23,342',
@I int = 1
;WITH x(I, num) AS (
SELECT 1, CHARINDEX(',',@T,@I)
UNION ALL
SELECT num+1,CHARINDEX(',',@T,num+1)
FROM x
WHERE num+1<LEN(@T)
AND num<>0
)
SELECT SUBSTRING(@T,I,CASE WHEN num=0 THEN LEN(@T)+1 ELSE num END -I)
FROM x

You can use either a table function or a dynamic SQL query; both options will work.
Let me know if you need more help.

sql server stored proc dynamic select

I have a stored proc in the following format
create PROCEDURE [dbo].[test proc]
@identifier varchar(20),
@issuerName varchar(max),
@max_records int=1000
AS
BEGIN
declare @select nvarchar(30)
SELECT @identifier as '@identifier'
, (
SELECT
MoodysOrgID as '@MoodysOrgID'
,ReportDate as '@ReportDate'
,m.UpdateTime as '@UpdateTime'
,m.FileCreationDate as '@FileCreationDate'
from mfm_financial_ratios m
inner join mfm_financial_ratios_coa c on c.AcctNo = m.AcctNo
where ReportDate in (select distinct top (@max_records) reportdate from mfm_financial_ratios where MoodysOrgID = m.MoodysOrgID)
and m.MoodysOrgID=(select top 1 IssuerID_Moodys as id from loans where LIN=@identifier or LoanXID=@identifier
and ParentName_Moodys=@issuerName and IssuerID_Moodys is not null)
order by ReportDate desc
FOR XML PATH('FinRatios'), TYPE
)
FOR XML PATH('FinRatiosHistory')
END
But I would like to make my query execute as dynamic SQL,
and my stored proc looks like this:
create PROCEDURE [dbo].[test proc]
@identifier varchar(20),
@issuerName varchar(max),
@max_records int=1000
AS
BEGIN
declare @select nvarchar(30)
set @select = N'SELECT @identifier as '@identifier'
, (
SELECT
MoodysOrgID as '@MoodysOrgID'
,ReportDate as '@ReportDate'
,m.UpdateTime as '@UpdateTime'
,m.FileCreationDate as '@FileCreationDate'
from mfm_financial_ratios m
inner join mfm_financial_ratios_coa c on c.AcctNo = m.AcctNo
where ReportDate in (select distinct top (@max_records) reportdate from mfm_financial_ratios where MoodysOrgID = m.MoodysOrgID)
and m.MoodysOrgID=(select top 1 IssuerID_Moodys as id from loans where LIN=@identifier or LoanXID=@identifier
and ParentName_Moodys=@issuerName and IssuerID_Moodys is not null)
order by ReportDate desc
FOR XML PATH('FinRatios'), TYPE
)
FOR XML PATH('FinRatiosHistory')'
exec @select
END
The stored proc above gives errors, apparently because of the commas used in it. Can someone let me know what the correct way of doing this would be?

The problem is not the commas. You mostly have two problems: one, you're not escaping the quotes correctly; and two, you're not concatenating your variables correctly. Here's an example of both:
For concatenating variables: In your first select line, you cannot do this:
SELECT @identifier as '@identifier'
because SQL does not know what to do with @identifier that way. You should concatenate the variable this way:
SELECT @identifier as ' + @identifier + '.. everything else goes here
Also, when you concatenate max_records, since it's an int variable you should cast it to varchar first, like this:
select distinct top (' + cast(@max_records as varchar(10)) + ') ....
Whenever you're using a variable in the middle of the string (such as @max_records) you HAVE to concatenate it in order for SQL to know it's a variable and not just a string. You didn't do it with @max_records, @issuerName, etc.
For escaping quotes: You need to escape your single quotes when you don't want your select string to unexpectedly end. For example here:
FOR XML PATH('FinRatiosHistory')'
You should escape each one by doubling it, i.e. writing two single quotes (google "escaping single quotes sql" if you don't get it):
FOR XML PATH(''FinRatiosHistory'')'

SQL Server split CSV into multiple rows

I realize this question has been asked before, but I can't get it to work for some reason.
I'm using the split function from this SQL Team thread (second post) and the following queries.
--This query converts the interests field from text to varchar
select
cp.id
,cast(cp.interests as varchar(100)) as interests
into #client_profile_temp
from
client_profile cp
--This query is supposed to split the csv ("Golf","food") into multiple rows
select
cpt.id
,split.data
from
#client_profile_temp cpt
cross apply dbo.split(
cpt.interests, ',') as split <--Error is on this line
However I'm getting an
Incorrect syntax near '.'
error where I've marked above.
In the end, I want
ID INTERESTS
000CT00002UA "Golf","food"
to be
ID INTERESTS
000CT00002UA "Golf"
000CT00002UA "food"
I'm using SQL Server 2008 and basing my answer on this StackOverflow question. I'm fairly new to SQL so any other words of wisdom would be appreciated as well.

TABLE
x--------------x--------------x
| ID           | INTERESTS    |
x--------------x--------------x
| 000CT00002UA | Golf,food    |
| 000CT12303CB | Cricket,Bat  |
x--------------x--------------x
METHOD 1 : Using XML format
SELECT ID,Split.a.value('.', 'VARCHAR(100)') 'INTERESTS'
FROM
(
-- To change ',' to any other delimiter, just change ',' before '</M><M>' to your desired one
SELECT ID, CAST ('<M>' + REPLACE(INTERESTS, ',', '</M><M>') + '</M>' AS XML) AS Data
FROM TEMP
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
METHOD 2 : Using function dbo.Split
SELECT a.ID, b.items
FROM #TEMP a
CROSS APPLY dbo.Split(a.INTERESTS, ',') b
And dbo.Split function is here.
CREATE FUNCTION [dbo].[Split](@String varchar(8000), @Delimiter char(1))
returns @temptable TABLE (items varchar(8000))
as
begin
    declare @idx int
    declare @slice varchar(8000)
    select @idx = 1
    if len(@String)<1 or @String is null return
    while @idx!= 0
    begin
        set @idx = charindex(@Delimiter,@String)
        if @idx!=0
            set @slice = left(@String,@idx - 1)
        else
            set @slice = @String
        if(len(@slice)>0)
            insert into @temptable(Items) values(@slice)
        set @String = right(@String,len(@String) - @idx)
        if len(@String) = 0 break
    end
    return
end

from
#client_profile_temp cpt
cross apply dbo.split(
#client_profile_temp.interests, ',') as split <--Error is on this line
I think the explicit naming of #client_profile_temp after you gave it an alias is a problem, try making that last line:
cpt.interests, ',') as split <--Error is on this line
EDIT You say
I made this change and it didn't change anything
Try pasting the code below (into a new SSMS window)
create table #client_profile_temp
(id int,
interests varchar(500))
insert into #client_profile_temp
values
(5, 'Vodka,Potassium,Trigo'),
(6, 'Mazda,Boeing,Alcoa')
select
cpt.id
,split.data
from
#client_profile_temp cpt
cross apply dbo.split(cpt.interests, ',') as split
See if it works as you expect; I'm using sql server 2008 and that works for me to get the kind of results I think you want.
Any chance when you say "I made the change", you just changed a stored procedure but haven't run it, or changed a script that creates a stored procedure, and haven't run that, something along those lines? As I say, it seems to work for me.

As this is old: it seems the following works in SQL Azure (as of 3/2022).
The big changes are split.value instead of .data or .items as shown above, no as after the function, and string_split as the splitting function.
select Id, split.value
from #reportTmp03 rpt
cross apply string_split(SelectedProductIds, ',') split

Try this:
--This query is supposed to split the csv ("Golf","food") into multiple rows
select
cpt.id
,split.data
from
#client_profile_temp cpt
cross apply dbo.split(cpt.interests, ',') as split <--Error is on this line
Once you define a table alias, you must use it instead of the table name.
