How to create a virtual SQL Server table from a csv file - sql-server

I have several use cases where I need to create a virtual SQL Server table from a CSV source, preferably in pure SQL on the fly, i.e. without a stored procedure. For context, I need to wire this code into a third-party reporting engine that talks to SQL Server as well as other data sources such as Oracle.
In PostgreSQL this can be done with the simple function split_part, because it can return the field at a given position in a delimited string, for example with ',' as the separator. SQL Server has a similar function, STRING_SPLIT ( string , separator ), but it does NOT have the position argument that PostgreSQL offers in SPLIT_PART(string, delimiter, position).
Example source csv:
Row 1 a,b,c,d
Row 2 e,f,g,h
etc
Output, using PostgreSQL (I need the same for SQL Server):
select split_part(p.Lines, ',',1) As Col1,
split_part(p.Lines, ',',2) As Col2,
split_part(p.Lines, ',',3) As Col3,
split_part(p.Lines, ',',4) As Col4
from ( select unnest(string_to_array(:csvdata, chr(10))) as Lines)p
Any solution?

SQL Server lets you query a *.csv file on the file system as if it were a virtual table.
Here is a minimal reproducible example.
SQL
SELECT *
FROM OPENROWSET(BULK 'e:\Temp\Quark.csv'
, FORMATFILE = 'e:\Temp\Quark.xml'
, ERRORFILE = 'e:\Temp\Quark.err'
, FIRSTROW = 2 -- real data starts on the 2nd row
, MAXERRORS = 100
) AS tbl;
Quark.csv
"ID"|"Name"|"Color"|"LogDate"|"Unknown"
41|Orange|Orange|2018-09-09 16:41:02.000|
42|Cherry, Banana|Red,Yellow||
43|Apple|Yellow|2017-09-09 16:41:02.000|
Quark.xml
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR='\r\n' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="ID" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="2" NAME="Name" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="3" NAME="Color" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="4" NAME="LogDate" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="5" NAME="Unknown" xsi:type="SQLVARYCHAR"/>
</ROW>
</BCPFORMAT>
Output
+----+----------------+------------+-------------------------+---------+
| ID | Name | Color | LogDate | Unknown |
+----+----------------+------------+-------------------------+---------+
| 41 | Orange | Orange | 2018-09-09 16:41:02.000 | NULL |
| 42 | Cherry, Banana | Red,Yellow | NULL | NULL |
| 43 | Apple | Yellow | 2017-09-09 16:41:02.000 | NULL |
+----+----------------+------------+-------------------------+---------+

Unless I am mistaken - split_part parses a delimited string into component elements and returns the desired element from that string.
To accomplish this in SQL Server - we can use charindex and substring, or you can use a combination of splitting the string to a table, and rolling the values back up in a GROUP BY.
Here is a function that can be used to return up to 12 elements from a delimited string:
ALTER Function [dbo].[fnSplitString_12Columns] (
@pString varchar(8000)
, @pDelimiter char(1)
)
Returns Table
With schemabinding
As
Return
Select InputString = @pString
, p01_pos = p01.pos
, p02_pos = p02.pos
, p03_pos = p03.pos
, p04_pos = p04.pos
, p05_pos = p05.pos
, p06_pos = p06.pos
, p07_pos = p07.pos
, p08_pos = p08.pos
, p09_pos = p09.pos
, p10_pos = p10.pos
, p11_pos = p11.pos
, p12_pos = p12.pos
, col_01 = ltrim(substring(v.inputString, 1, p01.pos - 2))
, col_02 = ltrim(substring(v.inputString, p01.pos, p02.pos - p01.pos - 1))
, col_03 = ltrim(substring(v.inputString, p02.pos, p03.pos - p02.pos - 1))
, col_04 = ltrim(substring(v.inputString, p03.pos, p04.pos - p03.pos - 1))
, col_05 = ltrim(substring(v.inputString, p04.pos, p05.pos - p04.pos - 1))
, col_06 = ltrim(substring(v.inputString, p05.pos, p06.pos - p05.pos - 1))
, col_07 = ltrim(substring(v.inputString, p06.pos, p07.pos - p06.pos - 1))
, col_08 = ltrim(substring(v.inputString, p07.pos, p08.pos - p07.pos - 1))
, col_09 = ltrim(substring(v.inputString, p08.pos, p09.pos - p08.pos - 1))
, col_10 = ltrim(substring(v.inputString, p09.pos, p10.pos - p09.pos - 1))
, col_11 = ltrim(substring(v.inputString, p10.pos, p11.pos - p10.pos - 1))
, col_12 = ltrim(substring(v.inputString, p11.pos, p12.pos - p11.pos - 1))
From (Values (concat(@pString, replicate(@pDelimiter, 12)))) As v(inputString)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, 1) + 1)) As p01(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p01.pos) + 1)) As p02(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p02.pos) + 1)) As p03(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p03.pos) + 1)) As p04(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p04.pos) + 1)) As p05(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p05.pos) + 1)) As p06(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p06.pos) + 1)) As p07(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p07.pos) + 1)) As p08(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p08.pos) + 1)) As p09(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p09.pos) + 1)) As p10(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p10.pos) + 1)) As p11(pos)
Cross Apply (Values (charindex(@pDelimiter, v.inputString, p11.pos) + 1)) As p12(pos);
The function also returns the starting position in the string for each element. If you use the function to return only the first 4 elements - the rest of the code will be factored out of the final query.
If you have more than 12 columns you can extend to as many as needed. Performance may be affected by the size of the string (8000 vs MAX) and how many elements are parsed - but the only way to be sure will be to test different methods to see which performs best for your data.
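When testing any SQL Server replacement, it can help to have a reference for what split_part is expected to return. The following is a minimal Python sketch and not part of the answer above. The helper names split_part and csv_to_rows are made up for illustration, it only handles positive positions, and it assumes a missing field should come back as an empty string (which is also what the function above produces, because it pads the input with extra delimiters). Unlike the T-SQL function, it does not left-trim the fields.

```python
def split_part(text: str, delimiter: str, position: int) -> str:
    """Return the 1-based `position`-th field of `text`, or '' if there is no such field."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    fields = text.split(delimiter)
    return fields[position - 1] if position <= len(fields) else ""


def csv_to_rows(csv_data: str, columns: int = 4) -> list[tuple[str, ...]]:
    """Split the input into lines, then take the first `columns` fields of each line."""
    return [
        tuple(split_part(line, ",", n) for n in range(1, columns + 1))
        for line in csv_data.splitlines()
    ]


# Sample rows taken from the question.
rows = csv_to_rows("a,b,c,d\ne,f,g,h")
print(rows)
```

Comparing the SQL Server result for the same input against these rows is a quick way to spot off-by-one errors in the charindex/substring arithmetic.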

Related

For XML Path with nesting elements based on level after recursive CTE

After running the recursive CTE, my data looks like this:
| EPC | ParentEPC | SerialEventId | Level | @ItemName |
|-----|-----------|--------------:|:-----:|-----------|
| a | NULL | 5557 | 0 | [PALLET] - 7 UNITS |
| b | a | 5557 | 1 | [CARTON] - 1 UNIT |
| c | a | 5557 | 1 | [CASE] - 3 UNITS |
| d | c | 5557 | 2 | [CARTON] - 1 UNIT |
| e | c | 5557 | 2 | [CARTON] - 1 UNIT |
| f | c | 5557 | 2 | [CARTON] - 1 UNIT |
I want to write a T-SQL query in SQL Server to return the data like this:
<Items>
<Item ItemName="[PALLET] - 7 UNITS">
<Item ItemName="[CARTON] - 1 UNIT" />
<Item ItemName="[CASE] - 3 UNITS">
<Item ItemName="[CARTON] - 1 UNIT" />
<Item ItemName="[CARTON] - 1 UNIT" />
<Item ItemName="[CARTON] - 1 UNIT" />
</Item>
</Item>
</Items>
I have tried FOR XML PATH but was not able to get the nesting based on the level. My attempt is below:
SELECT *
,(
SELECT epc."@ItemName"
FROM #EPC_items epc
WHERE epc.SerialEventID = se.SerialEventID
FOR XML PATH('Item')
,ROOT('Items')
,TYPE
)
FROM #SerialEvents se
Here is the recursive CTE query that I used to get the result table shown above
IF OBJECT_ID('tempdb..#SerialEvents') IS NOT NULL DROP TABLE #SerialEvents
GO
IF OBJECT_ID('tempdb..#EPC_items') IS NOT NULL DROP TABLE #EPC_items
GO
SELECT DISTINCT se.SerialEventID, se.OrderType, se.ASWRefNum, se.ASWLineNum
into #SerialEvents
FROM dbo.SerialEvent se
WHERE 1=1
AND SerialEventTypeId not in (1,13,8,9,10)
AND SerialEventDateTime >= DATEADD(d,-2, GETDATE())
ORDER BY 1 DESC;
;WITH CTE
as (
SELECT
et.EPC,et.ParentEPC,
et.SerialEventId ,
0 AS [Level],
' [' + UPPER(e.UnitType) + '] - '
+ CAST(e.ChildQuantity as VARCHAR)
+
CASE
WHEN e.ChildQuantity > 1 THEN ' UNITS'
ELSE ' UNIT'
END AS "@ItemName"
FROM dbo.EPCTRansaction et
INNER JOIN dbo.SerialEvent se on et.SerialEventId = se.SerialEventId
INNER JOIN dbo.vwEPC e ON et.EPC = e.EPC
INNER JOIN #SerialEvents sep on et.SerialEventId = sep.SerialEventId and et.SerialEventID =5557
WHERE
1=1
AND et.ParentEPC IS NULL
UNION ALL
SELECT
et.EPC,
CTE.EPC as ParentEPC,
et.SerialEventId,
cte.Level + 1,
' [' + UPPER(e.UnitType) + '] - '
+ CAST(e.ChildQuantity as VARCHAR)
+
CASE
WHEN e.ChildQuantity > 1 THEN ' UNITS'
ELSE ' UNIT'
END AS "@ItemName"
FROM dbo.EPCTRansaction et
INNER JOIN dbo.SerialEvent se on et.SerialEventId = se.SerialEventId
INNER JOIN dbo.vwEPC e ON et.EPC = e.EPC
INNER JOIN CTE on et.ParentEPC = CTE.EPC and et.SerialEventId = CTE.SerialEventId
WHERE
1=1
)
select * into #EPC_items from CTE
select * from #EPC_items
Recursing in rows is very easy in SQL Server. On the other hand, what you are trying to do is grouped recursion: on each level you want to group up the data and place it inside its parent. This is much harder.
The easiest method I have found is to use (horror of horrors!) a scalar UDF.
Unfortunately I can't test this as you haven't given proper sample data for all your tables. It's also unclear which joins are needed.
CREATE FUNCTION dbo.GetXml (@ParentEPC int)
RETURNS xml
AS
BEGIN
RETURN (
SELECT
CONCAT(
'[',
UPPER(e.UnitType),
'] - ',
e.ChildQuantity,
CASE
WHEN e.ChildQuantity > 1 THEN ' UNITS'
ELSE ' UNIT'
END
) AS [@ItemName],
dbo.GetXml(et.EPC) -- do not name this column
FROM dbo.EPCTRansaction et
INNER JOIN dbo.SerialEvent se on et.SerialEventId = se.SerialEventId
INNER JOIN dbo.vwEPC e ON et.EPC = e.EPC
INNER JOIN #SerialEvents sep on et.SerialEventId = sep.SerialEventId and et.SerialEventID = 5557
WHERE
EXISTS (SELECT et.ParentEPC INTERSECT SELECT @ParentEPC) -- nullable compare
FOR XML PATH('Item'), TYPE
);
END;
SELECT
dbo.GetXml(NULL)
FOR XML PATH('Items'), TYPE;

XML to SQL Table Query

This is my XML stored in a row. How do I convert it to insert into a table using a T-SQL query in the following table format?
<ENVELOPE>
<DSPVCHDATE>16-4-2021</DSPVCHDATE>
<DSPVCHITEMACCOUNT>PRASHANT MEHTA 359244</DSPVCHITEMACCOUNT>
<DSPVCHTYPE>Sale</DSPVCHTYPE>
<DSPINBLOCK>
<DSPVCHINQTY></DSPVCHINQTY>
<DSPVCHINAMT></DSPVCHINAMT>
</DSPINBLOCK>
<DSPOUTBLOCK>
<DSPVCHOUTQTY>1 Pcs</DSPVCHOUTQTY>
<DSPVCHNETTOUTAMT>23046.88</DSPVCHNETTOUTAMT>
</DSPOUTBLOCK>
<DSPCLBLOCK>
<DSPVCHCLQTY></DSPVCHCLQTY>
<DSPVCHCLAMT></DSPVCHCLAMT>
</DSPCLBLOCK>
<DSPEXPLVCHNUMBER>(No. :IV2612)</DSPEXPLVCHNUMBER>
<DSPVCHDATE>19-4-2021</DSPVCHDATE>
<DSPVCHITEMACCOUNT>XYZ Company</DSPVCHITEMACCOUNT>
<DSPVCHTYPE>Purchase</DSPVCHTYPE>
<DSPINBLOCK>
<DSPVCHINQTY>1 Pcs</DSPVCHINQTY>
<DSPVCHINAMT>23437.50</DSPVCHINAMT>
</DSPINBLOCK>
<DSPOUTBLOCK>
<DSPVCHOUTQTY></DSPVCHOUTQTY>
<DSPVCHNETTOUTAMT></DSPVCHNETTOUTAMT>
</DSPOUTBLOCK>
<DSPCLBLOCK>
<DSPVCHCLQTY>0 Pcs</DSPVCHCLQTY>
<DSPVCHCLAMT></DSPVCHCLAMT>
</DSPCLBLOCK>
<DSPEXPLVCHNUMBER>(No. :IV2613)</DSPEXPLVCHNUMBER>
</ENVELOPE>
This is the required output format.
The issue is that I do not have a record separator in the raw XML. Each new record starts with a <DSPVCHDATE> element.
Here is another method by using pure XQuery. No need to do any string manipulation, CASTing, etc.
All child elements of the root element <ENVELOPE> come in fixed-size groups, so their positions form an arithmetic progression: the elements at positions 1 - 7 belong to the first record, 8 - 14 to the second, and so on. Each group should be placed inside an enclosing <row> element.
It creates the following XML on the fly:
<ENVELOPE>
<row>
<DSPVCHDATE>16-4-2021</DSPVCHDATE>
...
<DSPEXPLVCHNUMBER>(No. :IV2612)</DSPEXPLVCHNUMBER>
</row>
<row>
<DSPVCHDATE>19-4-2021</DSPVCHDATE>
...
<DSPEXPLVCHNUMBER>(No. :IV2613)</DSPEXPLVCHNUMBER>
</row>
</ENVELOPE>
SQL
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xmldata XML);
INSERT INTO @tbl (xmldata) VALUES
(N'<ENVELOPE>
<DSPVCHDATE>16-4-2021</DSPVCHDATE>
<DSPVCHITEMACCOUNT>PRASHANT MEHTA 359244</DSPVCHITEMACCOUNT>
<DSPVCHTYPE>Sale</DSPVCHTYPE>
<DSPINBLOCK>
<DSPVCHINQTY></DSPVCHINQTY>
<DSPVCHINAMT></DSPVCHINAMT>
</DSPINBLOCK>
<DSPOUTBLOCK>
<DSPVCHOUTQTY>1 Pcs</DSPVCHOUTQTY>
<DSPVCHNETTOUTAMT>23046.88</DSPVCHNETTOUTAMT>
</DSPOUTBLOCK>
<DSPCLBLOCK>
<DSPVCHCLQTY></DSPVCHCLQTY>
<DSPVCHCLAMT></DSPVCHCLAMT>
</DSPCLBLOCK>
<DSPEXPLVCHNUMBER>(No. :IV2612)</DSPEXPLVCHNUMBER>
<DSPVCHDATE>19-4-2021</DSPVCHDATE>
<DSPVCHITEMACCOUNT>XYZ Company</DSPVCHITEMACCOUNT>
<DSPVCHTYPE>Purchase</DSPVCHTYPE>
<DSPINBLOCK>
<DSPVCHINQTY>1 Pcs</DSPVCHINQTY>
<DSPVCHINAMT>23437.50</DSPVCHINAMT>
</DSPINBLOCK>
<DSPOUTBLOCK>
<DSPVCHOUTQTY></DSPVCHOUTQTY>
<DSPVCHNETTOUTAMT></DSPVCHNETTOUTAMT>
</DSPOUTBLOCK>
<DSPCLBLOCK>
<DSPVCHCLQTY>0 Pcs</DSPVCHCLQTY>
<DSPVCHCLAMT></DSPVCHCLAMT>
</DSPCLBLOCK>
<DSPEXPLVCHNUMBER>(No. :IV2613)</DSPEXPLVCHNUMBER>
</ENVELOPE>');
SELECT ID --, x
, c.value('(DSPVCHDATE/text())[1]','nvarchar(100)') as DSPVCHDATE
,c.value('(DSPVCHITEMACCOUNT/text())[1]','nvarchar(100)') as DSPVCHITEMACCOUNT
,c.value('(DSPVCHTYPE/text())[1]','nvarchar(100)') as DSPVCHTYPE
,c.value('(DSPINBLOCK/DSPVCHINQTY/text())[1]','nvarchar(100)') AS DSPVCHINQTY
,c.value('(DSPINBLOCK/DSPVCHINAMT/text())[1]','decimal(12,2)') AS DSPVCHINAMT
,c.value('(DSPOUTBLOCK/DSPVCHOUTQTY/text())[1]','nvarchar(100)') AS DSPVCHOUTQTY
,c.value('(DSPOUTBLOCK/DSPVCHNETTOUTAMT/text())[1]','decimal(12,2)') AS DSPVCHNETTOUTAMT
,c.value('(DSPEXPLVCHNUMBER/text())[1]','nvarchar(100)') as DSPEXPLVCHNUMBER
--,c.value('(DSPCLBLOCK/DSPVCHCLQTY/text())[1]','nvarchar(100)') AS DSPVCHCLQTY
--,c.value('(DSPCLBLOCK/DSPVCHCLAMT/text())[1]','int') AS DSPVCHCLAMT
FROM @tbl
CROSS APPLY (SELECT xmldata.query('<ENVELOPE>
{
for $x in /ENVELOPE/DSPVCHDATE
let $pos := count(ENVELOPE/DSPVCHDATE[. << $x]) + 1
let $start := 1 + 7 * ($pos -1)
let $end := 7 * $pos
return <row>{/ENVELOPE/*[position() ge $start and position() le $end]}</row>
}
</ENVELOPE>')) AS t1(x)
CROSS APPLY t1.x.nodes('/ENVELOPE/row') AS t2(c);
Output
+----+------------+-----------------------+------------+-------------+-------------+--------------+------------------+------------------+
| ID | DSPVCHDATE | DSPVCHITEMACCOUNT | DSPVCHTYPE | DSPVCHINQTY | DSPVCHINAMT | DSPVCHOUTQTY | DSPVCHNETTOUTAMT | DSPEXPLVCHNUMBER |
+----+------------+-----------------------+------------+-------------+-------------+--------------+------------------+------------------+
| 1 | 16-4-2021 | PRASHANT MEHTA 359244 | Sale | NULL | NULL | 1 Pcs | 23046.88 | (No. :IV2612) |
| 1 | 19-4-2021 | XYZ Company | Purchase | 1 Pcs | 23437.50 | NULL | NULL | (No. :IV2613) |
+----+------------+-----------------------+------------+-------------+-------------+--------------+------------------+------------------+
SQL #2
Based on @Charlieface's idea.
WITH rs AS
(
SELECT ID, xmldata
, c.value('for $i in . return count(../*[. << $i]) + 1', 'INT') AS pos
FROM @tbl
CROSS APPLY xmldata.nodes('/ENVELOPE/DSPVCHDATE') AS t(c)
)
SELECT ID
, c.value('(/ENVELOPE/*[sql:column("pos")]/text())[1]','nvarchar(100)') AS DSPVCHDATE
, c.value('(/ENVELOPE/*[sql:column("pos") + 1]/text())[1]','nvarchar(100)') AS DSPVCHITEMACCOUNT
, c.value('(/ENVELOPE/*[sql:column("pos") + 2]/text())[1]','nvarchar(100)') AS DSPVCHTYPE
, c.value('(/ENVELOPE/*[sql:column("pos") + 3]/DSPVCHINQTY/text())[1]','nvarchar(100)') AS DSPVCHINQTY
, c.value('(/ENVELOPE/*[sql:column("pos") + 3]/DSPVCHINAMT/text())[1]','decimal(12,2)') AS DSPVCHINAMT
, c.value('(/ENVELOPE/*[sql:column("pos") + 4]/DSPVCHOUTQTY/text())[1]','nvarchar(100)') AS DSPVCHOUTQTY
, c.value('(/ENVELOPE/*[sql:column("pos") + 4]/DSPVCHNETTOUTAMT/text())[1]','nvarchar(100)') AS DSPVCHNETTOUTAMT
, c.value('(/ENVELOPE/*[sql:column("pos") + 6]/text())[1]','nvarchar(100)') AS DSPEXPLVCHNUMBER
FROM rs
CROSS APPLY xmldata.nodes('/ENVELOPE') AS t(c);
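The position arithmetic used in the first query (start = 1 + 7 * (pos - 1), end = 7 * pos) can be sanity-checked outside SQL Server. Below is a minimal Python sketch, assuming every record consists of exactly seven child elements in the order shown in the question. The function name group_children is hypothetical, and unlike the XQuery it counts records by dividing the number of children by seven rather than by locating each DSPVCHDATE element, so a trailing incomplete record would be dropped.

```python
import xml.etree.ElementTree as ET

RECORD_SIZE = 7  # assumption: each record is exactly seven consecutive child elements

TAGS = [
    "DSPVCHDATE", "DSPVCHITEMACCOUNT", "DSPVCHTYPE", "DSPINBLOCK",
    "DSPOUTBLOCK", "DSPCLBLOCK", "DSPEXPLVCHNUMBER",
]


def group_children(envelope_xml: str, size: int = RECORD_SIZE) -> ET.Element:
    """Wrap each run of `size` consecutive children of the root in a <row> element."""
    source = ET.fromstring(envelope_xml)
    children = list(source)
    grouped = ET.Element(source.tag)
    for pos in range(1, len(children) // size + 1):
        start = 1 + size * (pos - 1)  # 1-based position of the record's first element
        end = size * pos              # 1-based position of the record's last element
        row = ET.SubElement(grouped, "row")
        row.extend(children[start - 1:end])
    return grouped


# Two empty records with the element names from the question.
sample = "<ENVELOPE>" + "".join(f"<{t}/>" for _ in range(2) for t in TAGS) + "</ENVELOPE>"
result = group_children(sample)
print([[child.tag for child in row] for row in result])
```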
You can use outer apply to navigate the nested elements of xml content.
Given the inconvenient structure of this XML, it can be changed into something useable as follows, by adding a containing node called <ThisNode>.
DECLARE @XML XML = '
<ENVELOPE>
<DSPVCHDATE>16-4-2021</DSPVCHDATE>
<DSPVCHITEMACCOUNT>PRASHANT MEHTA 359244</DSPVCHITEMACCOUNT>
<DSPVCHTYPE>Sale</DSPVCHTYPE>
<DSPINBLOCK>
<DSPVCHINQTY></DSPVCHINQTY>
<DSPVCHINAMT></DSPVCHINAMT>
</DSPINBLOCK>
<DSPOUTBLOCK>
<DSPVCHOUTQTY>1 Pcs</DSPVCHOUTQTY>
<DSPVCHNETTOUTAMT>23046.88</DSPVCHNETTOUTAMT>
</DSPOUTBLOCK>
<DSPCLBLOCK>
<DSPVCHCLQTY></DSPVCHCLQTY>
<DSPVCHCLAMT></DSPVCHCLAMT>
</DSPCLBLOCK>
<DSPEXPLVCHNUMBER>(No. :IV2612)</DSPEXPLVCHNUMBER>
<DSPVCHDATE>19-4-2021</DSPVCHDATE>
<DSPVCHITEMACCOUNT>XYZ Company</DSPVCHITEMACCOUNT>
<DSPVCHTYPE>Purchase</DSPVCHTYPE>
<DSPINBLOCK>
<DSPVCHINQTY>1 Pcs</DSPVCHINQTY>
<DSPVCHINAMT>23437.50</DSPVCHINAMT>
</DSPINBLOCK>
<DSPOUTBLOCK>
<DSPVCHOUTQTY></DSPVCHOUTQTY>
<DSPVCHNETTOUTAMT></DSPVCHNETTOUTAMT>
</DSPOUTBLOCK>
<DSPCLBLOCK>
<DSPVCHCLQTY>0 Pcs</DSPVCHCLQTY>
<DSPVCHCLAMT></DSPVCHCLAMT>
</DSPCLBLOCK>
<DSPEXPLVCHNUMBER>(No. :IV2613)</DSPEXPLVCHNUMBER>
</ENVELOPE>'
This can be converted to useable XML as follows:
WITH
cte AS (Select REPLACE(REPLACE(CONVERT(NVARCHAR(MAX), @XML, 1), N'<DSPVCHDATE>', '
</ThisNode>
<ThisNode>
<DSPVCHDATE>'), N'</ENVELOPE>', N'
</ThisNode>
</ENVELOPE>') AS str)
SELECT @XML = CAST(STUFF(str, CHARINDEX(N'</ThisNode>', str), LEN(N'</ThisNode>'), N'') AS XML)
FROM cte
;
query
SELECT
A.evnt.value('(DSPVCHDATE/text())[1]','nvarchar(100)') as DSPVCHDATE
,A.evnt.value('(DSPVCHITEMACCOUNT/text())[1]','nvarchar(100)') as DSPVCHITEMACCOUNT
,A.evnt.value('(DSPVCHTYPE/text())[1]','nvarchar(100)') as DSPVCHTYPE
,A.evnt.value('(DSPVCHITEMACCOUNT/text())[1]','nvarchar(100)') as DSPVCHITEMACCOUNT
,A.evnt.value('(DSPEXPLVCHNUMBER/text())[1]','nvarchar(100)') as DSPEXPLVCHNUMBER
,B.rec.value('(DSPVCHINQTY/text())[1]','nvarchar(100)') AS DSPVCHINQTY
,B.rec.value('(DSPVCHINAMT/text())[1]','nvarchar(100)') AS DSPVCHINAMT
,C.rec.value('(DSPVCHOUTQTY/text())[1]','nvarchar(100)') AS DSPVCHOUTQTY
,C.rec.value('(DSPVCHNETTOUTAMT/text())[1]','float') AS DSPVCHNETTOUTAMT
,D.rec.value('(DSPVCHCLQTY/text())[1]','nvarchar(100)') AS DSPVCHCLQTY
,D.rec.value('(DSPVCHCLAMT/text())[1]','int') AS DSPVCHCLAMT
FROM @XML.nodes('/ENVELOPE/ThisNode') A(evnt)
OUTER APPLY A.evnt.nodes('DSPINBLOCK') B(rec)
OUTER APPLY A.evnt.nodes('DSPOUTBLOCK') C(rec)
OUTER APPLY A.evnt.nodes('DSPCLBLOCK') D(rec)
demo in db<>fiddle

Parse Columns in SQL by delimiter

In SQL Server, I have a table/view that has multiple columns. The last column looks like this:
COL
---------------------------------
|test|test|test11|testing|final
|test|test|test1|testing2|final3
|test|test|test17|testing|final6
How do I parse this column by | and append the parsed values to the right side of the existing table, like this:
COL1 COL2 COL Parse1 Parse2 Parse3 Parse4 Parse5
1 4 |test|test|test11|testing|final test test test11 testing final
2 6 |test|test|test1|testing2|final3 test test test1 testing2 final3
5 9 |test|test|test17|testing|final6 test test test17 testing final6
Every row has the same number of delimited values in column COL.
Any help would be great thanks!
Not clear if you have a leading | in the field COL. If so, you may want to shift /x[n]
The pattern is pretty clear. Easy to expand or contract as necessary
Example
Declare @YourTable Table ([COL] varchar(50))
Insert Into @YourTable Values
('test|test|test11|testing|final')
,('test|test|test1|testing2|final3')
,('test|test|test17|testing|final6')
Select A.*
,B.*
From @YourTable A
Cross Apply (
Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
,Pos8 = ltrim(rtrim(xDim.value('/x[8]','varchar(max)')))
,Pos9 = ltrim(rtrim(xDim.value('/x[9]','varchar(max)')))
From (Select Cast('<x>' + replace((Select replace(A.Col,'|','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as B1
) B
Returns

Issue in complicated join

I have 4 tables
tbLicenceTypesX (2 Fields)
LicenceTypes
LicenceTypesX
tbLicenceTypesX (Contains data like)
1 - Medical Licence
2 - Property
3 - Casualty
4 - Trainning Licence
tbProduct (4 fields)
Product
ProductX
CompanyId (F.K)
LicenceTypes(F.K)
tbProduct (Contains data like)
1 - T.V - 10 - 2
2 - A.C - 30 - 3
3 - Mobiles - 40 -4
tbLicence (3 fields)
Licence
LicenceTypesNames
AgentId
tbLicence (Contains data like)
1 - Property, Casualty - 23
2 - Trainning Licence, Casualty - 34
Now I have to fetch Product and ProductX from tbProduct where the product's LicenceTypes matches one of the agent's licences in tbLicence, within a given company.
For example: I have to fetch T.V, whose LicenceTypes is 2 ("Property") and whose CompanyId is 10, for the agent whose AgentId is 23 and whose LicenceTypesNames also contains "Property".
I want to fetch something like
@CompanyId int,
@AgentId int
As
SELECT p.ProductX,p.Product
from tbProduct p
inner join tbLicence l on p.LicenceTypes = l.LicenceTypesNames<its corresponding Id>
inner join tbProduct c on c.Product =p.Product
where
c.CompanyId=@CompanyId
and l.AgentId=@AgentId
Please help me!!!
You can use XML and CROSS APPLY to Split the comma separated values and JOIN with tbProduct. The LTRIM and RTRIM functions are used to trim the comma separated values if they have excessive empty space. The below code gives you the desired output.
DECLARE @CompanyId int = 30, @AgentId int = 23
;WITH CTE AS
(
SELECT AgentId, TCT.LicenceTypes FROM
(
SELECT AgentId, LTRIM(RTRIM(Split.XMLData.value('.', 'VARCHAR(100)'))) LicenceTypesNames FROM
(
SELECT AgentID, Cast ('<M>' + REPLACE(LicenceTypesNames, ',', '</M><M>') + '</M>' AS XML) AS Data
FROM tbLicence
) AS XMLData
CROSS APPLY Data.nodes ('/M') AS Split(XMLData)
)
AS LTN
JOIN tbLicenceTypesX TCT ON LTN.LicenceTypesNames = tct.LicenceTypesX
)
SELECT p.ProductX,p.Product
FROM tbProduct P
JOIN CTE c on p.LicenceTypes = c.LicenceTypes
WHERE CompanyId = @CompanyId
AND AgentId = @AgentId
Sql Fiddle Demo

SQL and XML - Calculation of longest continuous pause

I have a very specific problem which I was hoping somebody could shed some light on. It is not exactly an error; I need help with the query that returns the desired result set.
I have a table called xml_table with 2 columns; word_id, word_data:
word_id | word_data
1 | <results><channel id="1"><r s="0" d="650" w="Hello"/><r s="650" d="230" w="SIL"/></channel></results>
2 | <results><channel id="1"><r s="0" d="350" w="Sorry"/><r s="350" d="10" w="WHO"/></channel></results>
3 | <results><channel id="1"><r s="0" d="750" w="Please"/><r s="750" d="50" w="s"/></channel></results>
...
and so on where word_data is an XML String.
The XML String within each row is of the following format:
<results>
<channel id="1">
<r s="0" d="100" w="SIL"/>
<r s="100" d="250" w="Sorry"/>
<r s="350" d="100" w="WHO"/>
<r s="450" d="350" w="SIL"/>
<r s="800" d="550" w="SIL"/>
<r s="1350" d="100" w="Hello"/>
<r s="1450" d="200" w="s"/>
<r s="1650" d="50" w="SIL"/>
<r s="1700" d="100" w="SIL"/>
</channel>
</results>
s represents start time
d represents duration
w represents word
(the number of r tag is NOT fixed and changes from row to row of xml_table)
The idea now is to go through each row and, within each XML document, calculate the longest consecutive duration during which 'SIL' or 's' appears in the w attribute, and then return this in a new table as longest_pause (i.e. the longest consecutive SIL/s duration) along with word_id and word_data.
So in the above example xml we have three consecutive periods where the longest_pause can occur where the total durations are 100 (100), 900 (350+550) and 350 (200 + 50 + 100) and therefore the longest_pause is 900 so 900 would be returned.
I was wondering if anybody could help with this. So far I have:
DECLARE @xml XML
DECLARE @ordered_table TABLE (id VARCHAR(20) NOT NULL, start_time INT NOT NULL, duration INT NOT NULL, word VARCHAR(50) NOT NULL)
SELECT @xml = (SELECT word_data FROM xml_table where word_id = 1)
INSERT into @ordered_table(id, start_time, duration, word)
SELECT 'NAME' AS id, Tbl.Col.value('@s', 'INT'), Tbl.Col.value('@d', 'INT'), Tbl.Col.value('@w', 'varchar(50)') FROM @xml.nodes('/results/channel[@id="1"]/r') Tbl(Col)
i.e. I have created a table to put the XML into, but I do not know where to go from there.
Please can somebody help?
Thank you :)
Your attempt at solving this looks like you want to find the longest duration for one XML but the text suggests that you want to find the row in xml_table that has the longest duration.
Working with the one XML instance and modified version of your table variable you could do like this.
DECLARE @xml XML = '
<results>
<channel id="1">
<r s="0" d="100" w="SIL"/>
<r s="100" d="250" w="Sorry"/>
<r s="350" d="100" w="WHO"/>
<r s="450" d="350" w="SIL"/>
<r s="800" d="550" w="SIL"/>
<r s="1350" d="100" w="Hello"/>
<r s="1450" d="200" w="s"/>
<r s="1650" d="50" w="SIL"/>
<r s="1700" d="100" w="SIL"/>
</channel>
</results>';
DECLARE @ordered_table TABLE
(
id INT NOT NULL,
start_time INT NOT NULL,
duration INT NOT NULL,
word VARCHAR(50) NOT NULL
);
INSERT INTO @ordered_table(id, start_time, duration, word)
SELECT row_number() over(order by Tbl.Col.value('@s', 'INT')),
Tbl.Col.value('@s', 'INT'),
Tbl.Col.value('@d', 'INT'),
Tbl.Col.value('@w', 'varchar(50)')
FROM @xml.nodes('/results/channel[@id="1"]/r') Tbl(Col);
WITH C AS
(
SELECT T.id,
CASE WHEN T.word IN ('S', 'SIL') THEN T.duration ELSE 0 END AS Dur
FROM @ordered_table as T
WHERE T.ID = 1
UNION ALL
SELECT T.id,
CASE WHEN T.word IN ('S', 'SIL') THEN C.Dur + T.duration ELSE 0 END AS Dur
FROM @ordered_table as T
INNER JOIN C
ON T.ID = C.ID + 1
)
SELECT TOP(1) *
FROM C
ORDER BY C.Dur DESC;
SQL Fiddle
I added an ID field that is used in a recursive CTE to walk through the nodes and calculate a running sum where w is SIL or s. The longest duration is then fetched from the CTE using TOP(1) ... ORDER BY.
If you instead want the row in xml_table with the longest duration you can do like this.
with C as
(
select 1 as node,
X.word_id,
X.word_data,
case when T.W in ('S', 'SIL') then T.D else 0 end as duration
from dbo.xml_table as X
cross apply (select X.word_data.value('(/results/channel[@id = "1"]/r/@d)[1]', 'int'),
X.word_data.value('(/results/channel[@id = "1"]/r/@w)[1]', 'nvarchar(100)')) as T(D, W)
union all
select C.node + 1,
X.word_id,
X.word_data,
case when T.W in ('S', 'SIL') then T.D + C.duration else 0 end as duration
from C
inner join dbo.xml_table as X
on X.word_id = C.word_id
cross apply (select X.word_data.value('(/results/channel[@id = "1"]/r/@d)[sql:column("C.Node")+1][1]', 'int'),
X.word_data.value('(/results/channel[@id = "1"]/r/@w)[sql:column("C.Node")+1][1]', 'nvarchar(100)')) as T(D, W)
where T.W is not null
)
select T.word_id,
T.word_data,
T.duration
from
(
select row_number() over(partition by C.word_id order by C.duration desc) as rn,
C.word_id,
C.word_data,
C.duration
from C
) as T
where T.rn = 1
option (maxrecursion 0);
SQL Fiddle
The recursive CTE part works the same as before, but for multiple rows at the same time, and it gets the value for duration from the XML directly using the column node, which is incremented on each iteration. The query against the CTE uses row_number() to find the longest duration for each row.
Have you considered using something like Python instead?
You can query the database to get the data, then use regular expressions to extract the values from the XML, calculate the value you want, and insert it back into the results table.
I recently did something similar and found that doing the processing in Python was much easier, if that's an option for you.
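A minimal sketch of what that processing could look like, under some assumptions: it uses the standard library XML parser (xml.etree.ElementTree) instead of regular expressions, it takes the XML string as input rather than querying the database, and the function name longest_pause is made up for illustration. It matches 'SIL' and 's' exactly, whereas the T-SQL comparison above depends on the column collation.

```python
import xml.etree.ElementTree as ET

PAUSE_WORDS = {"SIL", "s"}  # words that count as a pause, per the question


def longest_pause(word_data: str, channel_id: str = "1") -> int:
    """Return the longest total duration of consecutive pause entries in one XML string."""
    root = ET.fromstring(word_data)
    entries = root.findall(f"./channel[@id='{channel_id}']/r")
    # Order by start time so "consecutive" follows the timeline, not the document order.
    entries.sort(key=lambda r: int(r.get("s")))

    longest = current = 0
    for r in entries:
        if r.get("w") in PAUSE_WORDS:
            current += int(r.get("d"))   # extend the current run of pauses
            longest = max(longest, current)
        else:
            current = 0                  # a spoken word ends the run
    return longest


# Example XML from the question.
SAMPLE = """
<results>
  <channel id="1">
    <r s="0" d="100" w="SIL"/>
    <r s="100" d="250" w="Sorry"/>
    <r s="350" d="100" w="WHO"/>
    <r s="450" d="350" w="SIL"/>
    <r s="800" d="550" w="SIL"/>
    <r s="1350" d="100" w="Hello"/>
    <r s="1450" d="200" w="s"/>
    <r s="1650" d="50" w="SIL"/>
    <r s="1700" d="100" w="SIL"/>
  </channel>
</results>
"""

result = longest_pause(SAMPLE)
print(result)  # 900
```

For the example from the question this gives 900 (350 + 550). Writing the value back to a results table would depend on whichever database driver you use, so it is left out here.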
