SQL and XML - Calculation of longest continuous pause - sql-server

I have a very specific problem that I was hoping somebody could shed some light on. It is not an error as such; I need help writing the query that returns the desired result set.
I have a table called xml_table with 2 columns; word_id, word_data:
word_id | word_data
1 | <results><channel id="1"><r s="0" d="650" w="Hello"/><r s="650" d="230" w="SIL"/></channel></results>
2 | <results><channel id="1"><r s="0" d="350" w="Sorry"/><r s="350" d="10" w="WHO"/></channel></results>
3 | <results><channel id="1"><r s="0" d="750" w="Please"/><r s="750" d="50" w="s"/></channel></results>
...
and so on where word_data is an XML String.
The XML String within each row is of the following format:
<results>
<channel id="1">
<r s="0" d="100" w="SIL"/>
<r s="100" d="250" w="Sorry"/>
<r s="350" d="100" w="WHO"/>
<r s="450" d="350" w="SIL"/>
<r s="800" d="550" w="SIL"/>
<r s="1350" d="100" w="Hello"/>
<r s="1450" d="200" w="s"/>
<r s="1650" d="50" w="SIL"/>
<r s="1700" d="100" w="SIL"/>
</channel>
</results>
s represents start time
d represents duration
w represents word
(the number of r tag is NOT fixed and changes from row to row of xml_table)
The idea is to go through each row and, within each XML, calculate the longest consecutive duration for which the w attribute is 'SIL' or 's', and then return this in a new table as longest_pause (i.e. the longest consecutive SIL/s duration), together with word_id and word_data.
In the example XML above there are three runs of consecutive SIL/s entries, with total durations 100 (100), 900 (350 + 550) and 350 (200 + 50 + 100), so the longest_pause is 900 and 900 should be returned.
I was wondering if anybody could help with this. So far I have:
DECLARE @xml XML
DECLARE @ordered_table TABLE (id VARCHAR(20) NOT NULL, start_time INT NOT NULL, duration INT NOT NULL, word VARCHAR(50) NOT NULL)
SELECT @xml = (SELECT word_data FROM xml_table where word_id = 1)
INSERT into @ordered_table(id, start_time, duration, word)
SELECT 'NAME' AS id, Tbl.Col.value('@s', 'INT'), Tbl.Col.value('@d', 'INT'), Tbl.Col.value('@w', 'varchar(50)') FROM @xml.nodes('/results/channel[@id="1"]/r') Tbl(Col)
i.e. I have created a table to put the XML into, but I do not know where to go from there.
Can somebody please help?
Thank you :)

Your attempt looks like you want to find the longest duration for one XML, but the text suggests that you want to find the row in xml_table that has the longest duration.
Working with a single XML instance and a modified version of your table variable, you could do it like this.
DECLARE @xml XML = '
<results>
<channel id="1">
<r s="0" d="100" w="SIL"/>
<r s="100" d="250" w="Sorry"/>
<r s="350" d="100" w="WHO"/>
<r s="450" d="350" w="SIL"/>
<r s="800" d="550" w="SIL"/>
<r s="1350" d="100" w="Hello"/>
<r s="1450" d="200" w="s"/>
<r s="1650" d="50" w="SIL"/>
<r s="1700" d="100" w="SIL"/>
</channel>
</results>';
DECLARE @ordered_table TABLE
(
id INT NOT NULL,
start_time INT NOT NULL,
duration INT NOT NULL,
word VARCHAR(50) NOT NULL
);
INSERT INTO @ordered_table(id, start_time, duration, word)
SELECT row_number() over(order by Tbl.Col.value('@s', 'INT')),
Tbl.Col.value('@s', 'INT'),
Tbl.Col.value('@d', 'INT'),
Tbl.Col.value('@w', 'varchar(50)')
FROM @xml.nodes('/results/channel[@id="1"]/r') Tbl(Col);
WITH C AS
(
SELECT T.id,
CASE WHEN T.word IN ('S', 'SIL') THEN T.duration ELSE 0 END AS Dur
FROM @ordered_table as T
WHERE T.ID = 1
UNION ALL
SELECT T.id,
CASE WHEN T.word IN ('S', 'SIL') THEN C.Dur + T.duration ELSE 0 END AS Dur
FROM @ordered_table as T
INNER JOIN C
ON T.ID = C.ID + 1
)
SELECT TOP(1) *
FROM C
ORDER BY C.Dur DESC;
I added an ID field that is used in a recursive CTE to walk through the nodes and calculate a running sum where w is SIL or s. The longest duration is then fetched from the CTE using TOP(1) ... ORDER BY.
If you instead want the longest duration for each row in xml_table, you can do it like this.
with C as
(
select 1 as node,
X.word_id,
X.word_data,
case when T.W in ('S', 'SIL') then T.D else 0 end as duration
from dbo.xml_table as X
cross apply (select X.word_data.value('(/results/channel[@id = "1"]/r/@d)[1]', 'int'),
X.word_data.value('(/results/channel[@id = "1"]/r/@w)[1]', 'nvarchar(100)')) as T(D, W)
union all
select C.node + 1,
X.word_id,
X.word_data,
case when T.W in ('S', 'SIL') then T.D + C.duration else 0 end as duration
from C
inner join dbo.xml_table as X
on X.word_id = C.word_id
cross apply (select X.word_data.value('(/results/channel[@id = "1"]/r/@d)[sql:column("C.Node")+1][1]', 'int'),
X.word_data.value('(/results/channel[@id = "1"]/r/@w)[sql:column("C.Node")+1][1]', 'nvarchar(100)')) as T(D, W)
where T.W is not null
)
select T.word_id,
T.word_data,
T.duration
from
(
select row_number() over(partition by C.word_id order by C.duration desc) as rn,
C.word_id,
C.word_data,
C.duration
from C
) as T
where T.rn = 1
option (maxrecursion 0);
The recursive CTE part works the same as before, but for multiple rows at the same time, and it gets the value for duration directly from the XML using the column node, which is incremented on each iteration. The query against the CTE uses row_number() to find the longest duration for each row.

Have you considered using something like Python instead?
You can query the database to get the data, use regular expressions to extract the values from the XML, calculate the value you want, and then insert it back into the results table.
I recently did something similar and found that doing the processing in Python was much easier, if that is an option for you.
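A minimal sketch of what the Python side could look like, under some assumptions: it uses the standard library's xml.etree.ElementTree instead of regular expressions, compares the w attribute case-insensitively (as a case-insensitive SQL Server collation would), and the names longest_pause and SAMPLE are made up for illustration. Fetching the rows and writing the result back to the database is left out.

```python
import xml.etree.ElementTree as ET

# Words that count as a pause; compared in lower case (assumption).
PAUSE_WORDS = {"sil", "s"}

# The example XML from the question.
SAMPLE = """
<results>
<channel id="1">
<r s="0" d="100" w="SIL"/>
<r s="100" d="250" w="Sorry"/>
<r s="350" d="100" w="WHO"/>
<r s="450" d="350" w="SIL"/>
<r s="800" d="550" w="SIL"/>
<r s="1350" d="100" w="Hello"/>
<r s="1450" d="200" w="s"/>
<r s="1650" d="50" w="SIL"/>
<r s="1700" d="100" w="SIL"/>
</channel>
</results>
"""


def longest_pause(word_data: str, channel_id: str = "1") -> int:
    """Return the longest run of consecutive SIL/s durations in one XML string."""
    root = ET.fromstring(word_data)
    best = 0
    for channel in root.findall("channel"):
        if channel.get("id") != channel_id:
            continue
        # Walk the <r> elements in start-time order.
        rows = sorted(channel.findall("r"), key=lambda r: int(r.get("s")))
        run = 0
        for r in rows:
            if r.get("w", "").lower() in PAUSE_WORDS:
                run += int(r.get("d"))   # extend the current pause
                best = max(best, run)
            else:
                run = 0                  # a real word ends the pause
    return best


print(longest_pause(SAMPLE))  # 900
```

Each (word_id, word_data) row fetched from xml_table would be passed through longest_pause in the same way.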

Related

Using recursion with a CTE in SQL Server

I have the following table structure (this is just a sample set with exactly the same columns as my final output query).
The actual data has many more rows per index, and I have to remove a few symbols before arriving at the index value. This is a custom index to be built for internal use.
The SQL data is in this fiddle:
https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=b1d5ed7db79c665d8cc179ae4cc7d4f1
(Image of the same data.)
I want to calculate each symbol's point contribution to the index value and, finally, the index value itself.
The formula for the point contribution of each symbol is:
ptsC = yesterday_index * wt * px_change / yest_close
I do not have a starting value for yesterday's index, i.e. for 17 Nov 2021; it should be taken as 1000.
The index value for 18 Nov will then be 1000 + sum(ptsC).
This value should then be used to calculate ptsC for each symbol for 22 Nov, and so on.
I am trying to write a recursive CTE but am not sure where I am going wrong.
Yesterday's index value should be determined recursively, and ptsC calculated from it.
The final output should show, for each day, the total point contribution (the sum of all the ptsC for the day) and the new index value (yesterday's index value + the total point contribution).
Below is the code I have which generates the first table:
declare @beginval as float=17671.65
set @beginval=1000
declare @indexname varchar(20)='NIFTY ENERGY'
declare @mindt as datetime
select @mindt=min(datetime) from indices_json where indexname=@indexname
;
with tbl as (
SELECT IndexName, datetime, sum(Indexmcap_today) totalMcap_today,sum(Indexmcap_yst) totalmcap_yst
FROM indices_json
WHERE IndexName = @indexname
group by indexname,datetime
)
,tbl2 as
(
select j.indexname,j.datetime,symbol,Indexmcap_today/d.totalMcap_today*100 calc_wt_today,Indexmcap_yst/d.totalmcap_yst*100 calc_wt_yest,iislPtsChange,adjustedClosePrice,pointchange
from indices_json j inner join tbl d on d.datetime=j.datetime and d.IndexName=j.IndexName
)
,tbl3 as
(
select indexname,datetime,symbol,calc_wt_today,calc_wt_yest,iislPtsChange,adjustedClosePrice,pointchange
,case when datetime=@mindt then @beginval*calc_wt_yest*iislPtsChange/adjustedClosePrice/100 else null end ptsC
from tbl2
)
,tbl4 as
(
select indexname,datetime,sum(ptsC) + @beginval NewIndexVal,sum(pointchange) PTSCC
from tbl3
group by indexname,datetime
)
,tbl5 as
(
select *,lag(datetime,1,null) over(order by datetime asc) yest_dt
from tbl4
)
,
tbl6 as
(
select d.*,s.yest_dt
from tbl2 d inner join tbl5 s on d.datetime=s.datetime
)
,tbl7 as
(
select d.IndexName,d.datetime,d.symbol,d.calc_wt_today,d.calc_wt_yest,d.iislPtsChange,d.adjustedClosePrice,d.pointchange,case when i.datetime is null then @beginval else i.NewIndexVal end yest_index
from tbl6 d left join tbl4 i on d.yest_dt=i.datetime
)
select IndexName,convert(varchar(12),datetime,106)date,symbol,round(calc_wt_yest,4) wt,iislPtsChange px_change,adjustedClosePrice yest_close--,pointchange,yest_index
from tbl7 d where datetime <='2021-11-24'
order by datetime
Thanks in advance.
I found a solution for this:
I calculated the return of each constituent for each date,
then summed these returns per date,
then multiplied together (1 + the summed return) for all dates up to each date to arrive at the final value. This works.
Below is the query. I did not need recursion here.
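Why no recursion is needed: since ptsC = yesterday_index * wt * px_change / yest_close, each day's index is yesterday's index times (1 + the sum of wt * px_change / yest_close), so the index is the starting value times a running product of those daily factors. A minimal Python sketch of that equivalence, assuming weights are given as fractions (the SQL uses percentages and divides by 100); the function names and the DAYS data are made up for illustration.

```python
import math

# Hypothetical data: one list per day of (wt, px_change, yest_close) per symbol.
DAYS = [
    [(0.6, 2.0, 100.0), (0.4, -1.0, 50.0)],
    [(0.6, 1.0, 102.0), (0.4, 0.5, 49.0)],
]


def index_by_recursion(begin, days):
    """Carry yesterday's index forward and add the day's point contributions."""
    index, out = begin, []
    for day in days:
        pts = sum(index * wt * px_change / yest_close
                  for wt, px_change, yest_close in day)
        index += pts
        out.append(index)
    return out


def index_by_compounding(begin, days):
    """Running product of (1 + summed return), computed as exp(sum(log(...)))
    in the same way as the windowed SQL expression."""
    log_sum, out = 0.0, []
    for day in days:
        tot_ret = 1.0 + sum(wt * px_change / yest_close
                            for wt, px_change, yest_close in day)
        log_sum += math.log(tot_ret)  # requires tot_ret > 0
        out.append(begin * math.exp(log_sum))
    return out


recursive = index_by_recursion(1000.0, DAYS)
compounded = index_by_compounding(1000.0, DAYS)
print(all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(recursive, compounded)))  # True
```

The exp(sum(log(...))) form is only valid while every daily factor is positive, which holds as long as the index does not lose 100% or more in a single day.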
declare @beginval as float=17671.65
declare @indexname varchar(20)='NIFTY 50'
declare @mindt as datetime
select @mindt=min(datetime) from indices_json where indexname=@indexname
declare @startdt as datetime = '2021-11-01'
;
with tbl as (
SELECT IndexName, datetime, sum(Indexmcap_today) totalMcap_today,sum(Indexmcap_yst) totalmcap_yst
FROM indices_json
WHERE IndexName = @indexname-- and symbol!='AXISBANK'
group by indexname,datetime
)
,tbl2 as
(
select j.indexname,j.datetime,symbol,Indexmcap_today/d.totalMcap_today*100 calc_wt_today,Indexmcap_yst/d.totalmcap_yst*100 calc_wt_yest,iislPtsChange,adjustedClosePrice,pointchange
from indices_json j inner join tbl d on d.datetime=j.datetime and d.IndexName=j.IndexName
)
,tbl7 as
(
select d.IndexName,d.datetime,d.symbol,d.calc_wt_today,d.calc_wt_yest,d.iislPtsChange,d.adjustedClosePrice,d.pointchange, d.calc_wt_yest*d.iislPtsChange/d.adjustedClosePrice/100 ret
from tbl2 d
)
,tbl8 as
(
select indexname,datetime,1+sum(ret) tot_ret from tbl7 group by indexname,datetime
)
select indexname,datetime date
,round(exp(sum(log(sum(tot_ret))) over (partition by IndexName order by datetime)),6)*@beginval final_Ret
from tbl8 where datetime>=@startdt
group by indexname,datetime order by date

Parse Columns in SQL by delimiter

In SQL Server, I have a table/view that has multiple columns. The last column looks like this:
COL
---------------------------------
|test|test|test11|testing|final
|test|test|test1|testing2|final3
|test|test|test17|testing|final6
How do I parse this column by | and append the parts to the right side of the existing table, like this:
COL1 COL2 COL Parse1 Parse2 Parse3 Parse4 Parse5
1 4 |test|test|test11|testing|final test test test11 testing final
2 6 |test|test|test1|testing2|final3 test test test1 testing2 final3
5 9 |test|test|test17|testing|final6 test test test17 testing final6
Every row has the same number of parts in column COL.
Any help would be great thanks!
It is not clear whether you have a leading | in the field COL. If so, you may want to shift /x[n] by one.
The pattern is pretty clear and easy to expand or contract as necessary.
Example
Declare @YourTable Table ([COL] varchar(50))
Insert Into @YourTable Values
('test|test|test11|testing|final')
,('test|test|test1|testing2|final3')
,('test|test|test17|testing|final6')
Select A.*
,B.*
From @YourTable A
Cross Apply (
Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
,Pos8 = ltrim(rtrim(xDim.value('/x[8]','varchar(max)')))
,Pos9 = ltrim(rtrim(xDim.value('/x[9]','varchar(max)')))
From (Select Cast('<x>' + replace((Select replace(A.Col,'|','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as B1
) B

Dynamic Columns - SQL Server 2012

Having two tables, I want to convert some rows to columns. My database engine is Microsoft SQL Server. The image below illustrates my desired result.
Your question is not very clear, but it seems you want to use SQL PIVOT.
Sample Data
DECLARE @tblModule TABLE(modId INT,name VARCHAR(200))
DECLARE @tblProfile TABLE(id INT,modId INT,profil VARCHAR(200))
INSERT INTO @tblModule
SELECT 1,'Manteniminento' UNION
SELECT 2 , 'Soporte'
INSERT INTO @tblProfile
SELECT 1,1,'Administrador' UNION
SELECT 2,2 , 'Empleado' UNION
SELECT 3,1 , 'Empleado' UNION
SELECT 4,1 , 'Empleado' UNION
SELECT 5,1 , 'Administrador' UNION
SELECT 6,1 , 'Administrador'
Main query
SELECT name,SUM([Administrador]) AS Administrador, SUM([Empleado]) AS Empleado
FROM
(SELECT id,p.modId,m.name,p.profil
FROM @tblProfile p
INNER JOIN @tblModule m ON m.modId = p.modId) AS SourceTable
PIVOT
(
COUNT(modId)
FOR profil IN ([Administrador], [Empleado])
) AS PivotTable
GROUP BY name
Result
name Administrador Empleado
Manteniminento 3 2
Soporte 0 1
I found the solution to what I needed, so I am sharing it here. I made use of SQL STUFF:
SELECT A.codEmpresa
,A.nomEmpresa
,A.codSistema
,A.nomSistema
,A.codPerfil
,A.nomPerfil
,modulos = STUFF((SELECT DISTINCT ', ' + M.nomModulo
FROM smpseg.[0004] R
JOIN smpseg.[0014] SMOP
ON SMOP.codSistema = R.codSistema
AND SMOP.codModulo = R.codModulo
AND SMOP.codPerfil = A.codPerfil
AND SMOP.objDefault = CAST(1 AS BIT)
JOIN smpseg.[0011] O
ON O.codSistema = SMOP.codSistema
AND O.codModulo = SMOP.codModulo
AND O.codObjeto = SMOP.codObjeto
JOIN smpseg.[0010] M
ON M.codSistema = O.codSistema
AND M.codModulo = O.codModulo
WHERE R.codUsuario = @p_codUsuario
FOR XML PATH('')), 1, 2, '')
FROM smpseg.[0004] A
JOIN smpseg.[0016] U
ON U.codUsuario = A.codUsuario

Issue in complicated join

I have 4 tables
tbLicenceTypesX (2 Fields)
LicenceTypes
LicenceTypesX
tbLicenceTypesX (Contains data like)
1 - Medical Licence
2 - Property
3 - Casualty
4 - Trainning Licence
tbProduct (4 fields)
Product
ProductX
CompanyId (F.K)
LicenceTypes(F.K)
tbProduct (Contains data like)
1 - T.V - 10 - 2
2 - A.C - 30 - 3
3 - Mobiles - 40 -4
tbLicence (3 fields)
Licence
LicenceTypesNames
AgentId
tbLicence (Contains data like)
1 - Property, Casualty - 23
2 - Trainning Licence, Casualty - 34
Now I have to fetch Product and ProductX from tbProduct where LicenceTypes matches the agent's licence in tbLicence, within a company.
For example: I have to fetch T.V, whose LicenceTypes is 2 ("Property") and whose CompanyId is 10, for the agent whose AgentId is 23 and whose LicenceTypesNames must also contain "Property".
I want to write something like:
@CompanyId int,
@AgentId int
As
SELECT p.ProductX,p.Product
from tbProduct p
inner join tbLicence l on p.LicenceTypes = l.LicenceTypesNames<its corresponding Id>
inner join tbProduct c on c.Product =p.Product
where
c.CompanyId=@CompanyId
and l.AgentId=@AgentId
Please help me!!!
You can use XML and CROSS APPLY to split the comma-separated values and JOIN the result with tbProduct. The LTRIM and RTRIM functions trim any extra whitespace around the comma-separated values. The code below gives you the desired output.
DECLARE @CompanyId int = 30, @AgentId int = 23
;WITH CTE AS
(
SELECT AgentId, TCT.LicenceTypes FROM
(
SELECT AgentId, LTRIM(RTRIM(Split.XMLData.value('.', 'VARCHAR(100)'))) LicenceTypesNames FROM
(
SELECT AgentID, Cast ('<M>' + REPLACE(LicenceTypesNames, ',', '</M><M>') + '</M>' AS XML) AS Data
FROM tbLicence
) AS XMLData
CROSS APPLY Data.nodes ('/M') AS Split(XMLData)
)
AS LTN
JOIN tbLicenceTypesX TCT ON LTN.LicenceTypesNames = tct.LicenceTypesX
)
SELECT p.ProductX,p.Product
FROM tbProduct P
JOIN CTE c on p.LicenceTypes = c.LicenceTypes
WHERE CompanyId = @CompanyId
AND AgentId = @AgentId

Update field in table

I have the following table PNLReference
PnlId LineTotalisationId Designation TypeTotalisation Totalisation
1 A Gross Fees Formule A01+A02+A03+A04+A05
2 A01 GF1 Comptes imputables 975800|758000|706900|706000|706430|706420|706410|706400|706530|706520|706510|706001|706401|706431|706531|706902
3 A02 GF2 Comptes imputables 706500|709400|706130|706120|706110|706100|706830|706820|706810|706800|706730|706720|706710|706700|706330|706101|706131|706331|706501|706701|706801|706831|709401|706731
I have filled the table DimPNL as follows:
INSERT [dbo].[DimPNL] (
PNLCode
,PNLName
,PNLParentId
,Operator
)
SELECT *
FROM (
SELECT t.c.value('.', 'nvarchar(255)') AS PNLCode
,Ref.Designation AS PNLName
,split.LineTotalisationId AS PNLParentId
,split.Operator AS Operator
FROM (
SELECT tbl.Designation
,tbl.LineTotalisationId
,tbl.TypeTotalisation
,tbl.PnlId
,tbl.Totalisation
,CAST('<t>' + REPLACE(tbl.Totalisation, tbl.Operator, '</t><t>') + '</t>' AS XML) x
,tbl.Operator
FROM ##TTResults AS tbl
) split
CROSS APPLY x.nodes('/t') t(c)
INNER JOIN [dbo].[PNLReference] Ref
ON Ref.LineTotalisationId = t.c.value('.', 'nvarchar(255)')
) Result
The table DimPNL contains a field Sign which has to be filled as follows: if all the numbers in Totalisation in table PNLReference start with 7, the sign is -1; otherwise the sign is 1. How can I do this? Any ideas?
Using CASE WHEN LEFT(Totalisation,1)='7' THEN -1 ELSE 1 END [SIGN] gives you a field on which you can take MAX([SIGN]); if the maximum is still -1, then all the values started with 7.
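The MAX idea can be sanity-checked outside the database. A minimal Python sketch, assuming the rule is applied to each value after splitting Totalisation on its operator; the function name sign_for and the default for an empty list are assumptions, and the sample strings are shortened from the PNLReference rows in the question.

```python
def sign_for(totalisation: str, operator: str = "|") -> int:
    """Return -1 if every value in the delimited list starts with '7', else 1."""
    values = [v.strip() for v in totalisation.split(operator) if v.strip()]
    if not values:
        return 1  # assumption: an empty list gets the default sign
    # Per-value sign, then MAX: a single value not starting with 7 lifts it to 1.
    return max(-1 if v.startswith("7") else 1 for v in values)


print(sign_for("975800|758000|706900"))  # 1  (975800 does not start with 7)
print(sign_for("706500|709400|706130"))  # -1 (all values start with 7)
```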
