SQL - String Manipulation - sql-server

Context:
I have a view in SQL Server that tracks parameters a user inputs when they run an SSRS report (ReportServer.dbo.ExecutionLog). About 50 report parameters are saved as a string in a single column with ntext datatype. I would like to break this single column up into multiple columns for each parameter.
Details:
I query the report parameters like this:
SELECT ReportID, [Parameters]
FROM ReportServer.dbo.ExecutionLog
WHERE ReportID in (N'redacted')
and [Status] in (N'rsSuccess')
ORDER BY TimeEnd DESC
And here's a small subset of what the results look like:
alpha=123&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM
Questions:
How can I get the results to look like this:
SQL Server 2017 is Python friendly. Is Python a better language to use in this scenario just for parsing purposes?
I've seen similar topics posted here, here & here. The parameters are dynamic so parsing via SQL string functions that involve counting characters doesn't apply. This question is relevant to more people than just me because there's a large population of people using SSRS. Tracking & formatting parameters in a more digestible way is valuable for all users of SSRS.

Here is a way using the built-in STRING_SPLIT. I'm not sure what the logic is for the part AFTER the date, so I would have discarded it, but I left it in for you to decide.
declare @table table (ReportID int identity(1,1), [Parameters] varchar(8000))
insert into @table
values
('alpha=123&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM')
,('alpha=457893&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM')
select
ReportID
,[Parameters]
,alpha = max(iif(value like 'alpha%',substring(value,charindex('=',value) + 1,99),null))
,bravo = max(iif(value like 'bravo%',substring(value,charindex('=',value) + 1,99),null))
,charlie = max(iif(value like 'charlie%',substring(value,charindex('=',value) + 1,99),null))
,delta = max(iif(value like 'delta%',substring(value,charindex('=',value) + 1,99),null))
,echo = max(iif(value like 'echo%',substring(value,charindex('=',value) + 1,99),null))
,foxtrot = max(iif(value like 'foxtrot%',substring(value,charindex('=',value) + 1,99),null))
from @table
cross apply string_split(replace(replace([Parameters],'%2C',','),'%2F','/'),'&')
group by ReportID, [Parameters]
Or, if they aren't static you can use a dynamic pivot. It'll take some massaging to get your columns in the correct order.
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX);
SET @cols = STUFF((SELECT distinct ',' + QUOTENAME(substring([value],0,charindex('=',[value])))
from myTable
cross apply string_split(replace(replace([Parameters],'%2C',','),'%2F','/'),'&')
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
select @cols
set @query = 'SELECT ReportID, ' + @cols + ' from
(
select ReportID
, ColName = substring([value],0,charindex(''='',[value]))
, ColVal = substring([value],charindex(''='',[value]) + 1,99)
from myTable
cross apply string_split(replace(replace([Parameters],''%2C'','',''),''%2F'',''/''),''&'')
) x
pivot
(
max(ColVal)
for ColName in (' + @cols + ')
) p '
execute(@query)

Split the string on the ampersand character.
Further split each row into two columns on the equals character.
In the second column, replace %2C with the comma character, and %2F with the forward-slash character, and so on with any other replacements as needed.
Use a dynamic-pivot to query the above in the format that you want.
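Since the question asks whether Python would be easier for the parsing part, here is a minimal sketch of the first three steps using only the standard library. It assumes the parameter string is ordinary URL query-string encoding, which the sample data suggests; the function name parse_report_parameters is made up for illustration, and running it inside SQL Server is not shown.

```python
from urllib.parse import parse_qsl

def parse_report_parameters(raw: str) -> dict:
    """Split an SSRS parameter string into a {name: decoded value} dict."""
    # parse_qsl splits on '&', then on '=', and percent-decodes both parts.
    # keep_blank_values=True keeps parameters that were left empty.
    return dict(parse_qsl(raw, keep_blank_values=True))

sample = ("alpha=123&bravo=9%2C33%2C76%2C23&charlie=91&delta=29"
          "&echo=11%2F2%2F2018%2012%3A00%3A00%20AM")
params = parse_report_parameters(sample)
print(params["bravo"])  # 9,33,76,23
print(params["echo"])   # 11/2/2018 12:00:00 AM
```

Note that building a dict keeps only the last value when a parameter name repeats; if that can happen in your logs, keep the list of pairs that parse_qsl returns instead.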

Here's a method that starts with a lot of replaces.
To url-decode the string and transform it into an XML type.
Then it uses the XML functions to get the values for the columns.
Example snippet:
declare @Table table ([Parameters] varchar(200));
insert into @Table ([Parameters]) values
('alpha=123&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM');
select
x.query('/x[key="alpha"]/val').value('.', 'int') as alpha,
x.query('/x[key="bravo"]/val').value('.', 'varchar(30)') as bravo,
x.query('/x[key="charlie"]/val').value('.', 'varchar(30)') as charlie,
x.query('/x[key="delta"]/val').value('.', 'varchar(30)') as delta,
convert(date, x.query('/x[key="echo"]/val').value('.', 'varchar(30)'), 103) as echo,
convert(date, x.query('/x[key="foxtrot"]/val').value('.', 'varchar(30)'), 103) as foxtrot
from @Table
cross apply (select cast('<x><key>'+
replace(replace(replace(replace(replace(
replace([Parameters],
'%2C',','),
'%2F','/'),
'%20',' '),
'%3A',':'),
'=','</key><val>'),
'&','</val></x><x><key>')
+'</val></x>' as XML) as x) ca

Related

SQL Server Regular expression extract pattern from DB column

I have a question about SQL Server: I have a database column with a pattern which is like this:
1. up to 10 digits
2. then a comma
3. up to 10 digits
4. then a semicolon
e.g.
100000161, 100000031; 100000243, 100000021;
100000161, 100000031; 100000243, 100000021;
and I want to extract, within the pattern, the first digits (up to 10) (1.) followed by a semicolon (4.)
(or, in other words, remove everything from the comma up to the next semicolon)
100000161; 100000243; 100000161; 100000243;
Can you please advise me how to do this in SQL Server? I'm not very familiar with regex and therefore have no clue how to fix this.
Thanks,
Alex
Try this
Declare @Sql Table (SqlCol nvarchar(max))
INSERT INTO @Sql
SELECT '100000161,100000031;100000243,100000021;100000161,100000031;100000243,100000021;'
;WITH cte
AS (SELECT Row_number()
OVER(
ORDER BY (SELECT NULL)) AS Rno,
split.a.value('.', 'VARCHAR(1000)') AS Data
FROM (SELECT Cast('<S>'
+ Replace( Replace(sqlcol, ';', ','), ',',
'</S><S>')
+ '</S>' AS XML) AS Data
FROM @Sql) AS A
CROSS apply data.nodes('/S') AS Split(a))
SELECT Stuff((SELECT '; ' + data
FROM cte
WHERE rno%2 <> 0
AND data <> ''
FOR xml path ('')), 1, 2, '') AS ExpectedData
ExpectedData
-------------
100000161; 100000243; 100000161; 100000243
I believe this will get you what you are after as long as that pattern truly holds. If not it's fairly easy to ensure it does conform to that pattern and then apply this
Select Substring(TargetCol, 1, 10) + ';' From TargetTable
You can take advantage of SQL Server's XML support to convert the input string into an XML value and query it with XQuery and XPath expressions.
For example, the following query replaces each ; with </b><a> and each , with </a><b> to turn each string into <a>100000161</a><a>100000243</a><a />. After that, you can select individual <a> nodes with /a[1], /a[2]:
declare @table table (it nvarchar(200))
insert into @table values
('100000161, 100000031; 100000243, 100000021;'),
('100000161, 100000031; 100000243, 100000021;')
select
xCol.value('/a[1]','nvarchar(200)'),
xCol.value('/a[2]','nvarchar(200)')
from (
select convert(xml, '<a>'
+ replace(replace(replace(it,';','</b><a>'),',','</a><b>'),' ','')
+ '</a>')
.query('a') as xCol
from @table) as tmp
-------------------------
A1 A2
100000161 100000243
100000161 100000243
value extracts a single value from an XML field. nodes returns a table of nodes that match the XPath expression. The following query will return all "keys" :
select
a.value('.','nvarchar(200)')
from (
select convert(xml, '<a>'
+ replace(replace(replace(it,';','</b><a>'),',','</a><b>'),' ','')
+ '</a>')
.query('a') as xCol
from @table) as tmp
cross apply xCol.nodes('a') as y(a)
where a.value('.','nvarchar(200)')<>''
------------
100000161
100000243
100000161
100000243
With 200K rows of data though, I'd seriously consider transforming the data when loading it and storing it in individual, indexable columns, or adding a separate, related table. Applying string manipulation functions on a column means that the server can't use any covering indexes to speed up queries.
If that's not possible (why?) I'd consider at least adding a separate XML-typed column that would contain the same data in XML form, to allow the creation of an XML index.
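For comparison, if the transformation is done while loading the data outside SQL Server, the pattern from the question maps directly onto a regular expression. This is only a sketch under that assumption; the names PAIR and keep_first_numbers are hypothetical.

```python
import re

# One "pair": up to 10 digits, a comma, up to 10 digits, a semicolon.
# Optional whitespace is allowed around the comma and before the semicolon.
PAIR = re.compile(r"(\d{1,10})\s*,\s*\d{1,10}\s*;")

def keep_first_numbers(text: str) -> str:
    # Replace each matched pair with just its first number and the semicolon.
    return PAIR.sub(r"\1;", text)

row = "100000161, 100000031; 100000243, 100000021;"
print(keep_first_numbers(row))  # 100000161; 100000243;
```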

I am not getting values by passing variable using IN query in SQL

I am passing a string value such as '12th Standard/Ordinary National Diploma,Higher National Diploma' from my code to a SQL query, but the query returns no rows.
My SQL query:
declare @qua varchar(250),@final varchar(250),@Qualification varchar(250)
set @Qualification= '12th Standard/Ordinary National Diploma,Higher National Diploma'
set @qua =replace(@Qualification,',',''',''')
set @final= ''''+@qua+''''
select * from mytablename in(@final)
Result: Data is not displaying
Thank you in advance.
Instead do it using a table variable like
declare @tbl table(qual varchar(250));
insert into @tbl
select '12th Standard/Ordinary National Diploma'
union
select 'Higher National Diploma';
select * from mytablename where somecolumn in(select qual from @tbl);
Despite trying to put quote marks in there, you're still only passing a single string to the IN. The string just contains embedded quotes and SQL Server is looking for that single long string.
You also don't seem to be comparing a column for the IN.
Your best bet is to pass in multiple string variables, but if that's not possible then you'll have to write a function that parses a single string into a resultset and use that. For example:
SELECT
Column1, -- Because we never use SELECT *
Column2
FROM
MyTableName
WHERE
qualification IN (SELECT qualification FROM dbo.fn_ParseString(@qualifications))
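The difference between one string that contains commas and a list of separate values can be seen in a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is an assumption made purely so the example runs anywhere; the table people and its rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, qualification TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("User1", "12th Standard/Ordinary National Diploma"),
     ("User2", "Some Degree"),
     ("User3", "Higher National Diploma")],
)

wanted = "12th Standard/Ordinary National Diploma,Higher National Diploma"

# One parameter: IN compares against the whole comma-containing string.
one_value = conn.execute(
    "SELECT name FROM people WHERE qualification IN (?)", (wanted,)
).fetchall()

# One parameter per item: split first, then build matching placeholders.
items = wanted.split(",")
placeholders = ", ".join("?" for _ in items)
many_values = conn.execute(
    f"SELECT name FROM people WHERE qualification IN ({placeholders}) ORDER BY name",
    items,
).fetchall()

print(one_value)    # []
print(many_values)  # [('User1',), ('User3',)]
```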
You can insert all your search criteria in one table and then can easily do a lookup on the main table, example below:
DECLARE @MyTable TABLE (Name VARCHAR(10), Qualification VARCHAR(50))
DECLARE @Search TABLE (Qualifications VARCHAR(50))
INSERT INTO @MyTable VALUES ('User1','12th Standard'), ('User2','Some Education'),
('User3','Ordinary National Diploma'), ('User4','Some Degree'),
('User5','Higher National Diploma')
INSERT INTO @Search VALUES ('12th Standard'),('Ordinary National Diploma'),('Higher National Diploma')
SELECT MT.*
FROM @MyTable MT
INNER JOIN (SELECT Qualifications FROM @Search) S ON S.Qualifications = MT.Qualification
As previously said, you are passing a single string that contains commas, not comma-separated values. It needs to be split up into separate values.
You can do this by passing the qualification string into XML which you can use to turn it into separate rows of data.
The IN parameter will then accept the data as separate values.
DECLARE @Qualifications as varchar(150) = '12th Standard/Ordinary National Diploma,Higher National Diploma'
Declare @Xml XML;
SET @Xml = N'<root><r>' + replace(@Qualifications, char(44),'</r><r>') + '</r></root>';
select *
from MyTableName
Where MyTableName.Qualification in
(select r.value('.','varchar(max)') as item
from @Xml.nodes('//root/r') as records(r))
Alternatively, you can create a table-valued function that splits the input on a delimiter (',' in your case) and then INNER JOIN its return column to the column you want to filter:
SELECT COLUMNS, . . . .
FROM MyTableName mtn
INNER JOIN dbo.FNASplitToTable(@qualifications, ',') csvTable
ON csvTable.returnColumnName = mtn.somecolumn
The table-valued function might look like this:
CREATE FUNCTION dbo.FNASplitToTable (@string varchar(MAX), @splitType CHAR(1))
RETURNS @result TABLE(Value VARCHAR(100))
AS
BEGIN
DECLARE @x XML
SELECT @x = CAST('<A>' + REPLACE(@string, @splitType, '</A><A>') + '</A>' AS XML)
INSERT INTO @result
SELECT LTRIM(t.value('.', 'VARCHAR(100)')) AS inVal
FROM @x.nodes('/A') AS x(t)
RETURN
END
GO

Stored procedure with inner join using coalesce

I have a simple table tblAllUsers which stores simple values like Name,Date Of Birth etc of a UserId.
Another table, tblInterest, stores the interest(s) of a UserId. A user may have any number of interests, and each is stored in a separate row:
Create table tblInterest
(
Id int primary key identity,
UserId varchar(10),
InterestId int,
Interest varchar(20)
)
So when I want to display all the interests of a particular user together, I use the query below:
DECLARE @listStr VARCHAR(MAX)
SELECT @listStr = COALESCE(@listStr + ', ' ,'') + Interest FROM tblInterest where UserId=@UserId
SELECT @listStr
Now I want to display a user's info from both these tables, with the interest(s) displayed in ONE string.
I have tried the below ;
Create proc spPlayersGridview
@listStr VARCHAR(MAX)
as
begin
Select tblAllUsers.Category, tblAllUsers.DOB, tblAllUsers.FirstName, tblAllUsers.LastName, tblAllUsers.City, tblAllUsers.State,
@listStr = COALESCE(@listStr + ', ' ,'') + tblInterest.Interest
from tblAllUsers
INNER JOIN tblInterest
ON tblAllUsers.UserId=tblInterest.UserId
where Category='Player'
end
This throws the exception "A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations."
I had a similar problem a while back, and a bit of SQL STUFF magic helps - Maybe it will work for you as well.
CREATE PROC spPlayersGridview
AS
BEGIN
SELECT
tblAllUsers.Category
, tblAllUsers.DOB
, tblAllUsers.FirstName
, tblAllUsers.LastName
, tblAllUsers.City
, tblAllUsers.State
, listStr = STUFF((
SELECT ',' + tblInterest.Interest
FROM tblInterest
WHERE tblAllUsers.UserId=tblInterest.UserId
ORDER BY tblInterest.Interest
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM tblAllUsers
WHERE Category='Player'
END
Hope it helps - For more reading look at: https://msdn.microsoft.com/en-us/library/ms188043.aspx

sql server stored proc dynamic select

I have a stored proc in the following format
create PROCEDURE [dbo].[test proc]
@identifier varchar(20),
@issuerName varchar(max),
@max_records int=1000
AS
BEGIN
declare @select nvarchar(30)
SELECT @identifier as '@identifier'
, (
SELECT
MoodysOrgID as '@MoodysOrgID'
,ReportDate as '@ReportDate'
,m.UpdateTime as '@UpdateTime'
,m.FileCreationDate as '@FileCreationDate'
from mfm_financial_ratios m
inner join mfm_financial_ratios_coa c on c.AcctNo = m.AcctNo
where ReportDate in (select distinct top (@max_records) reportdate from mfm_financial_ratios where MoodysOrgID = m.MoodysOrgID)
and m.MoodysOrgID=(select top 1 IssuerID_Moodys as id from loans where LIN=@identifier or LoanXID=@identifier
and ParentName_Moodys=@issuerName and IssuerID_Moodys is not null)
order by ReportDate desc
FOR XML PATH('FinRatios'), TYPE
)
FOR XML PATH('FinRatiosHistory')
END
but I would like to make my query execute as dynamic SQL
and my stored proc looks like
create PROCEDURE [dbo].[test proc]
@identifier varchar(20),
@issuerName varchar(max),
@max_records int=1000
AS
BEGIN
declare @select nvarchar(30)
set @select = N'SELECT @identifier as '@identifier'
, (
SELECT
MoodysOrgID as '@MoodysOrgID'
,ReportDate as '@ReportDate'
,m.UpdateTime as '@UpdateTime'
,m.FileCreationDate as '@FileCreationDate'
from mfm_financial_ratios m
inner join mfm_financial_ratios_coa c on c.AcctNo = m.AcctNo
where ReportDate in (select distinct top (@max_records) reportdate from mfm_financial_ratios where MoodysOrgID = m.MoodysOrgID)
and m.MoodysOrgID=(select top 1 IssuerID_Moodys as id from loans where LIN=@identifier or LoanXID=@identifier
and ParentName_Moodys=@issuerName and IssuerID_Moodys is not null)
order by ReportDate desc
FOR XML PATH('FinRatios'), TYPE
)
FOR XML PATH('FinRatiosHistory')'
exec @select
END
The following stored proc gives issues because of the comma used in it. Can someone let me know what would be the correct way of doing it?
The problem is not the commas. You mostly have two problems: one, you're not escaping the quotes correctly; and two, you're not concatenating your variables correctly. Here's an example of both:
For concatenating variables: In your first select line, you cannot do this:
SELECT @identifier as '@identifier'
because SQL does not know what to do with @identifier that way. You should concatenate the variable this way:
SELECT @identifier as ' + @identifier + '.. everything else goes here
Also, when you concatenate max_records, since it's an int variable you should cast it to varchar first, like this:
select distinct top (' + cast(@max_records as varchar(10)) + ') ....
Whenever you're using a variable in the middle of the string (such as @max_records) you HAVE to concatenate it in order for SQL to know it's a variable and not just a string. You didn't do it with @max_records, @issuerName, etc.
For escaping quotes: You need to escape your single quotes when you don't want your select string to unexpectedly end. For example here:
FOR XML PATH('FinRatiosHistory')'
You should escape each single quote by doubling it, that is, by writing two single quotes (search for "escaping single quotes in SQL" if this is unclear):
FOR XML PATH(''FinRatiosHistory'')'
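The quote-doubling rule can be checked mechanically. The sketch below builds a SQL string literal by doubling embedded single quotes and then round-trips it through Python's built-in sqlite3 module, used here only as a convenient stand-in engine (an assumption; it is not SQL Server, though both follow the same rule for single quotes inside literals). The helper name quote_literal is hypothetical.

```python
import sqlite3

def quote_literal(text: str) -> str:
    # Double every embedded single quote, then wrap the whole thing in quotes.
    return "'" + text.replace("'", "''") + "'"

inner = "FOR XML PATH('FinRatiosHistory')"
literal = quote_literal(inner)
print(literal)  # 'FOR XML PATH(''FinRatiosHistory'')'

# Round trip through a SQL engine: the doubled quotes collapse back to one.
conn = sqlite3.connect(":memory:")
value = conn.execute("SELECT " + literal).fetchone()[0]
print(value == inner)  # True
```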

How do I parse through a string that is dynamic?

How can I parse through dynamic strings to pull out data? Example Below: (Written in T-SQL for SQL Server 2008 R2)
The data to parse through:
BUSINESS=12^REFERENCE=9255^ACCOUNT_TYPE=SUPPLIER^SHIPPING_ID=ACHP^
I need the REFERENCE number when the ACCOUNT_TYPE=SUPPLIER.
The Reference Number can be from 1 to 16 characters in length.
My SQL-statement would look something like this:
SELECT <REFERENCE NUMBER> FROM ACCOUNTS
The result would look something like this:
9255
84
1
151221
415
99
etc...
You should normalize your data. But here is a solution.
declare @accounts table(col1 varchar(max))
insert @accounts values('BUSINESS=12^REFERENCE=9255^ACCOUNT_TYPE=SUPPLIER^SHIPPING_ID=ACHP^')
SELECT replace(data, 'REFERENCE=', '') FROM
(
SELECT t.c.value('.', 'VARCHAR(2000)') data
FROM (
SELECT x = CAST('<t>' +
REPLACE(col1, '^', '</t><t>') + '</t>' AS XML)
FROM @accounts
WHERE col1 like '%ACCOUNT_TYPE=SUPPLIER%'
) a
CROSS APPLY x.nodes('/t') t(c)
) x
WHERE data like 'REFERENCE=%'
Result:
9255
Fix your data; that will save you from a lot of grief.
This will give you the number you are looking for from each row containing the example you gave:
DECLARE @text VARCHAR(100) -- wide enough to hold the whole sample string without truncation
SET @text = 'BUSINESS=12^REFERENCE=9255^ACCOUNT_TYPE=SUPPLIER^SHIPPING_ID=ACHP^'
SELECT SUBSTRING(@text,CHARINDEX('^REFERENCE',@text)+11,
(CHARINDEX('^ACCOUNT_TYPE',@text))
- (CHARINDEX('^REFERENCE',@text)+11))
Just replace @text with the field name and you are good to go!
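If the parsing can happen outside the database, the same lookup does not depend on the order of the fields. Here is a minimal Python sketch, assuming every field has the form NAME=VALUE and fields are separated by ^ as in the sample; the function name reference_for_supplier is made up for illustration.

```python
def reference_for_supplier(record: str):
    """Return the REFERENCE value if ACCOUNT_TYPE is SUPPLIER, else None."""
    # Split on '^', skip pieces without '=' (such as the empty piece after
    # the trailing '^'), then split each piece once into a name and a value.
    fields = dict(part.split("=", 1) for part in record.split("^") if "=" in part)
    if fields.get("ACCOUNT_TYPE") == "SUPPLIER":
        return fields.get("REFERENCE")
    return None

row = "BUSINESS=12^REFERENCE=9255^ACCOUNT_TYPE=SUPPLIER^SHIPPING_ID=ACHP^"
print(reference_for_supplier(row))  # 9255
```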
