How do I parse through a string that is dynamic? - sql-server

How can I parse through dynamic strings to pull out data? Example Below: (Written in T-SQL for SQL Server 2008 R2)
The Date to parse through:
BUSINESS=12^REFERENCE=9255^ACCOUNT_TYPE=SUPPLIER^SHIPPING_ID=ACHP^
I need the REFERENCE number when the ACCOUNT_TYPE=SUPPLIER.
The Reference Number can be from 1 to 16 characters in length.
My SQL-statement would look something like this:
SELECT <REFERENCE NUMBER> FROM ACCOUNTS
The result would look something like this:
9255
84
1
151221
415
99
etc...

You should normalize your data. But here is a solution.
declare #accounts table(col1 varchar(max))
insert #accounts values('BUSINESS=12^REFERENCE=9255^ACCOUNT_TYPE=SUPPLIER^SHIPPING_ID=ACHP^')
SELECT replace(data, 'REFERENCE=', '') FROM
(
SELECT t.c.value('.', 'VARCHAR(2000)') data
FROM (
SELECT x = CAST('<t>' +
REPLACE(col1, '^', '</t><t>') + '</t>' AS XML)
FROM #accounts
WHERE col1 like '%ACCOUNT_TYPE=SUPPLIER%'
) a
CROSS APPLY x.nodes('/t') t(c)
) x
WHERE data like 'REFERENCE=%'
Result:
9255
Fix your data, that will save you from alot of grief

This will give you the number you are looking for from each row containing the example you gave:
DECLARE #text VARCHAR(50)
SET #text = 'BUSINESS=12^REFERENCE=9255^ACCOUNT_TYPE=SUPPLIER^SHIPPING_ID=ACHP^'
SELECT SUBSTRING(#text,CHARINDEX('^REFERENCE',#text)+11,
(CHARINDEX('^ACCOUNT_TYPE',#text))
- (CHARINDEX('^REFERENCE',#text)+11))
Just replace the #text with the field name and your are good to go!

Related

Split string to array using delimiter, getting second to last element in SELECT Statement

Heads!
In my database, I have a column that contains the following data (examples):
H-01-01-02-01
BLE-01-03-01
H-02-05-1.1-03
The task is to get the second to last element of the array if you would split that using the "-" character. The strings are of different length.
So this would be the result using the above mentioned data:
02
03
1.1
Basically I'm searching for an equivalent of the following ruby-statement for use in a Select-Statement in SQL-Server:
"BLE-01-03-01".split("-")[-2]
Is this possible in any way in SQL Server? After spending some time searching for a solution, I only found ones that work for the last or first element.
Thanks very much for any clues or solutions!
PS: Version of SQL Server is Microsoft SQL Server 2012
As an alternative you can try this:.
--A mockup table with some test data to simulate your issue
DECLARE #mockupTable TABLE (ID INT IDENTITY, YourColumn VARCHAR(50));
INSERT INTO #mockupTable VALUES
('H-01-01-02-01')
,('BLE-01-03-01')
,('H-02-05-1.1-03');
--The query
SELECT CastedToXml.value('/x[sql:column("CountOfFragments")-1][1]','nvarchar(10)') AS TheWantedFragment
FROM #mockupTable t
CROSS APPLY(SELECT CAST('<x>' + REPLACE(t.YourColumn,'-','</x><x>') + '</x>' AS XML))A(CastedToXml)
CROSS APPLY(SELECT CastedToXml.value('count(/x)','int')) B(CountOfFragments);
The idea in short:
The first APPLY will transform the string to a XML like this
<x>H</x>
<x>01</x>
<x>01</x>
<x>02</x>
<x>01</x>
The second APPLY will xquery into this XML to get the count of fragments. As APPLY will add this as a column to the result set, we can use the value using sql:column() to get the wanted fragment by its position.
As I wrote in my comment - using charindex with reverse.
First, create and populate sample table (Please save us this step in your future questions):
DECLARE #T AS TABLE
(
Col Varchar(100)
);
INSERT INTO #T (Col) VALUES
('H-01-01-02-01'),
('BLE-01-03-01'),
('H-02-05-1.1-03');
The query:
SELECT Col,
LEFT(RIGHT(Col, AlmostLastDelimiter-1), AlmostLastDelimiter - LastDelimiter - 1) As SecondToLast
FROM #T
CROSS APPLY (SELECT CharIndex('-', Reverse(Col)) As LastDelimiter) As A
CROSS APPLY (SELECT CharIndex('-', Reverse(Col), LastDelimiter+1) As AlmostLastDelimiter) As B
Results:
Col SecondToLast
H-01-01-02-01 02
BLE-01-03-01 03
H-02-05-1.1-03 1.1
Similar to Zohar's solution, but using CTEs instead of CROSS APPLY to prevent redundancy. I personally find this easier to follow, as you can see what happens in each step. Doesn't make it a better solution though ;)
DECLARE #strings TABLE (data VARCHAR(50));
INSERT INTO #strings VALUES ('H-01-01-02-01') , ('BLE-01-03-01'), ('H-02-05-1.1-03');
WITH rev AS (
SELECT
data,
REVERSE(data) AS reversed
FROM
#strings),
first_hyphen AS (
SELECT
data,
reversed,
CHARINDEX('-', reversed) + 1 AS first_pos
FROM
rev),
second_hyphen AS (
SELECT
data,
reversed,
first_pos,
CHARINDEX('-', reversed, first_pos) AS second_pos
FROM
first_hyphen)
SELECT
data,
REVERSE(SUBSTRING(reversed, first_pos, second_pos - first_pos)) AS result
FROM
second_hyphen;
Results:
data result
H-01-01-02-01 02
BLE-01-03-01 03
H-02-05-1.1-03 1.1
Try this
declare #input NVARCHAR(100)
declare #dlmt NVARCHAR(3);
declare #pos INT = 2
SET #input=REVERSE(N'H-02-05-1.1-03');
SET #dlmt=N'-';
SELECT
CAST(N'<x>'
+ REPLACE(
(SELECT REPLACE(#input,#dlmt,'#DLMT#') AS [*] FOR XML PATH(''))
,N'#DLMT#',N'</x><x>'
) + N'</x>' AS XML).value('/x[sql:variable("#pos")][1]','nvarchar(max)');

SQL - String Manipulation

Context:
I have a view in SQL Server that tracks parameters a user inputs when they run an SSRS report (ReportServer.dbo.ExecutionLog). About 50 report parameters are saved as a string in a single column with ntext datatype. I would like to break this single column up into multiple columns for each parameter.
Details:
I query the report parameters like this:
SELECT ReportID, [Parameters]
FROM ReportServer.dbo.ExecutionLog
WHERE ReportID in (N'redacted')
and [Status] in (N'rsSuccess')
ORDER BY TimeEnd DESC
And here's a small subset of what the results look like:
alpha=123&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM
Quesitons:
How can I get the results to look like this:
SQL Server 2017 is Python friendly. Is Python a better language to use in this scenario just for parsing purposes?
I've seen similar topics posted here, here & here. The parameters are dynamic so parsing via SQL string functions that involve counting characters doesn't apply. This question is relevant to more people than just me because there's a large population of people using SSRS. Tracking & formatting parameters in a more digestible way is valuable for all users of SSRS.
Here is a way using the built in STRING_SPLIT. I'm just not sure what the logic is for the stuff AFTER the date, so I would discarded it but I left it for you to decide.
DEMO
declare #table table (ReportID int identity(1,1), [Parameters] varchar(8000))
insert into #table
values
('alpha=123&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM')
,('alpha=457893&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM')
select
ReportID
,[Parameters]
,alpha = max(iif(value like 'alpha%',substring(value,charindex('=',value) + 1,99),null))
,bravo = max(iif(value like 'bravo%',substring(value,charindex('=',value) + 1,99),null))
,charlie = max(iif(value like 'charlie%',substring(value,charindex('=',value) + 1,99),null))
,delta = max(iif(value like 'delta%',substring(value,charindex('=',value) + 1,99),null))
,echo = max(iif(value like 'echo%',substring(value,charindex('=',value) + 1,99),null))
,foxtrot = max(iif(value like 'foxtrot%',substring(value,charindex('=',value) + 1,99),null))
from #table
cross apply string_split(replace(replace([Parameters],'%2C',','),'%2F','/'),'&')
group by ReportID, [Parameters]
Or, if they aren't static you can use a dynamic pivot. It'll take some massaging to get your columns in the correct order.
DEMO
DECLARE #cols AS NVARCHAR(MAX),
#query AS NVARCHAR(MAX);
SET #cols = STUFF((SELECT distinct ',' + QUOTENAME(substring([value],0,charindex('=',[value])))
from myTable
cross apply string_split(replace(replace([Parameters],'%2C',','),'%2F','/'),'&')
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
select #cols
set #query = 'SELECT ReportID, ' + #cols + ' from
(
select ReportID
, ColName = substring([value],0,charindex(''='',[value]))
, ColVal = substring([value],charindex(''='',[value]) + 1,99)
from myTable
cross apply string_split(replace(replace([Parameters],''%2C'','',''),''%2F'',''/''),''&'')
) x
pivot
(
max(ColVal)
for ColName in (' + #cols + ')
) p '
execute(#query)
Split the string on the ampersand character.
Further split each row into two columns on the equals character.
In the second column, replace %2C with the comma character, and %2F with the forward-slash character, and so on with any other replacements as needed.
Use a dynamic-pivot to query the above in the format that you want.
Here's a method that starts with a lot of replaces.
To url-decode the string and transform it into an XML type.
Then it uses the XML functions to get the values for the columns.
Example snippet:
declare #Table table ([Parameters] varchar(200));
insert into #Table ([Parameters]) values
('alpha=123&bravo=9%2C33%2C76%2C23&charlie=91&delta=29&echo=11%2F2%2F2018%2012%3A00%3A00%20AM&foxtrot=11%2F1%2F2030%2012%3A00%3A00%20AM');
select
x.query('/x[key="alpha"]/val').value('.', 'int') as alpha,
x.query('/x[key="bravo"]/val').value('.', 'varchar(30)') as bravo,
x.query('/x[key="charlie"]/val').value('.', 'varchar(30)') as charlie,
x.query('/x[key="delta"]/val').value('.', 'varchar(30)') as delta,
convert(date, x.query('/x[key="echo"]/val').value('.', 'varchar(30)'), 103)as echo,
convert(date, x.query('/x[key="foxtrot"]/val').value('.', 'varchar(30)'), 103) as foxtrot
from #Table
cross apply (select cast('<x><key>'+
replace(replace(replace(replace(replace(
replace([Parameters],
'%2C',','),
'%2F','/'),
'%20',' '),
'%3A',':'),
'=','</key><val>'),
'&','</val></x><x><key>')
+'</val></x>' as XML) as x) ca
Test on db<>fiddle here

How to remove a string that left of character `;` and the contained string `U` and then display it?

I have a table and the values like this
000001U;000002;000003U;000004;000005U;000006U
and I want display the field is like
000002;000004;
Try This
DECLARE #Table AS TABLE (Data nvarchar(1000))
INSERT INTO #Table
SELECT '000001U;000002;000003U;000004;000005U;000006U'
SELECT STUFF((SELECT '; '+Data
FROM
(
SELECT Split.a.value('.','nvarchar(1000)') AS Data
FROM
(
SELECT
CAST('<S>'+REPLACE(Data,';','</S><S>') +'</S>' AS XML ) AS Data
FROM #Table
)AS A
CROSS APPLY Data.nodes('S') AS Split(a)
)dt
WHERE CHARINDEX('U',Data)=0 FOR XML PATH('')),1,1,'') AS Data
Result
Data
---------
000002; 000004
As mentioned in the comments, SQL Server does not have any native regex replacement support. But, if you can get a dump of your entire table/column, then you can easily do a regex replacement in another tool, such as Notepad++.
Do a find on this pattern:
[0-9]+U;?
And then just replace with empty string. This should leave each row with the data you want to see. Here is a demo showing that this works in Java.
Demo
for SQL Server 2016 and later.
select stuff (
(select ',' + value
from STRING_SPLIT ('000001U;000002;000003U;000004;000005U;000006U', ';')
where right(value, 1) <> 'U'
for xml path('')),
1, 1, '')
for earlier version, you may use any CSV Spliter like this from Jeff Moden http://www.sqlservercentral.com/articles/Tally+Table/72993/
Simple way is to determine the value by IsNumeric function.
DECLARE #GIVEN VARCHAR(MAX)='000001U;000002;000003U;000004;000005U;000006U';
DECLARE #FINAL VARCHAR(MAX)='';
SELECT #FINAL =#FINAL+ case when ISNUMERIC(val)=1 then val+';' else '' end FROM (
SELECT split.x.value('.','varchar(max)') VAL FROM(
SELECT CAST('<M>'+REPLACE(#GIVEN,';','</M><M>')+'</M>' AS XML) AS VAL
)A
CROSS APPLY a.VAL.nodes('/M') as split(x)
)AA
PRINT #FINAL
Result: 000002;000004;

Concatenate the result of an ordered String_Split in a variable

In a SqlServer database I use, the database name is something like StackExchange.Audio.Meta, or StackExchange.Audio or StackOverflow . By sheer luck this is also the url for a website. I only need split it on the dots and reverse it: meta.audio.stackexchange. Adding http:// and .com and I'm done. Obviously Stackoverflow doesn't need any reversing.
Using the SqlServer 2016 string_split function I can easy split and reorder its result:
select value
from string_split(db_name(),'.')
order by row_number() over( order by (select 1)) desc
This gives me
| Value |
-----------------
| Meta |
| Audio |
| StackExchange |
As I need to have the url in a variable I hoped to concatenate it using this answer so my attempt looks like this:
declare #revname nvarchar(150)
select #revname = coalesce(#revname +'.','') + value
from string_split(db_name(),'.')
order by row_number() over( order by (select 1)) desc
However this only returns me the last value, StackExchange. I already noticed the warnings on that answer that this trick only works for certain execution plans as explained here.
The problem seems to be caused by the order by clause. Without that I get all values, but then in the wrong order. I tried to a add ltrimand rtrim function as suggested in the Microsoft article as well as a subquery but so far without luck.
Is there a way I can nudge the Sql Server 2016 Query Engine to concatenate the ordered result from that string_split in a variable?
I do know I can use for XML or even a plain cursor to get the result I need but I don't want to give up this elegant solution yet.
As I'm running this on the Stack Exchange Data Explorer I can't use functions, as we lack the permission to create those. I can do Stored procedures but I hoped I could evade those.
I prepared a SEDE Query to experiment with. The database names to expect are either without dots, aka StackOverflow, with 1 dot: StackOverflow.Meta or 2 dots, `StackExchange.Audio.Meta, the full list of databases is here
I think you are over-complicating things. You could use PARSENAME:
SELECT 'http://' + PARSENAME(db_name(),1) +
ISNULL('.' + PARSENAME(db_name(),2),'') + ISNULL('.'+PARSENAME(db_name(),3),'')
+ '.com'
This is exactly why I have the Presentation Sequence (PS) in my split function. People often scoff at using a UDF for such items, but it is generally a one-time hit to parse something for later consumption.
Select * from [dbo].[udf-Str-Parse]('meta.audio.stackexchange','.')
Returns
Key_PS Key_Value
1 meta
2 audio
3 stackexchange
The UDF
CREATE FUNCTION [dbo].[udf-Str-Parse] (#String varchar(max),#delimeter varchar(10))
--Usage: Select * from [dbo].[udf-Str-Parse]('meta.audio.stackexchange','.')
-- Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
-- Select * from [dbo].[udf-Str-Parse]('id26,id46|id658,id967','|')
Returns #ReturnTable Table (Key_PS int IDENTITY(1,1) NOT NULL , Key_Value varchar(max))
As
Begin
Declare #intPos int,#SubStr varchar(max)
Set #IntPos = CharIndex(#delimeter, #String)
Set #String = Replace(#String,#delimeter+#delimeter,#delimeter)
While #IntPos > 0
Begin
Set #SubStr = Substring(#String, 0, #IntPos)
Insert into #ReturnTable (Key_Value) values (#SubStr)
Set #String = Replace(#String, #SubStr + #delimeter, '')
Set #IntPos = CharIndex(#delimeter, #String)
End
Insert into #ReturnTable (Key_Value) values (#String)
Return
End
Probably less elegant solution but it takes only a few lines and works with any number of dots.
;with cte as (--build xml
select 1 num, cast('<str><s>'+replace(db_name(),'.','</s><s>')+'</s></str>' as xml) str
)
,x as (--make table from xml
select row_number() over(order by num) rn, --add numbers to sort later
t.v.value('.[1]','varchar(50)') s
from cte cross apply cte.str.nodes('str/s') t(v)
)
--combine into string
select STUFF((SELECT '.' + s AS [text()]
FROM x
order by rn desc --in reverse order
FOR XML PATH('')
), 1, 1, '' ) name
Is there a way I can nudge the Sql Server 2016 Query Engine to concatenate the ordered result from that string_split in a variable?
You can just use CONCAT:
DECLARE #URL NVARCHAR(MAX)
SELECT #URL = CONCAT(value, '.', #URL) FROM STRING_SPLIT(DB_NAME(), '.')
SET #URL = CONCAT('http://', LOWER(#URL), 'com');
The reversal is accomplished by the order of parameters to CONCAT. Here's an example.
It changes StackExchange.Garage.Meta to http://meta.garage.stackexchange.com.
This can be used to split and reverse strings in general, but note that it does leave a trailing delimiter. I'm sure you could add some logic or a COALESCE in there to make that not happen.
Also note that vNext will be adding STRING_AGG.
To answer the 'X' of this XY problem, and to address the HTTPS switch (especially for Meta sites) and some other site name changes, I've written the following SEDE query which outputs all site names in the format used on the network site list.
SELECT name,
LOWER('https://' +
IIF(PATINDEX('%.Mathoverflow%', name) > 0,
IIF(PATINDEX('%.Meta', name) > 0, 'meta.mathoverflow.net', 'mathoverflow.net'),
IIF(PATINDEX('%.Ubuntu%', name) > 0,
IIF(PATINDEX('%.Meta', name) > 0, 'meta.askubuntu.com', 'askubuntu.com'),
IIF(PATINDEX('StackExchange.%', name) > 0,
CASE SUBSTRING(name, 15, 200)
WHEN 'Audio' THEN 'video'
WHEN 'Audio.Meta' THEN 'video.meta'
WHEN 'Beer' THEN 'alcohol'
WHEN 'Beer.Meta' THEN 'alcohol.meta'
WHEN 'CogSci' THEN 'psychology'
WHEN 'CogSci.Meta' THEN 'psychology.meta'
WHEN 'Garage' THEN 'mechanics'
WHEN 'Garage.Meta' THEN 'mechanics.meta'
WHEN 'Health' THEN 'medicalsciences'
WHEN 'Health.Meta' THEN 'medicalsciences.meta'
WHEN 'Moderators' THEN 'communitybuilding'
WHEN 'Moderators.Meta' THEN 'communitybuilding.meta'
WHEN 'Photography' THEN 'photo'
WHEN 'Photography.Meta' THEN 'photo.meta'
WHEN 'Programmers' THEN 'softwareengineering'
WHEN 'Programmers.Meta' THEN 'softwareengineering.meta'
WHEN 'Vegetarian' THEN 'vegetarianism'
WHEN 'Vegetarian.Meta' THEN 'vegetarianism.meta'
WHEN 'Writers' THEN 'writing'
WHEN 'Writers.Meta' THEN 'writing.meta'
ELSE SUBSTRING(name, 15, 200)
END + '.stackexchange.com',
IIF(PATINDEX('StackOverflow.%', name) > 0,
CASE SUBSTRING(name, 15, 200)
WHEN 'Br' THEN 'pt'
WHEN 'Br.Meta' THEN 'pt.meta'
ELSE SUBSTRING(name, 15, 200)
END + '.stackoverflow.com',
IIF(PATINDEX('%.Meta', name) > 0,
'meta.' + SUBSTRING(name, 0, PATINDEX('%.Meta', name)) + '.com',
name + '.com'
)
)
)
)
) + '/'
)
FROM sys.databases WHERE database_id > 5

SQL Server split CSV into multiple rows

I realize this question has been asked before, but I can't get it to work for some reason.
I'm using the split function from this SQL Team thread (second post) and the following queries.
--This query converts the interests field from text to varchar
select
cp.id
,cast(cp.interests as varchar(100)) as interests
into #client_profile_temp
from
client_profile cp
--This query is supposed to split the csv ("Golf","food") into multiple rows
select
cpt.id
,split.data
from
#client_profile_temp cpt
cross apply dbo.split(
cpt.interests, ',') as split <--Error is on this line
However I'm getting an
Incorrect syntax near '.'
error where I've marked above.
In the end, I want
ID INTERESTS
000CT00002UA "Golf","food"
to be
ID INTERESTS
000CT00002UA "Golf"
000CT00002UA "food"
I'm using SQL Server 2008 and basing my answer on this StackOverflow question. I'm fairly new to SQL so any other words of wisdom would be appreciated as well.
TABLE
x-----------------x--------------------x
| ID | INTERESTS |
x-----------------x--------------------x
| 000CT00002UA | Golf,food |
| 000CT12303CB | Cricket,Bat |
x------x----------x--------------------x
METHOD 1 : Using XML format
SELECT ID,Split.a.value('.', 'VARCHAR(100)') 'INTERESTS'
FROM
(
-- To change ',' to any other delimeter, just change ',' before '</M><M>' to your desired one
SELECT ID, CAST ('<M>' + REPLACE(INTERESTS, ',', '</M><M>') + '</M>' AS XML) AS Data
FROM TEMP
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
SQL FIDDLE
METHOD 2 : Using function dbo.Split
SELECT a.ID, b.items
FROM #TEMP a
CROSS APPLY dbo.Split(a.INTERESTS, ',') b
SQL FIDDLE
And dbo.Split function is here.
CREATE FUNCTION [dbo].[Split](#String varchar(8000), #Delimiter char(1))
returns #temptable TABLE (items varchar(8000))
as
begin
declare #idx int
declare #slice varchar(8000)
select #idx = 1
if len(#String)<1 or #String is null return
while #idx!= 0
begin
set #idx = charindex(#Delimiter,#String)
if #idx!=0
set #slice = left(#String,#idx - 1)
else
set #slice = #String
if(len(#slice)>0)
insert into #temptable(Items) values(#slice)
set #String = right(#String,len(#String) - #idx)
if len(#String) = 0 break
end
return
end
FINAL RESULT
from
#client_profile_temp cpt
cross apply dbo.split(
#client_profile_temp.interests, ',') as split <--Error is on this line
I think the explicit naming of #client_profile_temp after you gave it an alias is a problem, try making that last line:
cpt.interests, ',') as split <--Error is on this line
EDIT You say
I made this change and it didn't change anything
Try pasting the code below (into a new SSMS window)
create table #client_profile_temp
(id int,
interests varchar(500))
insert into #client_profile_temp
values
(5, 'Vodka,Potassium,Trigo'),
(6, 'Mazda,Boeing,Alcoa')
select
cpt.id
,split.data
from
#client_profile_temp cpt
cross apply dbo.split(cpt.interests, ',') as split
See if it works as you expect; I'm using sql server 2008 and that works for me to get the kind of results I think you want.
Any chance when you say "I made the change", you just changed a stored procedure but haven't run it, or changed a script that creates a stored procedure, and haven't run that, something along those lines? As I say, it seems to work for me.
As this is old, it seems the following works in SQL Azure (as of 3/2022)
The big changes being split.value instead of .data or .items as shown above; no as after the function, and lastly string_split is the method.
select Id, split.value
from #reportTmp03 rpt
cross apply string_split(SelectedProductIds, ',') split
Try this:
--This query is supposed to split the csv ("Golf","food") into multiple rows
select
cpt.id
,split.data
from
#client_profile_temp cpt
cross apply dbo.split(cpt.interests, ',') as split <--Error is on this line
You must use table alias instead of table name as soon as you define it.

Resources