I'd like to split comma-delimited strings in SQL Server 2012. I'm interested in an XML solution, not a function or WHILE loop (for performance and permissions reasons). I read the post "STRING_SPLIT in SQL Server 2012", which was helpful; however, my case involves splitting a column in a table rather than a variable. Below is an example of the kind of dataset I'm working with:
CREATE TABLE #EXAMPLE
(
ID INT,
LIST VARCHAR(1000)
)
INSERT INTO #EXAMPLE
VALUES (1, '12345,54321'), (2, '48965'), (3, '98765,45678,15935'), (4, '75315')
SELECT * FROM #EXAMPLE
DROP TABLE #EXAMPLE
Given that dataset, how could I go about splitting the LIST field on the comma so that I get this data set?
CREATE TABLE #EXAMPLE
(
ID INT,
LIST VARCHAR(1000)
)
INSERT INTO #EXAMPLE
VALUES (1, '12345'), (1, '54321'), (2, '48965'), (3, '98765'), (3, '45678'), (3, '15935'), (4, '75315')
SELECT * FROM #EXAMPLE
DROP TABLE #EXAMPLE
I feel like I'm blanking on implementing this with a table column as opposed to a variable, but I'm sure it's pretty similar. I'd be greatly appreciative of any input. Thanks!
If you want an XML solution, the following should suffice.
Note: this is easily wrapped in a reusable table-valued function; however, since you state that you don't want a function, it is used inline here.
select e.id, s.List
from #example e
cross apply (
select List = y.i.value('(./text())[1]', 'varchar(max)')
from (
select x = convert(xml, '<i>' + replace(e.list, ',', '</i><i>') + '</i>').query('.')
) as a cross apply x.nodes('i') as y(i)
)s
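For readers who are allowed to create functions, the wrapping mentioned in the note could look roughly like the sketch below. This is only an illustration: the function name dbo.SplitListXml and its parameter are assumptions, not part of the original answer, and like the inline query it will fail on values containing XML special characters such as & or <.

```sql
-- Hypothetical helper; the name dbo.SplitListXml is an assumption, not from the answer.
CREATE FUNCTION dbo.SplitListXml (@list VARCHAR(1000))
RETURNS TABLE
AS
RETURN
    -- Same technique as the inline query: turn commas into element boundaries,
    -- convert to XML, then shred the <i> nodes into rows.
    SELECT Item = y.i.value('(./text())[1]', 'varchar(max)')
    FROM (
        SELECT x = CONVERT(xml, '<i>' + REPLACE(@list, ',', '</i><i>') + '</i>')
    ) AS a
    CROSS APPLY x.nodes('i') AS y(i);
GO
-- Usage against the #EXAMPLE table from the question.
SELECT e.ID, s.Item
FROM #EXAMPLE AS e
CROSS APPLY dbo.SplitListXml(e.LIST) AS s;
```

Because it is an inline table-valued function, the optimizer can expand it into the calling query, so it should behave much like the inline version.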
Taking into account your link, this can be done by slightly changing the query by adding Cross Apply.
Select e.ID, t.a
From #Example As e Cross Apply (
SELECT Split.a.value('.', 'NVARCHAR(MAX)') DATA
FROM
(
SELECT CAST('<X>'+REPLACE(e.List, ',', '</X><X>')+'</X>' AS XML) AS String
) AS A
CROSS APPLY String.nodes('/X') AS Split(a)) As t(a)
As @Charlieface already mentioned, there is a risk of running into XML entities: ampersands and the like.
That's why I always use a CDATA section for safety.
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT, LIST VARCHAR(1000));
INSERT INTO @tbl VALUES
(1, '12345,<54321'),
(2, '48965'),
(3, '98765,45678,15935'),
(4, '75315');
-- DDL and sample data population, end
SELECT e.id, s.List
FROM @tbl e
CROSS APPLY (
SELECT List = y.i.value('(./text())[1]', 'VARCHAR(MAX)')
FROM (
SELECT x = TRY_CAST('<i><![CDATA[' + REPLACE(e.list, ',', ']]></i><i><![CDATA[') + ']]></i>' AS XML)
) AS a CROSS APPLY x.nodes('i') as y(i)
) AS s;
Output
+----+--------+
| id | List |
+----+--------+
| 1 | 12345 |
| 1 | <54321 |
| 2 | 48965 |
| 3 | 98765 |
| 3 | 45678 |
| 3 | 15935 |
| 4 | 75315 |
+----+--------+
Related
I have a table like this
CREATE TABLE Table1
([range] varchar(9), [sector] int)
;
INSERT INTO Table1
([range], [sector])
VALUES
('684-733', 2),
('563-598', 3),
('514-544', 2),
('640-682', 3),
('1053-1152', 2)
;
I want to look up the sector by passing a value that falls within one of the ranges.
So far I have this:
select sector from table1 where [range] = 564
expected outcome
3
Is there any function I can use to get the data?
Please try the following solution.
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, [range] varchar(9), [sector] int);
INSERT INTO @tbl ([range], [sector]) VALUES
('684-733', 2),
('563-598', 3),
('514-544', 2),
('640-682', 3),
('1053-1152', 2);
-- DDL and sample data population, end
DECLARE @param INT = 564;
;WITH rs AS
(
SELECT *
, LEFT([range], pos -1) AS [start]
, RIGHT([range], LEN([range]) - pos) AS [end]
FROM @tbl
CROSS APPLY (SELECT CHARINDEX('-', [range])) AS t(pos)
)
SELECT sector
FROM rs
WHERE @param BETWEEN [start] AND [end];
Output
+--------+
| sector |
+--------+
| 3 |
+--------+
Storing a number as a string like this could cause many issues later. Your best move is refactoring the table as follows:
CREATE TABLE Table1
([beginrange] int, [endrange] int, [sector] int)
Insert into table1 values
(684, 733, 2)
...
You would then get your result with:
select sector from table1 where 564 between [beginrange] and [endrange]
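If you do go the refactoring route, the existing rows can be carried over by parsing each range once. The following is a minimal sketch, assuming the original table still has its [range] column and that every value has the form start-end; the name Table1_New is made up for illustration.

```sql
-- Hypothetical target table; Table1_New is an assumed name.
CREATE TABLE Table1_New ([beginrange] int, [endrange] int, [sector] int);

-- Parse each 'start-end' string once and store both ends as integers.
INSERT INTO Table1_New ([beginrange], [endrange], [sector])
SELECT CAST(LEFT(t.[range], p.pos - 1) AS int)
     , CAST(SUBSTRING(t.[range], p.pos + 1, LEN(t.[range])) AS int)
     , t.[sector]
FROM Table1 AS t
CROSS APPLY (SELECT CHARINDEX('-', t.[range])) AS p(pos);

-- The lookup then needs no string handling at all.
SELECT sector FROM Table1_New WHERE 564 BETWEEN [beginrange] AND [endrange];
```

With integer columns the lookup can also use an index on the range bounds, which the string version cannot.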
That said, if you do not have control over this table, you'll need to parse the string into two integers:
select * from table1 where 580 between convert(int, substring([range], 0, charindex('-', [range]))) and convert(int, substring([range], charindex('-', [range]) + 1, len([range])))
You can look up the various functions used here.
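In SQL Server 2016 and later, you can use STRING_SPLIT. Group by the range as well as the sector, and compare the parts as integers, so that each range is tested on its own:
select t1.sector
from table1 t1
cross apply string_split(t1.[range], '-') y
group by t1.[range], t1.sector
having 564 between min(cast(y.value as int)) and max(cast(y.value as int))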
Use a CROSS APPLY and a CASE expression to find the range where 564 falls between the start and end values:
select
tbl1.range,
case when 564 between Lookup.startValue and Lookup.endValue then
tbl1.sector
end Sector
from #tbl tbl1
cross apply
(
select
tbl2.range,
tbl2.sector,
cast(left(tbl2.range,charindex('-',tbl2.range)-1) as int) startValue,
cast(right(tbl2.range,len(tbl2.range)-charindex('-',tbl2.range)) as int) endValue
from #tbl tbl2
where tbl1.range=tbl2.range
)Lookup
where
case when 564 between Lookup.startValue and Lookup.endValue then
tbl1.sector
end is not null
output:
range Sector
563-598 3
I have data in the format below. I would like to find the users that match any of the words within the comma-delimited skills column:
Name | id | skills |
-------------------------------------------------------
Bbarker | 5987 | Needles, Pins, Surgery, Word, Excel |
CJerald | 5988 | Bartender, Shots |
RSarah | 5600 | Pins, Ground, Hot, Coffee |
So if I am searching for "Needles, Pins", it should return Bbarker's and RSarah's rows.
How would I achieve something like this using SQL?
I don't even know where to begin or what to search for, so any help in the right direction would be great!
Thanks!
Poor design aside, sometimes we are stuck and have to deal with that poor design.
I agree that if you have the option of redesigning, you should pursue that route. In the meantime, there are ways you can deal with delimited data.
If you are on SQL Server 2016 or later, there is a built-in function called STRING_SPLIT() that can be used.
If you are on a version prior to SQL Server 2016, you basically have to convert to XML as a workaround.
Here's a working example of both that you can explore:
DECLARE @TestData TABLE
(
[Name] NVARCHAR(100)
, [Id] INT
, [skills] NVARCHAR(100)
);
--Test data
INSERT INTO @TestData (
[Name]
, [Id]
, [skills]
)
VALUES ( 'Bbarker', 5987, 'Needles, Pins, Surgery, Word, Excel' )
, ( 'CJerald', 5988, 'Bartender, Shots' )
, ( 'RSarah', 5600, 'Pins, Ground, Hot, Coffee' );
--search words
DECLARE @Search NVARCHAR(100) = 'Needles, Pins';
--sql server 2016+ using STRING_SPLIT
SELECT DISTINCT [a].*
FROM @TestData [a]
CROSS APPLY STRING_SPLIT([a].[skills], ',') [sk] --split your column
CROSS APPLY STRING_SPLIT(@Search, ',') [srch] --split your search
WHERE LTRIM(RTRIM([sk].[value])) = LTRIM(RTRIM([srch].[value])); --filter where they equal
--Prior to sql server 2016, convert XML
SELECT DISTINCT [td].*
FROM @TestData [td]
--below we are converting to xml and then splitting those out for your column
CROSS APPLY (
SELECT [Split].[a].[value]('.', 'NVARCHAR(MAX)') [value]
FROM (
SELECT CAST('<X>' + REPLACE([td].[skills], ',', '</X><X>') + '</X>' AS XML) AS [String]
) AS [A]
CROSS APPLY [String].[nodes]('/X') AS [Split]([a])
) AS [sk]
--same here for the search
CROSS APPLY (
SELECT [Split].[a].[value]('.', 'NVARCHAR(MAX)') [value]
FROM (
SELECT CAST('<X>' + REPLACE(@Search, ',', '</X><X>') + '</X>' AS XML) AS [String]
) AS [A]
CROSS APPLY [String].[nodes]('/X') AS [Split]([a])
) AS [srch]
WHERE LTRIM(RTRIM([sk].[value])) = LTRIM(RTRIM([srch].[value])); --then as before where those are equal
Both will get you the output of:
Name Id skills
---------- ------- ------------------------------------
Bbarker 5987 Needles, Pins, Surgery, Word, Excel
RSarah 5600 Pins, Ground, Hot, Coffee
How about this?
SELECT DISTINCT Name, id
FROM table
WHERE skills LIKE '%Needles%'
OR skills LIKE '%Pins%'
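One caveat with the plain LIKE approach is that '%Pins%' also matches any skill that merely contains that text (a hypothetical 'Pinsetter', for example). A possible refinement, sketched below, is to wrap both the list and the search term in delimiters so that only whole entries match. It assumes the entries are always separated by a comma followed by a single space, as in the sample data; the table variable simply reproduces that sample.

```sql
-- Sample data copied from the question so the sketch is self-contained.
DECLARE @TestData TABLE ([Name] NVARCHAR(100), [Id] INT, [skills] NVARCHAR(100));
INSERT INTO @TestData ([Name], [Id], [skills])
VALUES (N'Bbarker', 5987, N'Needles, Pins, Surgery, Word, Excel')
     , (N'CJerald', 5988, N'Bartender, Shots')
     , (N'RSarah', 5600, N'Pins, Ground, Hot, Coffee');

-- Normalise 'a, b, c' to ',a,b,c,' so every entry is surrounded by commas,
-- then search for ',term,' instead of a bare substring.
SELECT t.[Name], t.[Id]
FROM @TestData AS t
CROSS APPLY (SELECT ',' + REPLACE(t.[skills], ', ', ',') + ',') AS n(normalized)
WHERE n.normalized LIKE '%,Needles,%'
   OR n.normalized LIKE '%,Pins,%';
```

For the sample data this should return the Bbarker and RSarah rows. Like any LIKE with a leading wildcard, it cannot use an index seek, so it is best suited to small tables.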
I have multiple strings in a column, and from each one I need to get the segment that comes just before the last hyphen.
Below are four examples. The number of hyphens in a string can vary, but the desired result is always the segment immediately before the last hyphen.
1. abc-def-Opto
2. abc-def-ijk-5C-hello-Opto
3. abc-def-ijk-4C-hi-Build
4. abc-def-ijk-4C-123-suppymanagement
Desired result set is
def
hello
hi
123
How can I get this result set with a SQL query? I am on MSSQL 2012.
I need a generic query that works for any number of hyphens.
There are many ways to split/parse a string. ParseName() would fail because you may have more than 4 positions.
One option (just for fun), is to use a little XML.
We reverse the string
Convert into XML
Grab the second node
Reverse the desired value for the final presentation
Example
Declare @YourTable Table ([SomeCol] varchar(50))
Insert Into @YourTable Values
('abc-def-Opto')
,('abc-def-ijk-5C-hello-Opto')
,('abc-def-ijk-4C-hi-Build')
,('abc-def-ijk-4C-123-suppymanagement')
Select *
,Value = reverse(convert(xml,'<x>'+replace(reverse(SomeCol),'-','</x><x>')+'</x>').value('x[2]','varchar(150)'))
from @YourTable
Returns
SomeCol Value
abc-def-Opto def
abc-def-ijk-5C-hello-Opto hello
abc-def-ijk-4C-hi-Build hi
abc-def-ijk-4C-123-suppymanagement 123
Without getting into XML, you can do this using only SQL Server's string functions.
Declare @YourTable Table ([SomeCol] varchar(50))
Insert Into @YourTable Values
('abc-def-Opto')
,('abc-def-ijk-5C-hello-Opto')
,('abc-def-ijk-4C-hi-Build')
,('abc-def-ijk-4C-123-suppymanagement');
SELECT *
,RTRIM(LTRIM(REVERSE(
SUBSTRING(
SUBSTRING(REVERSE([SomeCol]) , CHARINDEX('-', REVERSE([SomeCol])) +1 , LEN([SomeCol]) )
, 1 , CHARINDEX('-', SUBSTRING(REVERSE([SomeCol]) , CHARINDEX('-', REVERSE([SomeCol])) +1 , LEN([SomeCol]) ) ) -1
)
)))
FROM @YourTable
I am not sure this script will be exactly what you need, but it should give you an idea of how to split the data:
IF OBJECT_ID('tempdb..#Temp')IS NOT NULL
DROP TABLE #Temp
;WITH CTE(Id,data)
AS
(
SELECT 1,'abc-def-Opto' UNION ALL
SELECT 2,'abc-def-ijk-5C-hello-Opto' UNION ALL
SELECT 3,'abc-def-ijk-4C-hi-Build' UNION ALL
SELECT 4,'abc-def-ijk-4C-123-suppymanagement'
)
,Cte2
AS
(
SELECT Id, CASE WHEN Id=1 AND Setdata=1 THEN data
WHEN Id=2 AND Setdata=2 THEN data
WHEN Id=3 AND Setdata=3 THEN data
WHEN Id=4 AND Setdata=4 THEN data
ELSE NULL
END AS Data
FROM
(
SELECT Id,
Split.a.value('.','nvarchar(1000)') AS Data,
ROW_NUMBER()OVER(PARTITION BY id ORDER BY id) AS Setdata
FROM(
SELECT Id,
CAST('<S>'+REPLACE(data ,'-','</S><S>')+'</S>' AS XML) AS data
FROM CTE
) AS A
CROSS APPLY data.nodes('S') AS Split(a)
)dt
)
SELECT * INTO #Temp FROM Cte2
SELECT STUFF((SELECT DISTINCT ', '+ 'Set_'+CAST(Id AS VARCHAR(10))+':'+Data
FROM #Temp WHERE ISNULL(Data,'')<>'' FOR XML PATH ('')),1,1,'')
Result
Set_1:abc, Set_2:def, Set_3:ijk, Set_4:4C
You can do it like this:
WITH CTE AS
(
SELECT 1 ID,'abc-def-Opto' Str
UNION
SELECT 2, 'abc-def-ijk-5C-hello-Opto'
UNION
SELECT 3, 'abc-def-ijk-4C-hi-Build'
UNION
SELECT 4, 'abc-def-ijk-4C-123-suppymanagement'
)
SELECT ID,
REVERSE(LEFT(REPLACE(P2, P1, ''), CHARINDEX('-', REPLACE(P2, P1, ''))-1)) Result
FROM (
SELECT LEFT(REVERSE(Str), CHARINDEX('-', REVERSE(Str))) P1,
REVERSE(Str) P2,
ID
FROM CTE
) T;
Returns:
+----+--------+
| ID | Result |
+----+--------+
| 1 | def |
| 2 | hello |
| 3 | hi |
| 4 | 123 |
+----+--------+
I have two tables and the values like this
create table InputLocationTable(SKUID int,InputLocations varchar(100),Flag varchar(100))
create table Location(SKUID int,Locations varchar(100))
insert into InputLocationTable(SKUID,InputLocations) values(11,'Loc1, Loc2, Loc3, Loc4, Loc5, Loc6')
insert into InputLocationTable(SKUID,InputLocations) values(12,'Loc1, Loc2')
insert into InputLocationTable(SKUID,InputLocations) values(13,'Loc4,Loc5')
insert into Location(SKUID,Locations) values(11,'Loc3')
insert into Location(SKUID,Locations) values(11,'Loc4')
insert into Location(SKUID,Locations) values(11,'Loc5')
insert into Location(SKUID,Locations) values(11,'Loc7')
insert into Location(SKUID,Locations) values(12,'Loc10')
insert into Location(SKUID,Locations) values(12,'Loc1')
insert into Location(SKUID,Locations) values(12,'Loc5')
insert into Location(SKUID,Locations) values(13,'Loc4')
insert into Location(SKUID,Locations) values(13,'Loc2')
insert into Location(SKUID,Locations) values(13,'Loc2')
I need to match the SKUIDs between the two tables and update the value in the Flag column as shown in the screenshot. I have tried something like this code:
SELECT STUFF((select ','+ Data.C1
FROM
(select
n.r.value('.', 'varchar(50)') AS C1
from InputLocation as T
cross apply (select cast('<r>'+replace(replace(Location,'&','&amp;'), ',', '</r><r>')+'</r>' as xml)) as S(XMLCol)
cross apply S.XMLCol.nodes('r') as n(r)) DATA
WHERE data.C1 NOT IN (SELECT Location
FROM Location) for xml path('')),1,1,'') As Output
However, I am not convinced by the output, and I am also trying to avoid the XML PATH code because I do not think it performs well. I need output like the screenshot below. Any help would be greatly appreciated.
I think you need to first look at why you think the XML approach is not performing well enough for your needs, as it has actually been shown to perform very well for larger input strings.
If you only need to handle input strings of up to either 4000 or 8000 characters (non max nvarchar and varchar types respectively), you can utilise a tally table contained within an inline table valued function which will also perform very well. The version I use can be found at the end of this post.
Utilising this function we can split out the values in your InputLocations column, though we still need to use for xml to concatenate them back together for your desired format:
-- Define data
declare @InputLocationTable table (SKUID int,InputLocations varchar(100),Flag varchar(100));
declare @Location table (SKUID int,Locations varchar(100));
insert into @InputLocationTable(SKUID,InputLocations) values (11,'Loc1, Loc2, Loc3, Loc4, Loc5, Loc6'),(12,'Loc1, Loc2'),(13,'Loc4,Loc5'),(14,'Loc1');
insert into @Location(SKUID,Locations) values (11,'Loc3'),(11,'Loc4'),(11,'Loc5'),(11,'Loc7'),(12,'Loc10'),(12,'Loc1'),(12,'Loc5'),(13,'Loc4'),(13,'Loc2'),(13,'Loc2'),(14,'Loc1');
--Query
-- Derived table splits out the values held within the InputLocations column
with i as
(
select i.SKUID
,i.InputLocations
,s.item as Loc
from @InputLocationTable as i
cross apply dbo.fn_StringSplit4k(replace(i.InputLocations,' ',''),',',null) as s
)
select il.SKUID
,il.InputLocations
,isnull('Add ' -- The split Locations are then matched to those already in @Location and those not present are concatenated together.
+ stuff((select ', ' + i.Loc
from i
left join @Location as l
on i.SKUID = l.SKUID
and i.Loc = l.Locations
where il.SKUID = i.SKUID
and l.SKUID is null
for xml path('')
)
,1,2,''
)
,'No Flag') as Flag
from @InputLocationTable as il
order by il.SKUID;
Output:
+-------+------------------------------------+----------------------+
| SKUID | InputLocations | Flag |
+-------+------------------------------------+----------------------+
| 11 | Loc1, Loc2, Loc3, Loc4, Loc5, Loc6 | Add Loc1, Loc2, Loc6 |
| 12 | Loc1, Loc2 | Add Loc2 |
| 13 | Loc4,Loc5 | Add Loc5 |
| 14 | Loc1 | No Flag |
+-------+------------------------------------+----------------------+
For nvarchar input (I have different functions for varchar and max type input) this is my version of the string splitting function linked above:
create function [dbo].[fn_StringSplit4k]
(
@str nvarchar(4000) = ' ' -- String to split.
,@delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,@num as int = null -- Which value in the list to return. NULL returns all.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
) a
where rn = @num
or @num is null;
go
So my question is as follows:-
I have multiple strings with variable amounts of delimiters, the text between the delimiters can also vary in number:-
fug\klde\hzt\jkljlkjlkjl\hgftb\jghgf\ooorr\ter\fdgd
wegf\df\jght\kfd\dfgert
What I need to do is to cut the string and leave only the following from the examples:-
ooorr\ter\fdgd
jght\kfd\dfgert
so basically the third delimiter from the right side.
I have been able to use RIGHT, CHARINDEX and REVERSE to give me the last part of the string(s) but I am struggling for the rest.
Any help would be appreciated thanks in advance.
Text processing like this is usually better done in the presentation layer, but a naive way to do it in SQL is as follows:
Select Substring(col1, len(col1) - CharIndex('\', reverse(col1), Charindex('\',reverse(col1),charindex('\', reverse(col1),1)+1)+1)+2, len(col1)) from #delimiterdata
Output as below:
+-----------------+
| Output |
+-----------------+
| ooorr\ter\fdgd |
| jght\kfd\dfgert |
+-----------------+
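The nested CHARINDEX calls above are hard to read, so here is the same idea unrolled into steps with CROSS APPLY. This is only a sketch: the table variable @t and the column aliases are invented for illustration, and strings with fewer than three backslashes are returned unchanged, which is an assumption about the desired behaviour.

```sql
-- Hypothetical sample table; @t and col1 stand in for the real source.
DECLARE @t TABLE (col1 varchar(200));
INSERT INTO @t VALUES
 ('fug\klde\hzt\jkljlkjlkjl\hgftb\jghgf\ooorr\ter\fdgd')
,('wegf\df\jght\kfd\dfgert');

SELECT t.col1
     -- p3.pos is the position of the third backslash counted from the right,
     -- so everything after it is the last (p3.pos - 1) characters.
     , Result = CASE WHEN p1.pos > 0 AND p2.pos > 0 AND p3.pos > 0
                     THEN RIGHT(t.col1, p3.pos - 1)
                     ELSE t.col1
                END
FROM @t AS t
CROSS APPLY (SELECT REVERSE(t.col1)) AS r(rev)                      -- search from the right
CROSS APPLY (SELECT CHARINDEX('\', r.rev)) AS p1(pos)               -- 1st delimiter from the right
CROSS APPLY (SELECT CHARINDEX('\', r.rev, p1.pos + 1)) AS p2(pos)   -- 2nd
CROSS APPLY (SELECT CHARINDEX('\', r.rev, p2.pos + 1)) AS p3(pos);  -- 3rd
```

Each CROSS APPLY names one intermediate position, so a fourth or fifth delimiter can be handled by adding another step rather than another level of nesting.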
Another option is to use a little XML in concert with a Cross Apply.
Example
Declare @YourTable table (ID int,SomeCol varchar(max))
Insert Into @YourTable values
(1,'fug\klde\hzt\jkljlkjlkjl\hgftb\jghgf\ooorr\ter\fdgd')
,(2,'wegf\df\jght\kfd\dfgert')
,(3,'kfd\dfgert')
Select A.ID
,NewValue = reverse(Concat(Pos1,'\'+Pos2,'\'+Pos3))
From @YourTable A
Cross Apply (
Select Pos1 = n.value('/x[1]','varchar(max)')
,Pos2 = n.value('/x[2]','varchar(max)')
,Pos3 = n.value('/x[3]','varchar(max)')
From (Select Cast('<x>' + replace((Select replace(reverse(A.SomeCol),'\','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) n) X
) B
Returns
ID NewValue
1 ooorr\ter\fdgd
2 jght\kfd\dfgert
3 kfd\dfgert --<<Note: Doesn't have 3 but will produce the last two
Using NGrams8K you could do this:
-- sample data
declare @yourtable table (someId int identity, someString varchar(1000));
insert @yourtable
values ('fug\klde\hzt\jkljlkjlkjl\hgftb\jghgf\ooorr\ter\fdgd'),('wegf\df\jght\kfd\dfgert');
with stringPrep AS
(
select
someId,
someString,
dPos = ROW_NUMBER() OVER (partition by t.someid order by ng.position desc),
position
from @yourtable t
cross apply dbo.NGrams8k(t.someString, 1) ng
where token = '\'
)
select
someId,
someString,
newString = substring(someString, position+1, 1000)
from stringPrep
where dpos = 3;
Results
someId someString newString
------- ------------------------------------------------------------ -----------------
1 fug\klde\hzt\jkljlkjlkjl\hgftb\jghgf\ooorr\ter\fdgd ooorr\ter\fdgd
2 wegf\df\jght\kfd\dfgert jght\kfd\dfgert