Related
I have a SQL Server 2008 R2 column containing a string which I need to split by a comma. I have seen many answers on StackOverflow but none of them works in R2. I have made sure I have select permissions on any split function examples. Any help greatly appreciated.
I've used this SQL before which may work for you:-
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE CHARINDEX(',', #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
and to use it:-
SELECT * FROM dbo.splitstring('91,12,65,78,56,789')
Instead of recursive CTEs and while loops, has anyone considered a more set-based approach? Note that this function was written for the question, which was based on SQL Server 2008 and comma as the delimiter. In SQL Server 2016 and above (and in compatibility level 130 and above), STRING_SPLIT() is a better option.
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE
AS
RETURN ( SELECT [Value] FROM
(
SELECT [Value] = LTRIM(RTRIM(SUBSTRING(#List, [Number],
CHARINDEX(#Delim, #List + #Delim, [Number]) - [Number])))
FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
FROM sys.all_columns) AS x WHERE Number <= LEN(#List)
AND SUBSTRING(#Delim + #List, [Number], DATALENGTH(#Delim)/2) = #Delim
) AS y
);
GO
If you want to avoid the limitation of the length of the string being <= the number of rows in sys.all_columns (9,980 in model in SQL Server 2017; much higher in your own user databases), you can use other approaches for deriving the numbers, such as building your own table of numbers. You could also use a recursive CTE in cases where you can't use system tables or create your own:
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN ( WITH n(n) AS (SELECT 1 UNION ALL SELECT n+1
FROM n WHERE n <= LEN(#List))
SELECT [Value] = SUBSTRING(#List, n,
CHARINDEX(#Delim, #List + #Delim, n) - n)
FROM n WHERE n <= LEN(#List)
AND SUBSTRING(#Delim + #List, n, DATALENGTH(#Delim)/2) = #Delim
);
GO
But you'll have to append OPTION (MAXRECURSION 0) (or MAXRECURSION <longest possible string length if < 32768>) to the outer query in order to avoid errors with recursion for strings > 100 characters. If that is also not a good alternative then see this answer as pointed out in the comments, or this answer if you need an ordered split string function.
(Also, the delimiter will have to be NCHAR(<=1228). Still researching why.)
More on split functions, why (and proof that) while loops and recursive CTEs don't scale, and better alternatives, if you're splitting strings coming from the application layer:
Splitting strings
Finally the wait is over in SQL Server 2016 they have introduced Split string function : STRING_SPLIT
select * From STRING_SPLIT ('a,b', ',') cs
All the other methods to split string like XML, Tally table, while loop, etc.. has been blown away by this STRING_SPLIT function.
Here is an excellent article with performance comparison : Performance Surprises and Assumptions : STRING_SPLIT
The easiest way to do this is by using XML format.
1. Converting string to rows without table
QUERY
DECLARE #String varchar(100) = 'String1,String2,String3'
-- To change ',' to any other delimeter, just change ',' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT CAST ('<M>' + REPLACE(#String, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x---------x
| Value |
x---------x
| String1 |
| String2 |
| String3 |
x---------x
2. Converting to rows from a table which have an ID for each CSV row
SOURCE TABLE
x-----x--------------------------x
| Id | Value |
x-----x--------------------------x
| 1 | String1,String2,String3 |
| 2 | String4,String5,String6 |
x-----x--------------------------x
QUERY
-- To change ',' to any other delimeter, just change ',' before '</M><M>' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT ID,LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT ID,CAST ('<M>' + REPLACE(VALUE, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
FROM TABLENAME
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x-----x----------x
| Id | Value |
x-----x----------x
| 1 | String1 |
| 1 | String2 |
| 1 | String3 |
| 2 | String4 |
| 2 | String5 |
| 2 | String6 |
x-----x----------x
I needed a quick way to get rid of the +4 from a zip code.
UPDATE #Emails
SET ZIPCode = SUBSTRING(ZIPCode, 1, (CHARINDEX('-', ZIPCODE)-1))
WHERE ZIPCode LIKE '%-%'
No proc... no UDF... just one tight little inline command that does what it must. Not fancy, not elegant.
Change the delimiter as needed, etc, and it will work for anything.
if you replace
WHILE CHARINDEX(',', #stringToSplit) > 0
with
WHILE LEN(#stringToSplit) > 0
you can eliminate that last insert after the while loop!
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
if #pos = 0
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
The often used approach with XML elements breaks in case of forbidden characters. This is an approach to use this method with any kind of character, even with the semicolon as delimiter.
The trick is, first to use SELECT SomeString AS [*] FOR XML PATH('') to get all forbidden characters properly escaped. That's the reason, why I replace the delimiter to a magic value to avoid troubles with ; as delimiter.
DECLARE #Dummy TABLE (ID INT, SomeTextToSplit NVARCHAR(MAX))
INSERT INTO #Dummy VALUES
(1,N'A&B;C;D;E, F')
,(2,N'"C" & ''D'';<C>;D;E, F');
DECLARE #Delimiter NVARCHAR(10)=';'; --special effort needed (due to entities coding with "&code;")!
WITH Casted AS
(
SELECT *
,CAST(N'<x>' + REPLACE((SELECT REPLACE(SomeTextToSplit,#Delimiter,N'§§Split$me$here§§') AS [*] FOR XML PATH('')),N'§§Split$me$here§§',N'</x><x>') + N'</x>' AS XML) AS SplitMe
FROM #Dummy
)
SELECT Casted.ID
,x.value(N'.',N'nvarchar(max)') AS Part
FROM Casted
CROSS APPLY SplitMe.nodes(N'/x') AS A(x)
The result
ID Part
1 A&B
1 C
1 D
1 E, F
2 "C" & 'D'
2 <C>
2 D
2 E, F
All the functions for string splitting that use some kind of Loop-ing (iterations) have bad performance. They should be replaced with set-based solution.
This code executes excellent.
CREATE FUNCTION dbo.SplitStrings
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(#List, #Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
I had to write something like this recently. Here's the solution I came up with. It's generalized for any delimiter string and I think it would perform slightly better:
CREATE FUNCTION [dbo].[SplitString]
( #string nvarchar(4000)
, #delim nvarchar(100) )
RETURNS
#result TABLE
( [Value] nvarchar(4000) NOT NULL
, [Index] int NOT NULL )
AS
BEGIN
DECLARE #str nvarchar(4000)
, #pos int
, #prv int = 1
SELECT #pos = CHARINDEX(#delim, #string)
WHILE #pos > 0
BEGIN
SELECT #str = SUBSTRING(#string, #prv, #pos - #prv)
INSERT INTO #result SELECT #str, #prv
SELECT #prv = #pos + LEN(#delim)
, #pos = CHARINDEX(#delim, #string, #pos + 1)
END
INSERT INTO #result SELECT SUBSTRING(#string, #prv, 4000), #prv
RETURN
END
If you need a quick ad-hoc solution for common cases with minimum code, then this recursive CTE two-liner will do it:
DECLARE #s VARCHAR(200) = ',1,2,,3,,,4,,,,5,'
;WITH
a AS (SELECT i=-1, j=0 UNION ALL SELECT j, CHARINDEX(',', #s, j + 1) FROM a WHERE j > i),
b AS (SELECT SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b
Either use this as a stand-alone statement or just add the above CTEs to any of your queries and you will be able to join the resulting table b with others for use in any further expressions.
edit (by Shnugo)
If you add a counter, you will get a position index together with the List:
DECLARE #s VARCHAR(200) = '1,2333,344,4'
;WITH
a AS (SELECT n=0, i=-1, j=0 UNION ALL SELECT n+1, j, CHARINDEX(',', #s, j+1) FROM a WHERE j > i),
b AS (SELECT n, SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b;
The result:
n s
1 1
2 2333
3 344
4 4
I take the xml route by wrapping the values into elements (M but anything works):
declare #v nvarchar(max) = '100,201,abcde'
select
a.value('.', 'varchar(max)')
from
(select cast('<M>' + REPLACE(#v, ',', '</M><M>') + '</M>' AS XML) as col) as A
CROSS APPLY A.col.nodes ('/M') AS Split(a)
A solution using a CTE, if anyone should need that (apart from me, who obviously did, that is why I wrote it).
declare #StringToSplit varchar(100) = 'Test1,Test2,Test3';
declare #SplitChar varchar(10) = ',';
with StringToSplit as (
select
ltrim( rtrim( substring( #StringToSplit, 1, charindex( #SplitChar, #StringToSplit ) - 1 ) ) ) Head
, substring( #StringToSplit, charindex( #SplitChar, #StringToSplit ) + 1, len( #StringToSplit ) ) Tail
union all
select
ltrim( rtrim( substring( Tail, 1, charindex( #SplitChar, Tail ) - 1 ) ) ) Head
, substring( Tail, charindex( #SplitChar, Tail ) + 1, len( Tail ) ) Tail
from StringToSplit
where charindex( #SplitChar, Tail ) > 0
union all
select
ltrim( rtrim( Tail ) ) Head
, '' Tail
from StringToSplit
where charindex( #SplitChar, Tail ) = 0
and len( Tail ) > 0
)
select Head from StringToSplit
This is more narrowly-tailored. When I do this I usually have a comma-delimited list of unique ids (INT or BIGINT), which I want to cast as a table to use as an inner join to another table that has a primary key of INT or BIGINT. I want an in-line table-valued function returned so that I have the most efficient join possible.
Sample usage would be:
DECLARE #IDs VARCHAR(1000);
SET #IDs = ',99,206,124,8967,1,7,3,45234,2,889,987979,';
SELECT me.Value
FROM dbo.MyEnum me
INNER JOIN dbo.GetIntIdsTableFromDelimitedString(#IDs) ids ON me.PrimaryKey = ids.ID
I stole the idea from http://sqlrecords.blogspot.com/2012/11/converting-delimited-list-to-table.html, changing it to be in-line table-valued and cast as INT.
create function dbo.GetIntIDTableFromDelimitedString
(
#IDs VARCHAR(1000) --this parameter must start and end with a comma, eg ',123,456,'
--all items in list must be perfectly formatted or function will error
)
RETURNS TABLE AS
RETURN
SELECT
CAST(SUBSTRING(#IDs,Nums.number + 1,CHARINDEX(',',#IDs,(Nums.number+2)) - Nums.number - 1) AS INT) AS ID
FROM
[master].[dbo].[spt_values] Nums
WHERE Nums.Type = 'P'
AND Nums.number BETWEEN 1 AND DATALENGTH(#IDs)
AND SUBSTRING(#IDs,Nums.number,1) = ','
AND CHARINDEX(',',#IDs,(Nums.number+1)) > Nums.number;
GO
There is a correct version on here but I thought it would be nice to add a little fault tolerance in case they have a trailing comma as well as make it so you could use it not as a function but as part of a larger piece of code. Just in case you're only using it once time and don't need a function. This is also for integers (which is what I needed it for) so you might have to change your data types.
DECLARE #StringToSeperate VARCHAR(10)
SET #StringToSeperate = '1,2,5'
--SELECT #StringToSeperate IDs INTO #Test
DROP TABLE #IDs
CREATE TABLE #IDs (ID int)
DECLARE #CommaSeperatedValue NVARCHAR(255) = ''
DECLARE #Position INT = LEN(#StringToSeperate)
--Add Each Value
WHILE CHARINDEX(',', #StringToSeperate) > 0
BEGIN
SELECT #Position = CHARINDEX(',', #StringToSeperate)
SELECT #CommaSeperatedValue = SUBSTRING(#StringToSeperate, 1, #Position-1)
INSERT INTO #IDs
SELECT #CommaSeperatedValue
SELECT #StringToSeperate = SUBSTRING(#StringToSeperate, #Position+1, LEN(#StringToSeperate)-#Position)
END
--Add Last Value
IF (LEN(LTRIM(RTRIM(#StringToSeperate)))>0)
BEGIN
INSERT INTO #IDs
SELECT SUBSTRING(#StringToSeperate, 1, #Position)
END
SELECT * FROM #IDs
I modified +Andy Robinson's function a little bit. Now you can select only required part from returning table:
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([numOrder] [tinyint] , [Name] [nvarchar] (500)) AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
DECLARE #orderNum INT
SET #orderNum=0
WHILE CHARINDEX('.', #stringToSplit) > 0
BEGIN
SELECT #orderNum=#orderNum+1;
SELECT #pos = CHARINDEX('.', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #orderNum,#name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
SELECT #orderNum=#orderNum+1;
INSERT INTO #returnList
SELECT #orderNum, #stringToSplit
RETURN
END
Usage:
SELECT Name FROM dbo.splitstring('ELIS.YD.CRP1.1.CBA.MDSP.T389.BT') WHERE numOrder=5
Simples
DECLARE #String varchar(100) = '11,21,84,85,87'
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (SELECT value FROM STRING_SPLIT(#String, ','))
-- EQUIVALENTE
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (11,21,84,85,87)
here is a version that can split on a pattern using patindex, a simple adaptation of the post above. I had a case where I needed to split a string that contained multiple separator chars.
alter FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(1000), #splitPattern varchar(10) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE PATINDEX(#splitPattern, #stringToSplit) > 0
BEGIN
SELECT #pos = PATINDEX(#splitPattern, #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
select * from dbo.splitstring('stringa/stringb/x,y,z','%[/,]%');
result looks like this
stringa
stringb
x
y
z
Personnaly I use this function :
ALTER FUNCTION [dbo].[CUST_SplitString]
(
#String NVARCHAR(4000),
#Delimiter NCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
WITH Split(stpos,endpos)
AS(
SELECT 0 AS stpos, CHARINDEX(#Delimiter,#String) AS endpos
UNION ALL
SELECT endpos+1, CHARINDEX(#Delimiter,#String,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT 'Id' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
'Data' = SUBSTRING(#String,stpos,COALESCE(NULLIF(endpos,0),LEN(#String)+1)-stpos)
FROM Split
)
I have developed a double Splitter (Takes two split characters) as requested Here. Could be of some value in this thread seeing its the most referenced for queries relating to string splitting.
CREATE FUNCTION uft_DoubleSplitter
(
-- Add the parameters for the function here
#String VARCHAR(4000),
#Splitter1 CHAR,
#Splitter2 CHAR
)
RETURNS #Result TABLE (Id INT,MId INT,SValue VARCHAR(4000))
AS
BEGIN
DECLARE #FResult TABLE(Id INT IDENTITY(1, 1),
SValue VARCHAR(4000))
DECLARE #SResult TABLE(Id INT IDENTITY(1, 1),
MId INT,
SValue VARCHAR(4000))
SET #String = #String+#Splitter1
WHILE CHARINDEX(#Splitter1, #String) > 0
BEGIN
DECLARE #WorkingString VARCHAR(4000) = NULL
SET #WorkingString = SUBSTRING(#String, 1, CHARINDEX(#Splitter1, #String) - 1)
--Print #workingString
INSERT INTO #FResult
SELECT CASE
WHEN #WorkingString = '' THEN NULL
ELSE #WorkingString
END
SET #String = SUBSTRING(#String, LEN(#WorkingString) + 2, LEN(#String))
END
IF ISNULL(#Splitter2, '') != ''
BEGIN
DECLARE #OStartLoop INT
DECLARE #OEndLoop INT
SELECT #OStartLoop = MIN(Id),
#OEndLoop = MAX(Id)
FROM #FResult
WHILE #OStartLoop <= #OEndLoop
BEGIN
DECLARE #iString VARCHAR(4000)
DECLARE #iMId INT
SELECT #iString = SValue+#Splitter2,
#iMId = Id
FROM #FResult
WHERE Id = #OStartLoop
WHILE CHARINDEX(#Splitter2, #iString) > 0
BEGIN
DECLARE #iWorkingString VARCHAR(4000) = NULL
SET #IWorkingString = SUBSTRING(#iString, 1, CHARINDEX(#Splitter2, #iString) - 1)
INSERT INTO #SResult
SELECT #iMId,
CASE
WHEN #iWorkingString = '' THEN NULL
ELSE #iWorkingString
END
SET #iString = SUBSTRING(#iString, LEN(#iWorkingString) + 2, LEN(#iString))
END
SET #OStartLoop = #OStartLoop + 1
END
INSERT INTO #Result
SELECT MId AS PrimarySplitID,
ROW_NUMBER() OVER (PARTITION BY MId ORDER BY Mid, Id) AS SecondarySplitID ,
SValue
FROM #SResult
END
ELSE
BEGIN
INSERT INTO #Result
SELECT Id AS PrimarySplitID,
NULL AS SecondarySplitID,
SValue
FROM #FResult
END
RETURN
Usage:
--FirstSplit
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&',NULL)
--Second Split
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&','=')
Possible Usage (Get second value of each split):
SELECT fn.SValue
FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===', '&', '=')AS fn
WHERE fn.mid = 2
A recursive cte based solution
declare #T table (iden int identity, col1 varchar(100));
insert into #T(col1) values
('ROOT/South America/Lima/Test/Test2')
, ('ROOT/South America/Peru/Test/Test2')
, ('ROOT//South America/Venuzuala ')
, ('RtT/South America / ')
, ('ROOT/South Americas// ');
declare #split char(1) = '/';
select #split as split;
with cte as
( select t.iden, case when SUBSTRING(REVERSE(rtrim(t.col1)), 1, 1) = #split then LTRIM(RTRIM(t.col1)) else LTRIM(RTRIM(t.col1)) + #split end as col1, 0 as pos , 1 as cnt
from #T t
union all
select t.iden, t.col1 , charindex(#split, t.col1, t.pos + 1), cnt + 1
from cte t
where charindex(#split, t.col1, t.pos + 1) > 0
)
select t1.*, t2.pos, t2.cnt
, ltrim(rtrim(SUBSTRING(t1.col1, t1.pos+1, t2.pos-t1.pos-1))) as bingo
from cte t1
join cte t2
on t2.iden = t1.iden
and t2.cnt = t1.cnt+1
and t2.pos > t1.pos
order by t1.iden, t1.cnt;
With all due respect to #AviG this is the bug free version of function deviced by him to return all the tokens in full.
IF EXISTS (SELECT * FROM sys.objects WHERE type = 'TF' AND name = 'TF_SplitString')
DROP FUNCTION [dbo].[TF_SplitString]
GO
-- =============================================
-- Author: AviG
-- Amendments: Parameterize the delimeter and included the missing chars in last token - Gemunu Wickremasinghe
-- Description: Tabel valued function that Breaks the delimeted string by given delimeter and returns a tabel having split results
-- Usage
-- select * from [dbo].[TF_SplitString]('token1,token2,,,,,,,,token969',',')
-- 969 items should be returned
-- select * from [dbo].[TF_SplitString]('4672978261,4672978255',',')
-- 2 items should be returned
-- =============================================
CREATE FUNCTION dbo.TF_SplitString
( #stringToSplit VARCHAR(MAX) ,
#delimeter char = ','
)
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimeter, #stringToSplit)
if #pos = 0
BEGIN
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos)
END
else
BEGIN
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
END
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
This is based on Andy Robertson's answer, I needed a delimiter other than comma.
CREATE FUNCTION dbo.splitstring ( #stringToSplit nvarchar(MAX), #delim nvarchar(max))
RETURNS
#returnList TABLE ([value] [nvarchar] (MAX))
AS
BEGIN
DECLARE #value NVARCHAR(max)
DECLARE #pos INT
WHILE CHARINDEX(#delim, #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delim, #stringToSplit)
SELECT #value = SUBSTRING(#stringToSplit, 1, #pos - 1)
INSERT INTO #returnList
SELECT #value
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos + LEN(#delim), LEN(#stringToSplit) - #pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
GO
And to use it:
SELECT * FROM dbo.splitstring('test1 test2 test3', ' ');
(Tested on SQL Server 2008 R2)
EDIT: correct test code
ALTER FUNCTION [dbo].func_split_string
(
#input as varchar(max),
#delimiter as varchar(10) = ";"
)
RETURNS #result TABLE
(
id smallint identity(1,1),
csv_value varchar(max) not null
)
AS
BEGIN
DECLARE #pos AS INT;
DECLARE #string AS VARCHAR(MAX) = '';
WHILE LEN(#input) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimiter,#input);
IF(#pos<=0)
select #pos = len(#input)
IF(#pos <> LEN(#input))
SELECT #string = SUBSTRING(#input, 1, #pos-1);
ELSE
SELECT #string = SUBSTRING(#input, 1, #pos);
INSERT INTO #result SELECT #string
SELECT #input = SUBSTRING(#input, #pos+len(#delimiter), LEN(#input)-#pos)
END
RETURN
END
You can Use this function:
CREATE FUNCTION SplitString
(
#Input NVARCHAR(MAX),
#Character CHAR(1)
)
RETURNS #Output TABLE (
Item NVARCHAR(1000)
)
AS
BEGIN
DECLARE #StartIndex INT, #EndIndex INT
SET #StartIndex = 1
IF SUBSTRING(#Input, LEN(#Input) - 1, LEN(#Input)) <> #Character
BEGIN
SET #Input = #Input + #Character
END
WHILE CHARINDEX(#Character, #Input) > 0
BEGIN
SET #EndIndex = CHARINDEX(#Character, #Input)
INSERT INTO #Output(Item)
SELECT SUBSTRING(#Input, #StartIndex, #EndIndex - 1)
SET #Input = SUBSTRING(#Input, #EndIndex + 1, LEN(#Input))
END
RETURN
END
GO
Here is an example that you can use as function or also you can put the same logic in procedure.
--SELECT * from [dbo].fn_SplitString ;
CREATE FUNCTION [dbo].[fn_SplitString]
(#CSV VARCHAR(MAX), #Delimeter VARCHAR(100) = ',')
RETURNS #retTable TABLE
(
[value] VARCHAR(MAX) NULL
)AS
BEGIN
DECLARE
#vCSV VARCHAR (MAX) = #CSV,
#vDelimeter VARCHAR (100) = #Delimeter;
IF #vDelimeter = ';'
BEGIN
SET #vCSV = REPLACE(#vCSV, ';', '~!~#~');
SET #vDelimeter = REPLACE(#vDelimeter, ';', '~!~#~');
END;
SET #vCSV = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(#vCSV, '&', '&'), '<', '<'), '>', '>'), '''', '''), '"', '"');
DECLARE #xml XML;
SET #xml = '<i>' + REPLACE(#vCSV, #vDelimeter, '</i><i>') + '</i>';
INSERT INTO #retTable
SELECT
x.i.value('.', 'varchar(max)') AS COLUMNNAME
FROM #xml.nodes('//i')AS x(i);
RETURN;
END;
/*
Answer to T-SQL split string
Based on answers from Andy Robinson and AviG
Enhanced functionality ref: LEN function not including trailing spaces in SQL Server
This 'file' should be valid as both a markdown file and an SQL file
*/
CREATE FUNCTION dbo.splitstring ( --CREATE OR ALTER
#stringToSplit NVARCHAR(MAX)
) RETURNS #returnList TABLE ([Item] NVARCHAR (MAX))
AS BEGIN
DECLARE #name NVARCHAR(MAX)
DECLARE #pos BIGINT
SET #stringToSplit = #stringToSplit + ',' -- this should allow entries that end with a `,` to have a blank value in that "column"
WHILE ((LEN(#stringToSplit+'_') > 1)) BEGIN -- `+'_'` gets around LEN trimming terminal spaces. See URL referenced above
SET #pos = COALESCE(NULLIF(CHARINDEX(',', #stringToSplit),0),LEN(#stringToSplit+'_')) -- COALESCE grabs first non-null value
SET #name = SUBSTRING(#stringToSplit, 1, #pos-1) --MAX size of string of type nvarchar is 4000
SET #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, 4000) -- With SUBSTRING fn (MS web): "If start is greater than the number of characters in the value expression, a zero-length expression is returned."
INSERT INTO #returnList SELECT #name --additional debugging parameters below can be added
-- + ' pos:' + CAST(#pos as nvarchar) + ' remain:''' + #stringToSplit + '''(' + CAST(LEN(#stringToSplit+'_')-1 as nvarchar) + ')'
END
RETURN
END
GO
/*
Test cases: see URL referenced as "enhanced functionality" above
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,b')
Item | L
--- | ---
a | 1
| 0
b | 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,')
Item | L
--- | ---
a | 1
| 0
| 0
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, ')
Item | L
--- | ---
a | 1
| 0
| 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, c ')
Item | L
--- | ---
a | 1
| 0
c | 3
*/
The easiest way:
Install SQL Server 2016
Use STRING_SPLIT https://msdn.microsoft.com/en-us/library/mt684588.aspx
It works even in express edition :).
I have a SQL Server 2008 R2 column containing a string which I need to split by a comma. I have seen many answers on StackOverflow but none of them works in R2. I have made sure I have select permissions on any split function examples. Any help greatly appreciated.
I've used this SQL before which may work for you:-
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE CHARINDEX(',', #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
and to use it:-
SELECT * FROM dbo.splitstring('91,12,65,78,56,789')
Instead of recursive CTEs and while loops, has anyone considered a more set-based approach? Note that this function was written for the question, which was based on SQL Server 2008 and comma as the delimiter. In SQL Server 2016 and above (and in compatibility level 130 and above), STRING_SPLIT() is a better option.
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE
AS
RETURN ( SELECT [Value] FROM
(
SELECT [Value] = LTRIM(RTRIM(SUBSTRING(#List, [Number],
CHARINDEX(#Delim, #List + #Delim, [Number]) - [Number])))
FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
FROM sys.all_columns) AS x WHERE Number <= LEN(#List)
AND SUBSTRING(#Delim + #List, [Number], DATALENGTH(#Delim)/2) = #Delim
) AS y
);
GO
If you want to avoid the limitation of the length of the string being <= the number of rows in sys.all_columns (9,980 in model in SQL Server 2017; much higher in your own user databases), you can use other approaches for deriving the numbers, such as building your own table of numbers. You could also use a recursive CTE in cases where you can't use system tables or create your own:
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN ( WITH n(n) AS (SELECT 1 UNION ALL SELECT n+1
FROM n WHERE n <= LEN(#List))
SELECT [Value] = SUBSTRING(#List, n,
CHARINDEX(#Delim, #List + #Delim, n) - n)
FROM n WHERE n <= LEN(#List)
AND SUBSTRING(#Delim + #List, n, DATALENGTH(#Delim)/2) = #Delim
);
GO
But you'll have to append OPTION (MAXRECURSION 0) (or MAXRECURSION <longest possible string length if < 32768>) to the outer query in order to avoid errors with recursion for strings > 100 characters. If that is also not a good alternative then see this answer as pointed out in the comments, or this answer if you need an ordered split string function.
(Also, the delimiter will have to be NCHAR(<=1228). Still researching why.)
More on split functions, why (and proof that) while loops and recursive CTEs don't scale, and better alternatives, if you're splitting strings coming from the application layer:
Splitting strings
Finally the wait is over in SQL Server 2016 they have introduced Split string function : STRING_SPLIT
select * From STRING_SPLIT ('a,b', ',') cs
All the other methods to split string like XML, Tally table, while loop, etc.. has been blown away by this STRING_SPLIT function.
Here is an excellent article with performance comparison : Performance Surprises and Assumptions : STRING_SPLIT
The easiest way to do this is by using XML format.
1. Converting string to rows without table
QUERY
DECLARE #String varchar(100) = 'String1,String2,String3'
-- To change ',' to any other delimeter, just change ',' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT CAST ('<M>' + REPLACE(#String, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x---------x
| Value |
x---------x
| String1 |
| String2 |
| String3 |
x---------x
2. Converting to rows from a table which have an ID for each CSV row
SOURCE TABLE
x-----x--------------------------x
| Id | Value |
x-----x--------------------------x
| 1 | String1,String2,String3 |
| 2 | String4,String5,String6 |
x-----x--------------------------x
QUERY
-- To change ',' to any other delimeter, just change ',' before '</M><M>' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT ID,LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT ID,CAST ('<M>' + REPLACE(VALUE, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
FROM TABLENAME
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x-----x----------x
| Id | Value |
x-----x----------x
| 1 | String1 |
| 1 | String2 |
| 1 | String3 |
| 2 | String4 |
| 2 | String5 |
| 2 | String6 |
x-----x----------x
I needed a quick way to get rid of the +4 from a zip code.
UPDATE #Emails
SET ZIPCode = SUBSTRING(ZIPCode, 1, (CHARINDEX('-', ZIPCODE)-1))
WHERE ZIPCode LIKE '%-%'
No proc... no UDF... just one tight little inline command that does what it must. Not fancy, not elegant.
Change the delimiter as needed, etc, and it will work for anything.
if you replace
WHILE CHARINDEX(',', #stringToSplit) > 0
with
WHILE LEN(#stringToSplit) > 0
you can eliminate that last insert after the while loop!
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
if #pos = 0
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
The often used approach with XML elements breaks in case of forbidden characters. This is an approach to use this method with any kind of character, even with the semicolon as delimiter.
The trick is, first to use SELECT SomeString AS [*] FOR XML PATH('') to get all forbidden characters properly escaped. That's the reason, why I replace the delimiter to a magic value to avoid troubles with ; as delimiter.
DECLARE #Dummy TABLE (ID INT, SomeTextToSplit NVARCHAR(MAX))
INSERT INTO #Dummy VALUES
(1,N'A&B;C;D;E, F')
,(2,N'"C" & ''D'';<C>;D;E, F');
DECLARE #Delimiter NVARCHAR(10)=';'; --special effort needed (due to entities coding with "&code;")!
WITH Casted AS
(
SELECT *
,CAST(N'<x>' + REPLACE((SELECT REPLACE(SomeTextToSplit,#Delimiter,N'§§Split$me$here§§') AS [*] FOR XML PATH('')),N'§§Split$me$here§§',N'</x><x>') + N'</x>' AS XML) AS SplitMe
FROM #Dummy
)
SELECT Casted.ID
,x.value(N'.',N'nvarchar(max)') AS Part
FROM Casted
CROSS APPLY SplitMe.nodes(N'/x') AS A(x)
The result
ID Part
1 A&B
1 C
1 D
1 E, F
2 "C" & 'D'
2 <C>
2 D
2 E, F
All the functions for string splitting that use some kind of Loop-ing (iterations) have bad performance. They should be replaced with set-based solution.
This code executes excellent.
CREATE FUNCTION dbo.SplitStrings
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(#List, #Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
I had to write something like this recently. Here's the solution I came up with. It's generalized for any delimiter string and I think it would perform slightly better:
CREATE FUNCTION [dbo].[SplitString]
( #string nvarchar(4000)
, #delim nvarchar(100) )
RETURNS
#result TABLE
( [Value] nvarchar(4000) NOT NULL
, [Index] int NOT NULL )
AS
BEGIN
DECLARE #str nvarchar(4000)
, #pos int
, #prv int = 1
SELECT #pos = CHARINDEX(#delim, #string)
WHILE #pos > 0
BEGIN
SELECT #str = SUBSTRING(#string, #prv, #pos - #prv)
INSERT INTO #result SELECT #str, #prv
SELECT #prv = #pos + LEN(#delim)
, #pos = CHARINDEX(#delim, #string, #pos + 1)
END
INSERT INTO #result SELECT SUBSTRING(#string, #prv, 4000), #prv
RETURN
END
If you need a quick ad-hoc solution for common cases with minimum code, then this recursive CTE two-liner will do it:
DECLARE #s VARCHAR(200) = ',1,2,,3,,,4,,,,5,'
;WITH
a AS (SELECT i=-1, j=0 UNION ALL SELECT j, CHARINDEX(',', #s, j + 1) FROM a WHERE j > i),
b AS (SELECT SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b
Either use this as a stand-alone statement or just add the above CTEs to any of your queries and you will be able to join the resulting table b with others for use in any further expressions.
edit (by Shnugo)
If you add a counter, you will get a position index together with the List:
DECLARE #s VARCHAR(200) = '1,2333,344,4'
;WITH
a AS (SELECT n=0, i=-1, j=0 UNION ALL SELECT n+1, j, CHARINDEX(',', #s, j+1) FROM a WHERE j > i),
b AS (SELECT n, SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b;
The result:
n s
1 1
2 2333
3 344
4 4
I take the xml route by wrapping the values into elements (M but anything works):
declare #v nvarchar(max) = '100,201,abcde'
select
a.value('.', 'varchar(max)')
from
(select cast('<M>' + REPLACE(#v, ',', '</M><M>') + '</M>' AS XML) as col) as A
CROSS APPLY A.col.nodes ('/M') AS Split(a)
A solution using a CTE, if anyone should need that (apart from me, who obviously did, that is why I wrote it).
declare #StringToSplit varchar(100) = 'Test1,Test2,Test3';
declare #SplitChar varchar(10) = ',';
with StringToSplit as (
select
ltrim( rtrim( substring( #StringToSplit, 1, charindex( #SplitChar, #StringToSplit ) - 1 ) ) ) Head
, substring( #StringToSplit, charindex( #SplitChar, #StringToSplit ) + 1, len( #StringToSplit ) ) Tail
union all
select
ltrim( rtrim( substring( Tail, 1, charindex( #SplitChar, Tail ) - 1 ) ) ) Head
, substring( Tail, charindex( #SplitChar, Tail ) + 1, len( Tail ) ) Tail
from StringToSplit
where charindex( #SplitChar, Tail ) > 0
union all
select
ltrim( rtrim( Tail ) ) Head
, '' Tail
from StringToSplit
where charindex( #SplitChar, Tail ) = 0
and len( Tail ) > 0
)
select Head from StringToSplit
This is more narrowly-tailored. When I do this I usually have a comma-delimited list of unique ids (INT or BIGINT), which I want to cast as a table to use as an inner join to another table that has a primary key of INT or BIGINT. I want an in-line table-valued function returned so that I have the most efficient join possible.
Sample usage would be:
DECLARE #IDs VARCHAR(1000);
SET #IDs = ',99,206,124,8967,1,7,3,45234,2,889,987979,';
SELECT me.Value
FROM dbo.MyEnum me
INNER JOIN dbo.GetIntIdsTableFromDelimitedString(#IDs) ids ON me.PrimaryKey = ids.ID
I stole the idea from http://sqlrecords.blogspot.com/2012/11/converting-delimited-list-to-table.html, changing it to be in-line table-valued and cast as INT.
create function dbo.GetIntIDTableFromDelimitedString
(
#IDs VARCHAR(1000) --this parameter must start and end with a comma, eg ',123,456,'
--all items in list must be perfectly formatted or function will error
)
RETURNS TABLE AS
RETURN
SELECT
CAST(SUBSTRING(#IDs,Nums.number + 1,CHARINDEX(',',#IDs,(Nums.number+2)) - Nums.number - 1) AS INT) AS ID
FROM
[master].[dbo].[spt_values] Nums
WHERE Nums.Type = 'P'
AND Nums.number BETWEEN 1 AND DATALENGTH(#IDs)
AND SUBSTRING(#IDs,Nums.number,1) = ','
AND CHARINDEX(',',#IDs,(Nums.number+1)) > Nums.number;
GO
There is a correct version on here but I thought it would be nice to add a little fault tolerance in case they have a trailing comma as well as make it so you could use it not as a function but as part of a larger piece of code. Just in case you're only using it once time and don't need a function. This is also for integers (which is what I needed it for) so you might have to change your data types.
DECLARE #StringToSeperate VARCHAR(10)
SET #StringToSeperate = '1,2,5'
--SELECT #StringToSeperate IDs INTO #Test
DROP TABLE #IDs
CREATE TABLE #IDs (ID int)
DECLARE #CommaSeperatedValue NVARCHAR(255) = ''
DECLARE #Position INT = LEN(#StringToSeperate)
--Add Each Value
WHILE CHARINDEX(',', #StringToSeperate) > 0
BEGIN
SELECT #Position = CHARINDEX(',', #StringToSeperate)
SELECT #CommaSeperatedValue = SUBSTRING(#StringToSeperate, 1, #Position-1)
INSERT INTO #IDs
SELECT #CommaSeperatedValue
SELECT #StringToSeperate = SUBSTRING(#StringToSeperate, #Position+1, LEN(#StringToSeperate)-#Position)
END
--Add Last Value
IF (LEN(LTRIM(RTRIM(#StringToSeperate)))>0)
BEGIN
INSERT INTO #IDs
SELECT SUBSTRING(#StringToSeperate, 1, #Position)
END
SELECT * FROM #IDs
I modified +Andy Robinson's function a little bit. Now you can select only required part from returning table:
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([numOrder] [tinyint] , [Name] [nvarchar] (500)) AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
DECLARE #orderNum INT
SET #orderNum=0
WHILE CHARINDEX('.', #stringToSplit) > 0
BEGIN
SELECT #orderNum=#orderNum+1;
SELECT #pos = CHARINDEX('.', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #orderNum,#name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
SELECT #orderNum=#orderNum+1;
INSERT INTO #returnList
SELECT #orderNum, #stringToSplit
RETURN
END
Usage:
SELECT Name FROM dbo.splitstring('ELIS.YD.CRP1.1.CBA.MDSP.T389.BT') WHERE numOrder=5
Simples
DECLARE #String varchar(100) = '11,21,84,85,87'
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (SELECT value FROM STRING_SPLIT(#String, ','))
-- EQUIVALENTE
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (11,21,84,85,87)
here is a version that can split on a pattern using patindex, a simple adaptation of the post above. I had a case where I needed to split a string that contained multiple separator chars.
alter FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(1000), #splitPattern varchar(10) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE PATINDEX(#splitPattern, #stringToSplit) > 0
BEGIN
SELECT #pos = PATINDEX(#splitPattern, #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
select * from dbo.splitstring('stringa/stringb/x,y,z','%[/,]%');
result looks like this
stringa
stringb
x
y
z
Personnaly I use this function :
ALTER FUNCTION [dbo].[CUST_SplitString]
(
#String NVARCHAR(4000),
#Delimiter NCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
WITH Split(stpos,endpos)
AS(
SELECT 0 AS stpos, CHARINDEX(#Delimiter,#String) AS endpos
UNION ALL
SELECT endpos+1, CHARINDEX(#Delimiter,#String,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT 'Id' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
'Data' = SUBSTRING(#String,stpos,COALESCE(NULLIF(endpos,0),LEN(#String)+1)-stpos)
FROM Split
)
I have developed a double Splitter (Takes two split characters) as requested Here. Could be of some value in this thread seeing its the most referenced for queries relating to string splitting.
CREATE FUNCTION uft_DoubleSplitter
(
-- Add the parameters for the function here
#String VARCHAR(4000),
#Splitter1 CHAR,
#Splitter2 CHAR
)
RETURNS #Result TABLE (Id INT,MId INT,SValue VARCHAR(4000))
AS
BEGIN
DECLARE #FResult TABLE(Id INT IDENTITY(1, 1),
SValue VARCHAR(4000))
DECLARE #SResult TABLE(Id INT IDENTITY(1, 1),
MId INT,
SValue VARCHAR(4000))
SET #String = #String+#Splitter1
WHILE CHARINDEX(#Splitter1, #String) > 0
BEGIN
DECLARE #WorkingString VARCHAR(4000) = NULL
SET #WorkingString = SUBSTRING(#String, 1, CHARINDEX(#Splitter1, #String) - 1)
--Print #workingString
INSERT INTO #FResult
SELECT CASE
WHEN #WorkingString = '' THEN NULL
ELSE #WorkingString
END
SET #String = SUBSTRING(#String, LEN(#WorkingString) + 2, LEN(#String))
END
IF ISNULL(#Splitter2, '') != ''
BEGIN
DECLARE #OStartLoop INT
DECLARE #OEndLoop INT
SELECT #OStartLoop = MIN(Id),
#OEndLoop = MAX(Id)
FROM #FResult
WHILE #OStartLoop <= #OEndLoop
BEGIN
DECLARE #iString VARCHAR(4000)
DECLARE #iMId INT
SELECT #iString = SValue+#Splitter2,
#iMId = Id
FROM #FResult
WHERE Id = #OStartLoop
WHILE CHARINDEX(#Splitter2, #iString) > 0
BEGIN
DECLARE #iWorkingString VARCHAR(4000) = NULL
SET #IWorkingString = SUBSTRING(#iString, 1, CHARINDEX(#Splitter2, #iString) - 1)
INSERT INTO #SResult
SELECT #iMId,
CASE
WHEN #iWorkingString = '' THEN NULL
ELSE #iWorkingString
END
SET #iString = SUBSTRING(#iString, LEN(#iWorkingString) + 2, LEN(#iString))
END
SET #OStartLoop = #OStartLoop + 1
END
INSERT INTO #Result
SELECT MId AS PrimarySplitID,
ROW_NUMBER() OVER (PARTITION BY MId ORDER BY Mid, Id) AS SecondarySplitID ,
SValue
FROM #SResult
END
ELSE
BEGIN
INSERT INTO #Result
SELECT Id AS PrimarySplitID,
NULL AS SecondarySplitID,
SValue
FROM #FResult
END
RETURN
Usage:
--FirstSplit
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&',NULL)
--Second Split
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&','=')
Possible Usage (Get second value of each split):
SELECT fn.SValue
FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===', '&', '=')AS fn
WHERE fn.mid = 2
A recursive cte based solution
declare #T table (iden int identity, col1 varchar(100));
insert into #T(col1) values
('ROOT/South America/Lima/Test/Test2')
, ('ROOT/South America/Peru/Test/Test2')
, ('ROOT//South America/Venuzuala ')
, ('RtT/South America / ')
, ('ROOT/South Americas// ');
declare #split char(1) = '/';
select #split as split;
with cte as
( select t.iden, case when SUBSTRING(REVERSE(rtrim(t.col1)), 1, 1) = #split then LTRIM(RTRIM(t.col1)) else LTRIM(RTRIM(t.col1)) + #split end as col1, 0 as pos , 1 as cnt
from #T t
union all
select t.iden, t.col1 , charindex(#split, t.col1, t.pos + 1), cnt + 1
from cte t
where charindex(#split, t.col1, t.pos + 1) > 0
)
select t1.*, t2.pos, t2.cnt
, ltrim(rtrim(SUBSTRING(t1.col1, t1.pos+1, t2.pos-t1.pos-1))) as bingo
from cte t1
join cte t2
on t2.iden = t1.iden
and t2.cnt = t1.cnt+1
and t2.pos > t1.pos
order by t1.iden, t1.cnt;
With all due respect to #AviG this is the bug free version of function deviced by him to return all the tokens in full.
IF EXISTS (SELECT * FROM sys.objects WHERE type = 'TF' AND name = 'TF_SplitString')
DROP FUNCTION [dbo].[TF_SplitString]
GO
-- =============================================
-- Author: AviG
-- Amendments: Parameterize the delimeter and included the missing chars in last token - Gemunu Wickremasinghe
-- Description: Tabel valued function that Breaks the delimeted string by given delimeter and returns a tabel having split results
-- Usage
-- select * from [dbo].[TF_SplitString]('token1,token2,,,,,,,,token969',',')
-- 969 items should be returned
-- select * from [dbo].[TF_SplitString]('4672978261,4672978255',',')
-- 2 items should be returned
-- =============================================
CREATE FUNCTION dbo.TF_SplitString
( #stringToSplit VARCHAR(MAX) ,
#delimeter char = ','
)
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimeter, #stringToSplit)
if #pos = 0
BEGIN
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos)
END
else
BEGIN
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
END
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
This is based on Andy Robertson's answer, I needed a delimiter other than comma.
CREATE FUNCTION dbo.splitstring ( #stringToSplit nvarchar(MAX), #delim nvarchar(max))
RETURNS
#returnList TABLE ([value] [nvarchar] (MAX))
AS
BEGIN
DECLARE #value NVARCHAR(max)
DECLARE #pos INT
WHILE CHARINDEX(#delim, #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delim, #stringToSplit)
SELECT #value = SUBSTRING(#stringToSplit, 1, #pos - 1)
INSERT INTO #returnList
SELECT #value
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos + LEN(#delim), LEN(#stringToSplit) - #pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
GO
And to use it:
SELECT * FROM dbo.splitstring('test1 test2 test3', ' ');
(Tested on SQL Server 2008 R2)
EDIT: correct test code
ALTER FUNCTION [dbo].func_split_string
(
#input as varchar(max),
#delimiter as varchar(10) = ";"
)
RETURNS #result TABLE
(
id smallint identity(1,1),
csv_value varchar(max) not null
)
AS
BEGIN
DECLARE #pos AS INT;
DECLARE #string AS VARCHAR(MAX) = '';
WHILE LEN(#input) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimiter,#input);
IF(#pos<=0)
select #pos = len(#input)
IF(#pos <> LEN(#input))
SELECT #string = SUBSTRING(#input, 1, #pos-1);
ELSE
SELECT #string = SUBSTRING(#input, 1, #pos);
INSERT INTO #result SELECT #string
SELECT #input = SUBSTRING(#input, #pos+len(#delimiter), LEN(#input)-#pos)
END
RETURN
END
You can Use this function:
CREATE FUNCTION SplitString
(
#Input NVARCHAR(MAX),
#Character CHAR(1)
)
RETURNS #Output TABLE (
Item NVARCHAR(1000)
)
AS
BEGIN
DECLARE #StartIndex INT, #EndIndex INT
SET #StartIndex = 1
IF SUBSTRING(#Input, LEN(#Input) - 1, LEN(#Input)) <> #Character
BEGIN
SET #Input = #Input + #Character
END
WHILE CHARINDEX(#Character, #Input) > 0
BEGIN
SET #EndIndex = CHARINDEX(#Character, #Input)
INSERT INTO #Output(Item)
SELECT SUBSTRING(#Input, #StartIndex, #EndIndex - 1)
SET #Input = SUBSTRING(#Input, #EndIndex + 1, LEN(#Input))
END
RETURN
END
GO
Here is an example that you can use as function or also you can put the same logic in procedure.
--SELECT * from [dbo].fn_SplitString ;
CREATE FUNCTION [dbo].[fn_SplitString]
(#CSV VARCHAR(MAX), #Delimeter VARCHAR(100) = ',')
RETURNS #retTable TABLE
(
[value] VARCHAR(MAX) NULL
)AS
BEGIN
DECLARE
#vCSV VARCHAR (MAX) = #CSV,
#vDelimeter VARCHAR (100) = #Delimeter;
IF #vDelimeter = ';'
BEGIN
SET #vCSV = REPLACE(#vCSV, ';', '~!~#~');
SET #vDelimeter = REPLACE(#vDelimeter, ';', '~!~#~');
END;
SET #vCSV = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(#vCSV, '&', '&'), '<', '<'), '>', '>'), '''', '''), '"', '"');
DECLARE #xml XML;
SET #xml = '<i>' + REPLACE(#vCSV, #vDelimeter, '</i><i>') + '</i>';
INSERT INTO #retTable
SELECT
x.i.value('.', 'varchar(max)') AS COLUMNNAME
FROM #xml.nodes('//i')AS x(i);
RETURN;
END;
/*
Answer to T-SQL split string
Based on answers from Andy Robinson and AviG
Enhanced functionality ref: LEN function not including trailing spaces in SQL Server
This 'file' should be valid as both a markdown file and an SQL file
*/
CREATE FUNCTION dbo.splitstring ( --CREATE OR ALTER
#stringToSplit NVARCHAR(MAX)
) RETURNS #returnList TABLE ([Item] NVARCHAR (MAX))
AS BEGIN
DECLARE #name NVARCHAR(MAX)
DECLARE #pos BIGINT
SET #stringToSplit = #stringToSplit + ',' -- this should allow entries that end with a `,` to have a blank value in that "column"
WHILE ((LEN(#stringToSplit+'_') > 1)) BEGIN -- `+'_'` gets around LEN trimming terminal spaces. See URL referenced above
SET #pos = COALESCE(NULLIF(CHARINDEX(',', #stringToSplit),0),LEN(#stringToSplit+'_')) -- COALESCE grabs first non-null value
SET #name = SUBSTRING(#stringToSplit, 1, #pos-1) --MAX size of string of type nvarchar is 4000
SET #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, 4000) -- With SUBSTRING fn (MS web): "If start is greater than the number of characters in the value expression, a zero-length expression is returned."
INSERT INTO #returnList SELECT #name --additional debugging parameters below can be added
-- + ' pos:' + CAST(#pos as nvarchar) + ' remain:''' + #stringToSplit + '''(' + CAST(LEN(#stringToSplit+'_')-1 as nvarchar) + ')'
END
RETURN
END
GO
/*
Test cases: see URL referenced as "enhanced functionality" above
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,b')
Item | L
--- | ---
a | 1
| 0
b | 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,')
Item | L
--- | ---
a | 1
| 0
| 0
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, ')
Item | L
--- | ---
a | 1
| 0
| 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, c ')
Item | L
--- | ---
a | 1
| 0
c | 3
*/
The easiest way:
Install SQL Server 2016
Use STRING_SPLIT https://msdn.microsoft.com/en-us/library/mt684588.aspx
It works even in express edition :).
I have a SQL Server 2008 R2 column containing a string which I need to split by a comma. I have seen many answers on StackOverflow but none of them works in R2. I have made sure I have select permissions on any split function examples. Any help greatly appreciated.
I've used this SQL before which may work for you:-
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE CHARINDEX(',', #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
and to use it:-
SELECT * FROM dbo.splitstring('91,12,65,78,56,789')
Instead of recursive CTEs and while loops, has anyone considered a more set-based approach? Note that this function was written for the question, which was based on SQL Server 2008 and comma as the delimiter. In SQL Server 2016 and above (and in compatibility level 130 and above), STRING_SPLIT() is a better option.
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE
AS
RETURN ( SELECT [Value] FROM
(
SELECT [Value] = LTRIM(RTRIM(SUBSTRING(#List, [Number],
CHARINDEX(#Delim, #List + #Delim, [Number]) - [Number])))
FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
FROM sys.all_columns) AS x WHERE Number <= LEN(#List)
AND SUBSTRING(#Delim + #List, [Number], DATALENGTH(#Delim)/2) = #Delim
) AS y
);
GO
If you want to avoid the limitation of the length of the string being <= the number of rows in sys.all_columns (9,980 in model in SQL Server 2017; much higher in your own user databases), you can use other approaches for deriving the numbers, such as building your own table of numbers. You could also use a recursive CTE in cases where you can't use system tables or create your own:
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN ( WITH n(n) AS (SELECT 1 UNION ALL SELECT n+1
FROM n WHERE n <= LEN(#List))
SELECT [Value] = SUBSTRING(#List, n,
CHARINDEX(#Delim, #List + #Delim, n) - n)
FROM n WHERE n <= LEN(#List)
AND SUBSTRING(#Delim + #List, n, DATALENGTH(#Delim)/2) = #Delim
);
GO
But you'll have to append OPTION (MAXRECURSION 0) (or MAXRECURSION <longest possible string length if < 32768>) to the outer query in order to avoid errors with recursion for strings > 100 characters. If that is also not a good alternative then see this answer as pointed out in the comments, or this answer if you need an ordered split string function.
(Also, the delimiter will have to be NCHAR(<=1228). Still researching why.)
More on split functions, why (and proof that) while loops and recursive CTEs don't scale, and better alternatives, if you're splitting strings coming from the application layer:
Splitting strings
Finally the wait is over in SQL Server 2016 they have introduced Split string function : STRING_SPLIT
select * From STRING_SPLIT ('a,b', ',') cs
All the other methods to split string like XML, Tally table, while loop, etc.. has been blown away by this STRING_SPLIT function.
Here is an excellent article with performance comparison : Performance Surprises and Assumptions : STRING_SPLIT
The easiest way to do this is by using XML format.
1. Converting string to rows without table
QUERY
DECLARE #String varchar(100) = 'String1,String2,String3'
-- To change ',' to any other delimeter, just change ',' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT CAST ('<M>' + REPLACE(#String, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x---------x
| Value |
x---------x
| String1 |
| String2 |
| String3 |
x---------x
2. Converting to rows from a table which have an ID for each CSV row
SOURCE TABLE
x-----x--------------------------x
| Id | Value |
x-----x--------------------------x
| 1 | String1,String2,String3 |
| 2 | String4,String5,String6 |
x-----x--------------------------x
QUERY
-- To change ',' to any other delimeter, just change ',' before '</M><M>' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT ID,LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT ID,CAST ('<M>' + REPLACE(VALUE, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
FROM TABLENAME
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x-----x----------x
| Id | Value |
x-----x----------x
| 1 | String1 |
| 1 | String2 |
| 1 | String3 |
| 2 | String4 |
| 2 | String5 |
| 2 | String6 |
x-----x----------x
I needed a quick way to get rid of the +4 from a zip code.
UPDATE #Emails
SET ZIPCode = SUBSTRING(ZIPCode, 1, (CHARINDEX('-', ZIPCODE)-1))
WHERE ZIPCode LIKE '%-%'
No proc... no UDF... just one tight little inline command that does what it must. Not fancy, not elegant.
Change the delimiter as needed, etc, and it will work for anything.
if you replace
WHILE CHARINDEX(',', #stringToSplit) > 0
with
WHILE LEN(#stringToSplit) > 0
you can eliminate that last insert after the while loop!
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
if #pos = 0
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
The often used approach with XML elements breaks in case of forbidden characters. This is an approach to use this method with any kind of character, even with the semicolon as delimiter.
The trick is, first to use SELECT SomeString AS [*] FOR XML PATH('') to get all forbidden characters properly escaped. That's the reason, why I replace the delimiter to a magic value to avoid troubles with ; as delimiter.
DECLARE #Dummy TABLE (ID INT, SomeTextToSplit NVARCHAR(MAX))
INSERT INTO #Dummy VALUES
(1,N'A&B;C;D;E, F')
,(2,N'"C" & ''D'';<C>;D;E, F');
DECLARE #Delimiter NVARCHAR(10)=';'; --special effort needed (due to entities coding with "&code;")!
WITH Casted AS
(
SELECT *
,CAST(N'<x>' + REPLACE((SELECT REPLACE(SomeTextToSplit,#Delimiter,N'§§Split$me$here§§') AS [*] FOR XML PATH('')),N'§§Split$me$here§§',N'</x><x>') + N'</x>' AS XML) AS SplitMe
FROM #Dummy
)
SELECT Casted.ID
,x.value(N'.',N'nvarchar(max)') AS Part
FROM Casted
CROSS APPLY SplitMe.nodes(N'/x') AS A(x)
The result
ID Part
1 A&B
1 C
1 D
1 E, F
2 "C" & 'D'
2 <C>
2 D
2 E, F
All the functions for string splitting that use some kind of Loop-ing (iterations) have bad performance. They should be replaced with set-based solution.
This code executes excellent.
CREATE FUNCTION dbo.SplitStrings
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(#List, #Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
I had to write something like this recently. Here's the solution I came up with. It's generalized for any delimiter string and I think it would perform slightly better:
CREATE FUNCTION [dbo].[SplitString]
( #string nvarchar(4000)
, #delim nvarchar(100) )
RETURNS
#result TABLE
( [Value] nvarchar(4000) NOT NULL
, [Index] int NOT NULL )
AS
BEGIN
DECLARE #str nvarchar(4000)
, #pos int
, #prv int = 1
SELECT #pos = CHARINDEX(#delim, #string)
WHILE #pos > 0
BEGIN
SELECT #str = SUBSTRING(#string, #prv, #pos - #prv)
INSERT INTO #result SELECT #str, #prv
SELECT #prv = #pos + LEN(#delim)
, #pos = CHARINDEX(#delim, #string, #pos + 1)
END
INSERT INTO #result SELECT SUBSTRING(#string, #prv, 4000), #prv
RETURN
END
If you need a quick ad-hoc solution for common cases with minimum code, then this recursive CTE two-liner will do it:
DECLARE #s VARCHAR(200) = ',1,2,,3,,,4,,,,5,'
;WITH
a AS (SELECT i=-1, j=0 UNION ALL SELECT j, CHARINDEX(',', #s, j + 1) FROM a WHERE j > i),
b AS (SELECT SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b
Either use this as a stand-alone statement or just add the above CTEs to any of your queries and you will be able to join the resulting table b with others for use in any further expressions.
edit (by Shnugo)
If you add a counter, you will get a position index together with the List:
DECLARE #s VARCHAR(200) = '1,2333,344,4'
;WITH
a AS (SELECT n=0, i=-1, j=0 UNION ALL SELECT n+1, j, CHARINDEX(',', #s, j+1) FROM a WHERE j > i),
b AS (SELECT n, SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b;
The result:
n s
1 1
2 2333
3 344
4 4
I take the xml route by wrapping the values into elements (M but anything works):
declare #v nvarchar(max) = '100,201,abcde'
select
a.value('.', 'varchar(max)')
from
(select cast('<M>' + REPLACE(#v, ',', '</M><M>') + '</M>' AS XML) as col) as A
CROSS APPLY A.col.nodes ('/M') AS Split(a)
A solution using a CTE, if anyone should need that (apart from me, who obviously did, that is why I wrote it).
declare #StringToSplit varchar(100) = 'Test1,Test2,Test3';
declare #SplitChar varchar(10) = ',';
with StringToSplit as (
select
ltrim( rtrim( substring( #StringToSplit, 1, charindex( #SplitChar, #StringToSplit ) - 1 ) ) ) Head
, substring( #StringToSplit, charindex( #SplitChar, #StringToSplit ) + 1, len( #StringToSplit ) ) Tail
union all
select
ltrim( rtrim( substring( Tail, 1, charindex( #SplitChar, Tail ) - 1 ) ) ) Head
, substring( Tail, charindex( #SplitChar, Tail ) + 1, len( Tail ) ) Tail
from StringToSplit
where charindex( #SplitChar, Tail ) > 0
union all
select
ltrim( rtrim( Tail ) ) Head
, '' Tail
from StringToSplit
where charindex( #SplitChar, Tail ) = 0
and len( Tail ) > 0
)
select Head from StringToSplit
This is more narrowly-tailored. When I do this I usually have a comma-delimited list of unique ids (INT or BIGINT), which I want to cast as a table to use as an inner join to another table that has a primary key of INT or BIGINT. I want an in-line table-valued function returned so that I have the most efficient join possible.
Sample usage would be:
DECLARE #IDs VARCHAR(1000);
SET #IDs = ',99,206,124,8967,1,7,3,45234,2,889,987979,';
SELECT me.Value
FROM dbo.MyEnum me
INNER JOIN dbo.GetIntIdsTableFromDelimitedString(#IDs) ids ON me.PrimaryKey = ids.ID
I stole the idea from http://sqlrecords.blogspot.com/2012/11/converting-delimited-list-to-table.html, changing it to be in-line table-valued and cast as INT.
create function dbo.GetIntIDTableFromDelimitedString
(
#IDs VARCHAR(1000) --this parameter must start and end with a comma, eg ',123,456,'
--all items in list must be perfectly formatted or function will error
)
RETURNS TABLE AS
RETURN
SELECT
CAST(SUBSTRING(#IDs,Nums.number + 1,CHARINDEX(',',#IDs,(Nums.number+2)) - Nums.number - 1) AS INT) AS ID
FROM
[master].[dbo].[spt_values] Nums
WHERE Nums.Type = 'P'
AND Nums.number BETWEEN 1 AND DATALENGTH(#IDs)
AND SUBSTRING(#IDs,Nums.number,1) = ','
AND CHARINDEX(',',#IDs,(Nums.number+1)) > Nums.number;
GO
There is a correct version on here but I thought it would be nice to add a little fault tolerance in case they have a trailing comma as well as make it so you could use it not as a function but as part of a larger piece of code. Just in case you're only using it once time and don't need a function. This is also for integers (which is what I needed it for) so you might have to change your data types.
DECLARE #StringToSeperate VARCHAR(10)
SET #StringToSeperate = '1,2,5'
--SELECT #StringToSeperate IDs INTO #Test
DROP TABLE #IDs
CREATE TABLE #IDs (ID int)
DECLARE #CommaSeperatedValue NVARCHAR(255) = ''
DECLARE #Position INT = LEN(#StringToSeperate)
--Add Each Value
WHILE CHARINDEX(',', #StringToSeperate) > 0
BEGIN
SELECT #Position = CHARINDEX(',', #StringToSeperate)
SELECT #CommaSeperatedValue = SUBSTRING(#StringToSeperate, 1, #Position-1)
INSERT INTO #IDs
SELECT #CommaSeperatedValue
SELECT #StringToSeperate = SUBSTRING(#StringToSeperate, #Position+1, LEN(#StringToSeperate)-#Position)
END
--Add Last Value
IF (LEN(LTRIM(RTRIM(#StringToSeperate)))>0)
BEGIN
INSERT INTO #IDs
SELECT SUBSTRING(#StringToSeperate, 1, #Position)
END
SELECT * FROM #IDs
I modified +Andy Robinson's function a little bit. Now you can select only required part from returning table:
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([numOrder] [tinyint] , [Name] [nvarchar] (500)) AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
DECLARE #orderNum INT
SET #orderNum=0
WHILE CHARINDEX('.', #stringToSplit) > 0
BEGIN
SELECT #orderNum=#orderNum+1;
SELECT #pos = CHARINDEX('.', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #orderNum,#name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
SELECT #orderNum=#orderNum+1;
INSERT INTO #returnList
SELECT #orderNum, #stringToSplit
RETURN
END
Usage:
SELECT Name FROM dbo.splitstring('ELIS.YD.CRP1.1.CBA.MDSP.T389.BT') WHERE numOrder=5
Simples
DECLARE #String varchar(100) = '11,21,84,85,87'
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (SELECT value FROM STRING_SPLIT(#String, ','))
-- EQUIVALENTE
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (11,21,84,85,87)
here is a version that can split on a pattern using patindex, a simple adaptation of the post above. I had a case where I needed to split a string that contained multiple separator chars.
alter FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(1000), #splitPattern varchar(10) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE PATINDEX(#splitPattern, #stringToSplit) > 0
BEGIN
SELECT #pos = PATINDEX(#splitPattern, #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
select * from dbo.splitstring('stringa/stringb/x,y,z','%[/,]%');
result looks like this
stringa
stringb
x
y
z
Personnaly I use this function :
ALTER FUNCTION [dbo].[CUST_SplitString]
(
#String NVARCHAR(4000),
#Delimiter NCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
WITH Split(stpos,endpos)
AS(
SELECT 0 AS stpos, CHARINDEX(#Delimiter,#String) AS endpos
UNION ALL
SELECT endpos+1, CHARINDEX(#Delimiter,#String,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT 'Id' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
'Data' = SUBSTRING(#String,stpos,COALESCE(NULLIF(endpos,0),LEN(#String)+1)-stpos)
FROM Split
)
I have developed a double Splitter (Takes two split characters) as requested Here. Could be of some value in this thread seeing its the most referenced for queries relating to string splitting.
CREATE FUNCTION uft_DoubleSplitter
(
-- Add the parameters for the function here
#String VARCHAR(4000),
#Splitter1 CHAR,
#Splitter2 CHAR
)
RETURNS #Result TABLE (Id INT,MId INT,SValue VARCHAR(4000))
AS
BEGIN
DECLARE #FResult TABLE(Id INT IDENTITY(1, 1),
SValue VARCHAR(4000))
DECLARE #SResult TABLE(Id INT IDENTITY(1, 1),
MId INT,
SValue VARCHAR(4000))
SET #String = #String+#Splitter1
WHILE CHARINDEX(#Splitter1, #String) > 0
BEGIN
DECLARE #WorkingString VARCHAR(4000) = NULL
SET #WorkingString = SUBSTRING(#String, 1, CHARINDEX(#Splitter1, #String) - 1)
--Print #workingString
INSERT INTO #FResult
SELECT CASE
WHEN #WorkingString = '' THEN NULL
ELSE #WorkingString
END
SET #String = SUBSTRING(#String, LEN(#WorkingString) + 2, LEN(#String))
END
IF ISNULL(#Splitter2, '') != ''
BEGIN
DECLARE #OStartLoop INT
DECLARE #OEndLoop INT
SELECT #OStartLoop = MIN(Id),
#OEndLoop = MAX(Id)
FROM #FResult
WHILE #OStartLoop <= #OEndLoop
BEGIN
DECLARE #iString VARCHAR(4000)
DECLARE #iMId INT
SELECT #iString = SValue+#Splitter2,
#iMId = Id
FROM #FResult
WHERE Id = #OStartLoop
WHILE CHARINDEX(#Splitter2, #iString) > 0
BEGIN
DECLARE #iWorkingString VARCHAR(4000) = NULL
SET #IWorkingString = SUBSTRING(#iString, 1, CHARINDEX(#Splitter2, #iString) - 1)
INSERT INTO #SResult
SELECT #iMId,
CASE
WHEN #iWorkingString = '' THEN NULL
ELSE #iWorkingString
END
SET #iString = SUBSTRING(#iString, LEN(#iWorkingString) + 2, LEN(#iString))
END
SET #OStartLoop = #OStartLoop + 1
END
INSERT INTO #Result
SELECT MId AS PrimarySplitID,
ROW_NUMBER() OVER (PARTITION BY MId ORDER BY Mid, Id) AS SecondarySplitID ,
SValue
FROM #SResult
END
ELSE
BEGIN
INSERT INTO #Result
SELECT Id AS PrimarySplitID,
NULL AS SecondarySplitID,
SValue
FROM #FResult
END
RETURN
Usage:
--FirstSplit
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&',NULL)
--Second Split
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&','=')
Possible Usage (Get second value of each split):
SELECT fn.SValue
FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===', '&', '=')AS fn
WHERE fn.mid = 2
A recursive cte based solution
declare #T table (iden int identity, col1 varchar(100));
insert into #T(col1) values
('ROOT/South America/Lima/Test/Test2')
, ('ROOT/South America/Peru/Test/Test2')
, ('ROOT//South America/Venuzuala ')
, ('RtT/South America / ')
, ('ROOT/South Americas// ');
declare #split char(1) = '/';
select #split as split;
with cte as
( select t.iden, case when SUBSTRING(REVERSE(rtrim(t.col1)), 1, 1) = #split then LTRIM(RTRIM(t.col1)) else LTRIM(RTRIM(t.col1)) + #split end as col1, 0 as pos , 1 as cnt
from #T t
union all
select t.iden, t.col1 , charindex(#split, t.col1, t.pos + 1), cnt + 1
from cte t
where charindex(#split, t.col1, t.pos + 1) > 0
)
select t1.*, t2.pos, t2.cnt
, ltrim(rtrim(SUBSTRING(t1.col1, t1.pos+1, t2.pos-t1.pos-1))) as bingo
from cte t1
join cte t2
on t2.iden = t1.iden
and t2.cnt = t1.cnt+1
and t2.pos > t1.pos
order by t1.iden, t1.cnt;
With all due respect to #AviG this is the bug free version of function deviced by him to return all the tokens in full.
IF EXISTS (SELECT * FROM sys.objects WHERE type = 'TF' AND name = 'TF_SplitString')
DROP FUNCTION [dbo].[TF_SplitString]
GO
-- =============================================
-- Author: AviG
-- Amendments: Parameterize the delimeter and included the missing chars in last token - Gemunu Wickremasinghe
-- Description: Tabel valued function that Breaks the delimeted string by given delimeter and returns a tabel having split results
-- Usage
-- select * from [dbo].[TF_SplitString]('token1,token2,,,,,,,,token969',',')
-- 969 items should be returned
-- select * from [dbo].[TF_SplitString]('4672978261,4672978255',',')
-- 2 items should be returned
-- =============================================
CREATE FUNCTION dbo.TF_SplitString
( #stringToSplit VARCHAR(MAX) ,
#delimeter char = ','
)
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimeter, #stringToSplit)
if #pos = 0
BEGIN
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos)
END
else
BEGIN
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
END
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
This is based on Andy Robertson's answer, I needed a delimiter other than comma.
CREATE FUNCTION dbo.splitstring ( #stringToSplit nvarchar(MAX), #delim nvarchar(max))
RETURNS
#returnList TABLE ([value] [nvarchar] (MAX))
AS
BEGIN
DECLARE #value NVARCHAR(max)
DECLARE #pos INT
WHILE CHARINDEX(#delim, #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delim, #stringToSplit)
SELECT #value = SUBSTRING(#stringToSplit, 1, #pos - 1)
INSERT INTO #returnList
SELECT #value
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos + LEN(#delim), LEN(#stringToSplit) - #pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
GO
And to use it:
SELECT * FROM dbo.splitstring('test1 test2 test3', ' ');
(Tested on SQL Server 2008 R2)
EDIT: correct test code
ALTER FUNCTION [dbo].func_split_string
(
#input as varchar(max),
#delimiter as varchar(10) = ";"
)
RETURNS #result TABLE
(
id smallint identity(1,1),
csv_value varchar(max) not null
)
AS
BEGIN
DECLARE #pos AS INT;
DECLARE #string AS VARCHAR(MAX) = '';
WHILE LEN(#input) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimiter,#input);
IF(#pos<=0)
select #pos = len(#input)
IF(#pos <> LEN(#input))
SELECT #string = SUBSTRING(#input, 1, #pos-1);
ELSE
SELECT #string = SUBSTRING(#input, 1, #pos);
INSERT INTO #result SELECT #string
SELECT #input = SUBSTRING(#input, #pos+len(#delimiter), LEN(#input)-#pos)
END
RETURN
END
You can Use this function:
CREATE FUNCTION SplitString
(
#Input NVARCHAR(MAX),
#Character CHAR(1)
)
RETURNS #Output TABLE (
Item NVARCHAR(1000)
)
AS
BEGIN
DECLARE #StartIndex INT, #EndIndex INT
SET #StartIndex = 1
IF SUBSTRING(#Input, LEN(#Input) - 1, LEN(#Input)) <> #Character
BEGIN
SET #Input = #Input + #Character
END
WHILE CHARINDEX(#Character, #Input) > 0
BEGIN
SET #EndIndex = CHARINDEX(#Character, #Input)
INSERT INTO #Output(Item)
SELECT SUBSTRING(#Input, #StartIndex, #EndIndex - 1)
SET #Input = SUBSTRING(#Input, #EndIndex + 1, LEN(#Input))
END
RETURN
END
GO
Here is an example that you can use as function or also you can put the same logic in procedure.
--SELECT * from [dbo].fn_SplitString ;
CREATE FUNCTION [dbo].[fn_SplitString]
(#CSV VARCHAR(MAX), #Delimeter VARCHAR(100) = ',')
RETURNS #retTable TABLE
(
[value] VARCHAR(MAX) NULL
)AS
BEGIN
DECLARE
#vCSV VARCHAR (MAX) = #CSV,
#vDelimeter VARCHAR (100) = #Delimeter;
IF #vDelimeter = ';'
BEGIN
SET #vCSV = REPLACE(#vCSV, ';', '~!~#~');
SET #vDelimeter = REPLACE(#vDelimeter, ';', '~!~#~');
END;
SET #vCSV = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(#vCSV, '&', '&'), '<', '<'), '>', '>'), '''', '''), '"', '"');
DECLARE #xml XML;
SET #xml = '<i>' + REPLACE(#vCSV, #vDelimeter, '</i><i>') + '</i>';
INSERT INTO #retTable
SELECT
x.i.value('.', 'varchar(max)') AS COLUMNNAME
FROM #xml.nodes('//i')AS x(i);
RETURN;
END;
/*
Answer to T-SQL split string
Based on answers from Andy Robinson and AviG
Enhanced functionality ref: LEN function not including trailing spaces in SQL Server
This 'file' should be valid as both a markdown file and an SQL file
*/
CREATE FUNCTION dbo.splitstring ( --CREATE OR ALTER
#stringToSplit NVARCHAR(MAX)
) RETURNS #returnList TABLE ([Item] NVARCHAR (MAX))
AS BEGIN
DECLARE #name NVARCHAR(MAX)
DECLARE #pos BIGINT
SET #stringToSplit = #stringToSplit + ',' -- this should allow entries that end with a `,` to have a blank value in that "column"
WHILE ((LEN(#stringToSplit+'_') > 1)) BEGIN -- `+'_'` gets around LEN trimming terminal spaces. See URL referenced above
SET #pos = COALESCE(NULLIF(CHARINDEX(',', #stringToSplit),0),LEN(#stringToSplit+'_')) -- COALESCE grabs first non-null value
SET #name = SUBSTRING(#stringToSplit, 1, #pos-1) --MAX size of string of type nvarchar is 4000
SET #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, 4000) -- With SUBSTRING fn (MS web): "If start is greater than the number of characters in the value expression, a zero-length expression is returned."
INSERT INTO #returnList SELECT #name --additional debugging parameters below can be added
-- + ' pos:' + CAST(#pos as nvarchar) + ' remain:''' + #stringToSplit + '''(' + CAST(LEN(#stringToSplit+'_')-1 as nvarchar) + ')'
END
RETURN
END
GO
/*
Test cases: see URL referenced as "enhanced functionality" above
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,b')
Item | L
--- | ---
a | 1
| 0
b | 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,')
Item | L
--- | ---
a | 1
| 0
| 0
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, ')
Item | L
--- | ---
a | 1
| 0
| 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, c ')
Item | L
--- | ---
a | 1
| 0
c | 3
*/
The easiest way:
Install SQL Server 2016
Use STRING_SPLIT https://msdn.microsoft.com/en-us/library/mt684588.aspx
It works even in express edition :).
I have a SQL Server 2008 R2 column containing a string which I need to split by a comma. I have seen many answers on StackOverflow but none of them works in R2. I have made sure I have select permissions on any split function examples. Any help greatly appreciated.
I've used this SQL before which may work for you:-
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE CHARINDEX(',', #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
and to use it:-
SELECT * FROM dbo.splitstring('91,12,65,78,56,789')
Instead of recursive CTEs and while loops, has anyone considered a more set-based approach? Note that this function was written for the question, which was based on SQL Server 2008 and comma as the delimiter. In SQL Server 2016 and above (and in compatibility level 130 and above), STRING_SPLIT() is a better option.
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE
AS
RETURN ( SELECT [Value] FROM
(
SELECT [Value] = LTRIM(RTRIM(SUBSTRING(#List, [Number],
CHARINDEX(#Delim, #List + #Delim, [Number]) - [Number])))
FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
FROM sys.all_columns) AS x WHERE Number <= LEN(#List)
AND SUBSTRING(#Delim + #List, [Number], DATALENGTH(#Delim)/2) = #Delim
) AS y
);
GO
If you want to avoid the limitation of the length of the string being <= the number of rows in sys.all_columns (9,980 in model in SQL Server 2017; much higher in your own user databases), you can use other approaches for deriving the numbers, such as building your own table of numbers. You could also use a recursive CTE in cases where you can't use system tables or create your own:
CREATE FUNCTION dbo.SplitString
(
#List nvarchar(max),
#Delim nvarchar(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN ( WITH n(n) AS (SELECT 1 UNION ALL SELECT n+1
FROM n WHERE n <= LEN(#List))
SELECT [Value] = SUBSTRING(#List, n,
CHARINDEX(#Delim, #List + #Delim, n) - n)
FROM n WHERE n <= LEN(#List)
AND SUBSTRING(#Delim + #List, n, DATALENGTH(#Delim)/2) = #Delim
);
GO
But you'll have to append OPTION (MAXRECURSION 0) (or MAXRECURSION <longest possible string length if < 32768>) to the outer query in order to avoid errors with recursion for strings > 100 characters. If that is also not a good alternative then see this answer as pointed out in the comments, or this answer if you need an ordered split string function.
(Also, the delimiter will have to be NCHAR(<=1228). Still researching why.)
More on split functions, why (and proof that) while loops and recursive CTEs don't scale, and better alternatives, if you're splitting strings coming from the application layer:
Splitting strings
Finally the wait is over in SQL Server 2016 they have introduced Split string function : STRING_SPLIT
select * From STRING_SPLIT ('a,b', ',') cs
All the other methods to split string like XML, Tally table, while loop, etc.. has been blown away by this STRING_SPLIT function.
Here is an excellent article with performance comparison : Performance Surprises and Assumptions : STRING_SPLIT
The easiest way to do this is by using XML format.
1. Converting string to rows without table
QUERY
DECLARE #String varchar(100) = 'String1,String2,String3'
-- To change ',' to any other delimeter, just change ',' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT CAST ('<M>' + REPLACE(#String, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x---------x
| Value |
x---------x
| String1 |
| String2 |
| String3 |
x---------x
2. Converting to rows from a table which have an ID for each CSV row
SOURCE TABLE
x-----x--------------------------x
| Id | Value |
x-----x--------------------------x
| 1 | String1,String2,String3 |
| 2 | String4,String5,String6 |
x-----x--------------------------x
QUERY
-- To change ',' to any other delimeter, just change ',' before '</M><M>' to your desired one
DECLARE #Delimiter CHAR = ','
SELECT ID,LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) 'Value'
FROM
(
SELECT ID,CAST ('<M>' + REPLACE(VALUE, #Delimiter, '</M><M>') + '</M>' AS XML) AS Data
FROM TABLENAME
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
RESULT
x-----x----------x
| Id | Value |
x-----x----------x
| 1 | String1 |
| 1 | String2 |
| 1 | String3 |
| 2 | String4 |
| 2 | String5 |
| 2 | String6 |
x-----x----------x
I needed a quick way to get rid of the +4 from a zip code.
UPDATE #Emails
SET ZIPCode = SUBSTRING(ZIPCode, 1, (CHARINDEX('-', ZIPCODE)-1))
WHERE ZIPCode LIKE '%-%'
No proc... no UDF... just one tight little inline command that does what it must. Not fancy, not elegant.
Change the delimiter as needed, etc, and it will work for anything.
if you replace
WHILE CHARINDEX(',', #stringToSplit) > 0
with
WHILE LEN(#stringToSplit) > 0
you can eliminate that last insert after the while loop!
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(',', #stringToSplit)
if #pos = 0
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
The often used approach with XML elements breaks in case of forbidden characters. This is an approach to use this method with any kind of character, even with the semicolon as delimiter.
The trick is, first to use SELECT SomeString AS [*] FOR XML PATH('') to get all forbidden characters properly escaped. That's the reason, why I replace the delimiter to a magic value to avoid troubles with ; as delimiter.
DECLARE #Dummy TABLE (ID INT, SomeTextToSplit NVARCHAR(MAX))
INSERT INTO #Dummy VALUES
(1,N'A&B;C;D;E, F')
,(2,N'"C" & ''D'';<C>;D;E, F');
DECLARE #Delimiter NVARCHAR(10)=';'; --special effort needed (due to entities coding with "&code;")!
WITH Casted AS
(
SELECT *
,CAST(N'<x>' + REPLACE((SELECT REPLACE(SomeTextToSplit,#Delimiter,N'§§Split$me$here§§') AS [*] FOR XML PATH('')),N'§§Split$me$here§§',N'</x><x>') + N'</x>' AS XML) AS SplitMe
FROM #Dummy
)
SELECT Casted.ID
,x.value(N'.',N'nvarchar(max)') AS Part
FROM Casted
CROSS APPLY SplitMe.nodes(N'/x') AS A(x)
The result
ID Part
1 A&B
1 C
1 D
1 E, F
2 "C" & 'D'
2 <C>
2 D
2 E, F
All the functions for string splitting that use some kind of Loop-ing (iterations) have bad performance. They should be replaced with set-based solution.
This code executes excellent.
CREATE FUNCTION dbo.SplitStrings
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(#List, #Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
I had to write something like this recently. Here's the solution I came up with. It's generalized for any delimiter string and I think it would perform slightly better:
CREATE FUNCTION [dbo].[SplitString]
( #string nvarchar(4000)
, #delim nvarchar(100) )
RETURNS
#result TABLE
( [Value] nvarchar(4000) NOT NULL
, [Index] int NOT NULL )
AS
BEGIN
DECLARE #str nvarchar(4000)
, #pos int
, #prv int = 1
SELECT #pos = CHARINDEX(#delim, #string)
WHILE #pos > 0
BEGIN
SELECT #str = SUBSTRING(#string, #prv, #pos - #prv)
INSERT INTO #result SELECT #str, #prv
SELECT #prv = #pos + LEN(#delim)
, #pos = CHARINDEX(#delim, #string, #pos + 1)
END
INSERT INTO #result SELECT SUBSTRING(#string, #prv, 4000), #prv
RETURN
END
If you need a quick ad-hoc solution for common cases with minimum code, then this recursive CTE two-liner will do it:
DECLARE #s VARCHAR(200) = ',1,2,,3,,,4,,,,5,'
;WITH
a AS (SELECT i=-1, j=0 UNION ALL SELECT j, CHARINDEX(',', #s, j + 1) FROM a WHERE j > i),
b AS (SELECT SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b
Either use this as a stand-alone statement or just add the above CTEs to any of your queries and you will be able to join the resulting table b with others for use in any further expressions.
edit (by Shnugo)
If you add a counter, you will get a position index together with the List:
DECLARE #s VARCHAR(200) = '1,2333,344,4'
;WITH
a AS (SELECT n=0, i=-1, j=0 UNION ALL SELECT n+1, j, CHARINDEX(',', #s, j+1) FROM a WHERE j > i),
b AS (SELECT n, SUBSTRING(#s, i+1, IIF(j>0, j, LEN(#s)+1)-i-1) s FROM a WHERE i >= 0)
SELECT * FROM b;
The result:
n s
1 1
2 2333
3 344
4 4
I take the xml route by wrapping the values into elements (M but anything works):
declare #v nvarchar(max) = '100,201,abcde'
select
a.value('.', 'varchar(max)')
from
(select cast('<M>' + REPLACE(#v, ',', '</M><M>') + '</M>' AS XML) as col) as A
CROSS APPLY A.col.nodes ('/M') AS Split(a)
A solution using a CTE, if anyone should need that (apart from me, who obviously did, that is why I wrote it).
declare #StringToSplit varchar(100) = 'Test1,Test2,Test3';
declare #SplitChar varchar(10) = ',';
with StringToSplit as (
select
ltrim( rtrim( substring( #StringToSplit, 1, charindex( #SplitChar, #StringToSplit ) - 1 ) ) ) Head
, substring( #StringToSplit, charindex( #SplitChar, #StringToSplit ) + 1, len( #StringToSplit ) ) Tail
union all
select
ltrim( rtrim( substring( Tail, 1, charindex( #SplitChar, Tail ) - 1 ) ) ) Head
, substring( Tail, charindex( #SplitChar, Tail ) + 1, len( Tail ) ) Tail
from StringToSplit
where charindex( #SplitChar, Tail ) > 0
union all
select
ltrim( rtrim( Tail ) ) Head
, '' Tail
from StringToSplit
where charindex( #SplitChar, Tail ) = 0
and len( Tail ) > 0
)
select Head from StringToSplit
This is more narrowly-tailored. When I do this I usually have a comma-delimited list of unique ids (INT or BIGINT), which I want to cast as a table to use as an inner join to another table that has a primary key of INT or BIGINT. I want an in-line table-valued function returned so that I have the most efficient join possible.
Sample usage would be:
DECLARE #IDs VARCHAR(1000);
SET #IDs = ',99,206,124,8967,1,7,3,45234,2,889,987979,';
SELECT me.Value
FROM dbo.MyEnum me
INNER JOIN dbo.GetIntIdsTableFromDelimitedString(#IDs) ids ON me.PrimaryKey = ids.ID
I stole the idea from http://sqlrecords.blogspot.com/2012/11/converting-delimited-list-to-table.html, changing it to be in-line table-valued and cast as INT.
create function dbo.GetIntIDTableFromDelimitedString
(
#IDs VARCHAR(1000) --this parameter must start and end with a comma, eg ',123,456,'
--all items in list must be perfectly formatted or function will error
)
RETURNS TABLE AS
RETURN
SELECT
CAST(SUBSTRING(#IDs,Nums.number + 1,CHARINDEX(',',#IDs,(Nums.number+2)) - Nums.number - 1) AS INT) AS ID
FROM
[master].[dbo].[spt_values] Nums
WHERE Nums.Type = 'P'
AND Nums.number BETWEEN 1 AND DATALENGTH(#IDs)
AND SUBSTRING(#IDs,Nums.number,1) = ','
AND CHARINDEX(',',#IDs,(Nums.number+1)) > Nums.number;
GO
There is a correct version on here but I thought it would be nice to add a little fault tolerance in case they have a trailing comma as well as make it so you could use it not as a function but as part of a larger piece of code. Just in case you're only using it once time and don't need a function. This is also for integers (which is what I needed it for) so you might have to change your data types.
DECLARE #StringToSeperate VARCHAR(10)
SET #StringToSeperate = '1,2,5'
--SELECT #StringToSeperate IDs INTO #Test
DROP TABLE #IDs
CREATE TABLE #IDs (ID int)
DECLARE #CommaSeperatedValue NVARCHAR(255) = ''
DECLARE #Position INT = LEN(#StringToSeperate)
--Add Each Value
WHILE CHARINDEX(',', #StringToSeperate) > 0
BEGIN
SELECT #Position = CHARINDEX(',', #StringToSeperate)
SELECT #CommaSeperatedValue = SUBSTRING(#StringToSeperate, 1, #Position-1)
INSERT INTO #IDs
SELECT #CommaSeperatedValue
SELECT #StringToSeperate = SUBSTRING(#StringToSeperate, #Position+1, LEN(#StringToSeperate)-#Position)
END
--Add Last Value
IF (LEN(LTRIM(RTRIM(#StringToSeperate)))>0)
BEGIN
INSERT INTO #IDs
SELECT SUBSTRING(#StringToSeperate, 1, #Position)
END
SELECT * FROM #IDs
I modified +Andy Robinson's function a little bit. Now you can select only required part from returning table:
CREATE FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(MAX) )
RETURNS
#returnList TABLE ([numOrder] [tinyint] , [Name] [nvarchar] (500)) AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
DECLARE #orderNum INT
SET #orderNum=0
WHILE CHARINDEX('.', #stringToSplit) > 0
BEGIN
SELECT #orderNum=#orderNum+1;
SELECT #pos = CHARINDEX('.', #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #orderNum,#name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
SELECT #orderNum=#orderNum+1;
INSERT INTO #returnList
SELECT #orderNum, #stringToSplit
RETURN
END
Usage:
SELECT Name FROM dbo.splitstring('ELIS.YD.CRP1.1.CBA.MDSP.T389.BT') WHERE numOrder=5
Simples
DECLARE #String varchar(100) = '11,21,84,85,87'
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (SELECT value FROM STRING_SPLIT(#String, ','))
-- EQUIVALENTE
SELECT * FROM TB_PAPEL WHERE CD_PAPEL IN (11,21,84,85,87)
here is a version that can split on a pattern using patindex, a simple adaptation of the post above. I had a case where I needed to split a string that contained multiple separator chars.
alter FUNCTION dbo.splitstring ( #stringToSplit VARCHAR(1000), #splitPattern varchar(10) )
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE PATINDEX(#splitPattern, #stringToSplit) > 0
BEGIN
SELECT #pos = PATINDEX(#splitPattern, #stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
select * from dbo.splitstring('stringa/stringb/x,y,z','%[/,]%');
result looks like this
stringa
stringb
x
y
z
Personnaly I use this function :
ALTER FUNCTION [dbo].[CUST_SplitString]
(
#String NVARCHAR(4000),
#Delimiter NCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
WITH Split(stpos,endpos)
AS(
SELECT 0 AS stpos, CHARINDEX(#Delimiter,#String) AS endpos
UNION ALL
SELECT endpos+1, CHARINDEX(#Delimiter,#String,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT 'Id' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
'Data' = SUBSTRING(#String,stpos,COALESCE(NULLIF(endpos,0),LEN(#String)+1)-stpos)
FROM Split
)
I have developed a double Splitter (Takes two split characters) as requested Here. Could be of some value in this thread seeing its the most referenced for queries relating to string splitting.
CREATE FUNCTION uft_DoubleSplitter
(
-- Add the parameters for the function here
#String VARCHAR(4000),
#Splitter1 CHAR,
#Splitter2 CHAR
)
RETURNS #Result TABLE (Id INT,MId INT,SValue VARCHAR(4000))
AS
BEGIN
DECLARE #FResult TABLE(Id INT IDENTITY(1, 1),
SValue VARCHAR(4000))
DECLARE #SResult TABLE(Id INT IDENTITY(1, 1),
MId INT,
SValue VARCHAR(4000))
SET #String = #String+#Splitter1
WHILE CHARINDEX(#Splitter1, #String) > 0
BEGIN
DECLARE #WorkingString VARCHAR(4000) = NULL
SET #WorkingString = SUBSTRING(#String, 1, CHARINDEX(#Splitter1, #String) - 1)
--Print #workingString
INSERT INTO #FResult
SELECT CASE
WHEN #WorkingString = '' THEN NULL
ELSE #WorkingString
END
SET #String = SUBSTRING(#String, LEN(#WorkingString) + 2, LEN(#String))
END
IF ISNULL(#Splitter2, '') != ''
BEGIN
DECLARE #OStartLoop INT
DECLARE #OEndLoop INT
SELECT #OStartLoop = MIN(Id),
#OEndLoop = MAX(Id)
FROM #FResult
WHILE #OStartLoop <= #OEndLoop
BEGIN
DECLARE #iString VARCHAR(4000)
DECLARE #iMId INT
SELECT #iString = SValue+#Splitter2,
#iMId = Id
FROM #FResult
WHERE Id = #OStartLoop
WHILE CHARINDEX(#Splitter2, #iString) > 0
BEGIN
DECLARE #iWorkingString VARCHAR(4000) = NULL
SET #IWorkingString = SUBSTRING(#iString, 1, CHARINDEX(#Splitter2, #iString) - 1)
INSERT INTO #SResult
SELECT #iMId,
CASE
WHEN #iWorkingString = '' THEN NULL
ELSE #iWorkingString
END
SET #iString = SUBSTRING(#iString, LEN(#iWorkingString) + 2, LEN(#iString))
END
SET #OStartLoop = #OStartLoop + 1
END
INSERT INTO #Result
SELECT MId AS PrimarySplitID,
ROW_NUMBER() OVER (PARTITION BY MId ORDER BY Mid, Id) AS SecondarySplitID ,
SValue
FROM #SResult
END
ELSE
BEGIN
INSERT INTO #Result
SELECT Id AS PrimarySplitID,
NULL AS SecondarySplitID,
SValue
FROM #FResult
END
RETURN
Usage:
--FirstSplit
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&',NULL)
--Second Split
SELECT * FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===','&','=')
Possible Usage (Get second value of each split):
SELECT fn.SValue
FROM uft_DoubleSplitter('ValueA=ValueB=ValueC=ValueD==ValueE&ValueA=ValueB=ValueC===ValueE&ValueA=ValueB==ValueD===', '&', '=')AS fn
WHERE fn.mid = 2
A recursive cte based solution
declare #T table (iden int identity, col1 varchar(100));
insert into #T(col1) values
('ROOT/South America/Lima/Test/Test2')
, ('ROOT/South America/Peru/Test/Test2')
, ('ROOT//South America/Venuzuala ')
, ('RtT/South America / ')
, ('ROOT/South Americas// ');
declare #split char(1) = '/';
select #split as split;
with cte as
( select t.iden, case when SUBSTRING(REVERSE(rtrim(t.col1)), 1, 1) = #split then LTRIM(RTRIM(t.col1)) else LTRIM(RTRIM(t.col1)) + #split end as col1, 0 as pos , 1 as cnt
from #T t
union all
select t.iden, t.col1 , charindex(#split, t.col1, t.pos + 1), cnt + 1
from cte t
where charindex(#split, t.col1, t.pos + 1) > 0
)
select t1.*, t2.pos, t2.cnt
, ltrim(rtrim(SUBSTRING(t1.col1, t1.pos+1, t2.pos-t1.pos-1))) as bingo
from cte t1
join cte t2
on t2.iden = t1.iden
and t2.cnt = t1.cnt+1
and t2.pos > t1.pos
order by t1.iden, t1.cnt;
With all due respect to #AviG this is the bug free version of function deviced by him to return all the tokens in full.
IF EXISTS (SELECT * FROM sys.objects WHERE type = 'TF' AND name = 'TF_SplitString')
DROP FUNCTION [dbo].[TF_SplitString]
GO
-- =============================================
-- Author: AviG
-- Amendments: Parameterize the delimeter and included the missing chars in last token - Gemunu Wickremasinghe
-- Description: Tabel valued function that Breaks the delimeted string by given delimeter and returns a tabel having split results
-- Usage
-- select * from [dbo].[TF_SplitString]('token1,token2,,,,,,,,token969',',')
-- 969 items should be returned
-- select * from [dbo].[TF_SplitString]('4672978261,4672978255',',')
-- 2 items should be returned
-- =============================================
CREATE FUNCTION dbo.TF_SplitString
( #stringToSplit VARCHAR(MAX) ,
#delimeter char = ','
)
RETURNS
#returnList TABLE ([Name] [nvarchar] (500))
AS
BEGIN
DECLARE #name NVARCHAR(255)
DECLARE #pos INT
WHILE LEN(#stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimeter, #stringToSplit)
if #pos = 0
BEGIN
SELECT #pos = LEN(#stringToSplit)
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos)
END
else
BEGIN
SELECT #name = SUBSTRING(#stringToSplit, 1, #pos-1)
END
INSERT INTO #returnList
SELECT #name
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, LEN(#stringToSplit)-#pos)
END
RETURN
END
This is based on Andy Robertson's answer, I needed a delimiter other than comma.
CREATE FUNCTION dbo.splitstring ( #stringToSplit nvarchar(MAX), #delim nvarchar(max))
RETURNS
#returnList TABLE ([value] [nvarchar] (MAX))
AS
BEGIN
DECLARE #value NVARCHAR(max)
DECLARE #pos INT
WHILE CHARINDEX(#delim, #stringToSplit) > 0
BEGIN
SELECT #pos = CHARINDEX(#delim, #stringToSplit)
SELECT #value = SUBSTRING(#stringToSplit, 1, #pos - 1)
INSERT INTO #returnList
SELECT #value
SELECT #stringToSplit = SUBSTRING(#stringToSplit, #pos + LEN(#delim), LEN(#stringToSplit) - #pos)
END
INSERT INTO #returnList
SELECT #stringToSplit
RETURN
END
GO
And to use it:
SELECT * FROM dbo.splitstring('test1 test2 test3', ' ');
(Tested on SQL Server 2008 R2)
EDIT: correct test code
ALTER FUNCTION [dbo].func_split_string
(
#input as varchar(max),
#delimiter as varchar(10) = ";"
)
RETURNS #result TABLE
(
id smallint identity(1,1),
csv_value varchar(max) not null
)
AS
BEGIN
DECLARE #pos AS INT;
DECLARE #string AS VARCHAR(MAX) = '';
WHILE LEN(#input) > 0
BEGIN
SELECT #pos = CHARINDEX(#delimiter,#input);
IF(#pos<=0)
select #pos = len(#input)
IF(#pos <> LEN(#input))
SELECT #string = SUBSTRING(#input, 1, #pos-1);
ELSE
SELECT #string = SUBSTRING(#input, 1, #pos);
INSERT INTO #result SELECT #string
SELECT #input = SUBSTRING(#input, #pos+len(#delimiter), LEN(#input)-#pos)
END
RETURN
END
You can Use this function:
CREATE FUNCTION SplitString
(
#Input NVARCHAR(MAX),
#Character CHAR(1)
)
RETURNS #Output TABLE (
Item NVARCHAR(1000)
)
AS
BEGIN
DECLARE #StartIndex INT, #EndIndex INT
SET #StartIndex = 1
IF SUBSTRING(#Input, LEN(#Input) - 1, LEN(#Input)) <> #Character
BEGIN
SET #Input = #Input + #Character
END
WHILE CHARINDEX(#Character, #Input) > 0
BEGIN
SET #EndIndex = CHARINDEX(#Character, #Input)
INSERT INTO #Output(Item)
SELECT SUBSTRING(#Input, #StartIndex, #EndIndex - 1)
SET #Input = SUBSTRING(#Input, #EndIndex + 1, LEN(#Input))
END
RETURN
END
GO
Here is an example that you can use as function or also you can put the same logic in procedure.
--SELECT * from [dbo].fn_SplitString ;
CREATE FUNCTION [dbo].[fn_SplitString]
(#CSV VARCHAR(MAX), #Delimeter VARCHAR(100) = ',')
RETURNS #retTable TABLE
(
[value] VARCHAR(MAX) NULL
)AS
BEGIN
DECLARE
#vCSV VARCHAR (MAX) = #CSV,
#vDelimeter VARCHAR (100) = #Delimeter;
IF #vDelimeter = ';'
BEGIN
SET #vCSV = REPLACE(#vCSV, ';', '~!~#~');
SET #vDelimeter = REPLACE(#vDelimeter, ';', '~!~#~');
END;
SET #vCSV = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(#vCSV, '&', '&'), '<', '<'), '>', '>'), '''', '''), '"', '"');
DECLARE #xml XML;
SET #xml = '<i>' + REPLACE(#vCSV, #vDelimeter, '</i><i>') + '</i>';
INSERT INTO #retTable
SELECT
x.i.value('.', 'varchar(max)') AS COLUMNNAME
FROM #xml.nodes('//i')AS x(i);
RETURN;
END;
/*
Answer to T-SQL split string
Based on answers from Andy Robinson and AviG
Enhanced functionality ref: LEN function not including trailing spaces in SQL Server
This 'file' should be valid as both a markdown file and an SQL file
*/
CREATE FUNCTION dbo.splitstring ( --CREATE OR ALTER
#stringToSplit NVARCHAR(MAX)
) RETURNS #returnList TABLE ([Item] NVARCHAR (MAX))
AS BEGIN
DECLARE #name NVARCHAR(MAX)
DECLARE #pos BIGINT
SET #stringToSplit = #stringToSplit + ',' -- this should allow entries that end with a `,` to have a blank value in that "column"
WHILE ((LEN(#stringToSplit+'_') > 1)) BEGIN -- `+'_'` gets around LEN trimming terminal spaces. See URL referenced above
SET #pos = COALESCE(NULLIF(CHARINDEX(',', #stringToSplit),0),LEN(#stringToSplit+'_')) -- COALESCE grabs first non-null value
SET #name = SUBSTRING(#stringToSplit, 1, #pos-1) --MAX size of string of type nvarchar is 4000
SET #stringToSplit = SUBSTRING(#stringToSplit, #pos+1, 4000) -- With SUBSTRING fn (MS web): "If start is greater than the number of characters in the value expression, a zero-length expression is returned."
INSERT INTO #returnList SELECT #name --additional debugging parameters below can be added
-- + ' pos:' + CAST(#pos as nvarchar) + ' remain:''' + #stringToSplit + '''(' + CAST(LEN(#stringToSplit+'_')-1 as nvarchar) + ')'
END
RETURN
END
GO
/*
Test cases: see URL referenced as "enhanced functionality" above
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,b')
Item | L
--- | ---
a | 1
| 0
b | 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,,')
Item | L
--- | ---
a | 1
| 0
| 0
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, ')
Item | L
--- | ---
a | 1
| 0
| 1
SELECT *,LEN(Item+'_')-1 'L' from splitstring('a,, c ')
Item | L
--- | ---
a | 1
| 0
c | 3
*/
The easiest way:
Install SQL Server 2016
Use STRING_SPLIT https://msdn.microsoft.com/en-us/library/mt684588.aspx
It works even in express edition :).
Is there an easy way of determining the most frequently occuring word in a column/field using T-SQL or VBA?
I am working on a fuzzy matching system for two given recordsets and would like to produce a matching string where the most frequently occuring words are removed. As the data is from a customer relations management database terms like "limited", "ltd", "plc" and "CORPORATION" would be removed.
Written for sql-server 2005+
Function to split:
create function f_split
(
#a varchar(max),
#delimiter varchar(20)
)
RETURNS #t TABLE(substr varchar(200))
as
begin
set #a = #a + #delimiter
;with a as
(
select cast(1 as bigint) f1, charindex(#delimiter, #a) f2
where len(#a) > 0
union all
select f2 + (len(#delimiter)) + 1, charindex(#delimiter, #a, f2+1)
from a
where f2 > 0
)
insert #t
select substring(#a, f1, f2 - f1) from a
where f1 < f2
return
end
go
Query:
--testdata
declare #table table(name varchar(50))
insert #table values('bla bla bla ltd')
insert #table values('bla plc ltd')
insert #table values('more text CORPORATION')
declare #matchlist table(name varchar(50), replacement varchar(50))
insert #matchlist values('ltd', 'limited')
insert #matchlist values('plc', 'limited')
insert #matchlist values('CORPORATION', 'limited')
--query
select coalesce(m.replacement, a.substr) name, count(*) count from #table p
cross apply
(
select substr from
dbo.f_split(p.name, ' ')
) a
left join
#matchlist m
on a.substr = m.name
group by coalesce(m.replacement, a.substr)
order by 2 desc
Result:
name count
---- -----
bla 4
limited 4
more 1
text 1
Hope this will be useful to you.
create table sometable
( id integer not null primary key identity
, mYWords text not null
);
insert into sometable (mYWords)
values ('a word that appears maximum number of times in a column')
insert into sometable (mYWords)
values ('Is it possible to get words from text columns in a sql server database')
insert into sometable (mYWords)
values ('This could solve my problem if reffered column contain only single word')
insert into sometable (mYWords)
values ('that''s going to require that you split out every word in the column individually')
insert into sometable (mYWords)
values ('the query will definitely not be easy to write')
insert into sometable (mYWords)
values ('Please read the sticky at the top of the board')
insert into sometable (mYWords)
values ('The physical order of data in a database has no meaning')
GO
CREATE TABLE WordList (
Word varchar(256)
, WordId int IDENTITY(1,1)
, Add_Dt datetime DEFAULT (GetDate()))
GO
CREATE UNIQUE INDEX UnqueWords_PK ON WordList(Word)
GO
CREATE PROC isp_INS_WORD_LIST
AS
BEGIN
SET NOCOUNT ON
DECLARE #Words INT, #Pos INT, #x Int, #str varchar(256)
, #word varchar(256), #start int, #end int, #exitstart int
SELECT #Words = 0, #Pos = 1, #x = -1, #Word = '', #start = 1
DECLARE Cur1 CURSOR FOR SELECT mYWords FROM sometable
OPEN Cur1
FETCH NEXT FROM Cur1 INTO #str
WHILE ##FETCH_STATUS = 0
BEGIN
WHILE (#x <> 0)
BEGIN
SET #x = CHARINDEX(' ', #str, #Pos)
IF #x <> 0
BEGIN
SET #end = #x - #start
SET #word = SUBSTRING(#str,#start,#end)
IF NOT EXISTS (SELECT * FROM WordList WHERE Word = #Word)
INSERT INTO WordList(Word) SELECT #word
-- SELECT #Word, ##ROWCOUNT,##ERROR
-- SELECT #x, #Word, #start, #end, #str
SET #exitstart = #start + #end + 1
SET #Pos = #x + 1
SET #start = #x + 1
SET #Words = #Words + 1
END
IF #x = 0
BEGIN
SET #word = SUBSTRING(#str,#exitstart,LEN(#str)-#exitstart+1)
IF NOT EXISTS (SELECT * FROM WordList WHERE Word = #Word)
INSERT INTO WordList(Word) SELECT #word
-- SELECT #Word, ##ROWCOUNT,##ERROR
-- SELECT #x, #Word, #exitstart, LEN(#str)-#exitstart, #str
END
END
FETCH NEXT FROM Cur1 INTO #str
SELECT #Words = 0, #Pos = 1, #x = -1, #Word = '', #start = 1
END
CLOSE Cur1
DEALLOCATE Cur1
SET NOCOUNT OFF
RETURN #Words
END
GO
EXEC isp_INS_WORD_LIST
GO
SELECT * FROM WordList ORDER BY Word
GO
DROP PROC isp_INS_WORD_LIST
DROP TABLE WordList, sometable
GO