Efficient Cleaning of Strings in a Table - sql-server

I'm currently working on a problem where certain characters need to be cleaned from strings that exist in a table. Normally I'd do a simple UPDATE with a replace, but in this case there are 32 different characters that need to be removed.
I've done some looking around and I can't find any great solutions for quickly cleaning strings that already exist in a table.
Things I've looked into:
Doing a series of nested replaces
This solution is doable, but for 32 different replaces it would require either some ugly code or hacky dynamic SQL to build a huge series of REPLACE calls.
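For a small number of characters the nested form looks roughly like this (a sketch, using the sample mappings from the repro script below):
-- With 32 characters this nesting gets unwieldy fast
UPDATE #CleanMe
SET DirtyString = REPLACE(REPLACE(REPLACE(REPLACE(DirtyString, 'A', '^'), 'B', '}'), 's', '5'), '-', ' ');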
PATINDEX and while loops
As seen in this answer, it is possible to mimic a kind of regex replace, but I'm working with a lot of data, so I'm hesitant to trust even the improved solution to run in a reasonable amount of time.
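For reference, the PATINDEX loop pattern looks roughly like this (a sketch that only removes characters; the character class is a placeholder):
DECLARE @s nvarchar(100) = N'AB-String-BA';
-- %[AB-]% is a sample pattern; a trailing hyphen inside [] is treated literally
WHILE PATINDEX(N'%[AB-]%', @s) > 0
    SET @s = STUFF(@s, PATINDEX(N'%[AB-]%', @s), 1, N'');
SELECT @s; -- 'String'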
Recursive CTEs
I tried a CTE approach to this problem, but it didn't run terribly fast once the number of rows got large.
For reference:
CREATE TABLE #BadChar(
    id int IDENTITY(1,1),
    badString nvarchar(10),
    replaceString nvarchar(10)
);
INSERT INTO #BadChar(badString, replaceString) SELECT 'A', '^';
INSERT INTO #BadChar(badString, replaceString) SELECT 'B', '}';
INSERT INTO #BadChar(badString, replaceString) SELECT 's', '5';
INSERT INTO #BadChar(badString, replaceString) SELECT '-', ' ';

CREATE TABLE #CleanMe(
    clean_id int IDENTITY(1,1),
    DirtyString nvarchar(20)
);

DECLARE @i int;
SET @i = 0;
WHILE @i < 100000 BEGIN
    INSERT INTO #CleanMe(DirtyString) SELECT 'AAAAA';
    INSERT INTO #CleanMe(DirtyString) SELECT 'BBBBB';
    INSERT INTO #CleanMe(DirtyString) SELECT 'AB-String-BA';
    SET @i = @i + 1;
END;

WITH FixedString (Step, String, cid) AS (
    SELECT 1 AS Step, REPLACE(DirtyString, badString, replaceString), clean_id
    FROM #BadChar, #CleanMe
    WHERE id = 1
    UNION ALL
    SELECT Step + 1, REPLACE(String, badString, replaceString), cid
    FROM FixedString AS T1
    JOIN #BadChar AS T2 ON T1.Step + 1 = T2.id
    JOIN #CleanMe AS T3 ON T1.cid = T3.clean_id
)
SELECT String FROM FixedString WHERE Step = (SELECT MAX(Step) FROM FixedString);

DROP TABLE #BadChar;
DROP TABLE #CleanMe;
Use a CLR
It seems like this is a common solution many people use, but the environment I'm in doesn't make this a very easy one to embark on.
Are there any other ways to go about this that I've overlooked? Or any improvements upon the methods I've already looked into?

Leveraging the idea from Alan Burstein's solution, you could do something like this if you wanted to hard-code the bad/replace strings. This would work for bad/replace strings longer than a single character as well.
CREATE FUNCTION [dbo].[CleanStringV1]
(
    @String nvarchar(4000)
)
RETURNS nvarchar(4000) WITH SCHEMABINDING AS
BEGIN
    SELECT @String = REPLACE
    (
        @String COLLATE Latin1_General_BIN,
        badString,
        replaceString
    )
    FROM
    (VALUES
        ('A', '^')
        , ('B', '}')
        , ('s', '5')
        , ('-', ' ')
    ) t(badString, replaceString);
    RETURN @String;
END;
Or, if you have a table containing the bad/replace strings, then
CREATE FUNCTION [dbo].[CleanStringV2]
(
    @String nvarchar(4000)
)
RETURNS nvarchar(4000) AS
BEGIN
    SELECT @String = REPLACE
    (
        @String COLLATE Latin1_General_BIN,
        badString,
        replaceString
    )
    FROM BadChar;
    RETURN @String;
END;
These are case-sensitive; you can remove the COLLATE clause if you want case-insensitive matching. I did a few small tests, and these were not much slower than nested REPLACE. The first one, with the hard-coded strings, was the faster of the two and was nearly as fast as nested REPLACE.
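For example, to clean a column in place with the first function (using the #CleanMe table from the question):
UPDATE #CleanMe
SET DirtyString = dbo.CleanStringV1(DirtyString);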

Related

Searching for multiple patterns in a string in T-SQL

In T-SQL, my dilemma is that I have to parse a potentially long string (up to 500 characters) for any of over 230 possible values and remove them from the string for reporting purposes. These values are a column in another table, and they're all uppercase and 4 characters long, with the exception of two that are 5 characters long.
Examples of these values are:
USFRI
PROME
AZCH
TXJS
NYDS
XVIV ...
Example of string before:
"Offered to XVIV and USFRI as back ups. No response as of yet."
Example of string after:
"Offered to and as back ups. No response as of yet."
Pretty sure it will have to be a UDF, but I'm unable to come up with anything other than stripping ALL the uppercase characters out of the string with PATINDEX, which is not the objective.
This is unavoidably kludgy, but one way is to split your string into rows; once you have a set of words the rest is easy: simply re-aggregate while ignoring the matching values*:
with t as (
select 'Offered to XVIV and USFRI as back ups. No response as of yet.' s
union select 'Another row AZCH and TXJS words.'
), v as (
select * from (values('USFRI'),('PROME'),('AZCH'),('TXJS'),('NYDS'),('XVIV'))v(v)
)
select t.s OriginalString, s.Removed
from t
cross apply (
select String_Agg(j.[value], ' ') within group(order by Convert(tinyint,j.[key])) Removed
from OpenJson(Concat('["',replace(s, ' ', '","'),'"]')) j
where not exists (select * from v where v.v = j.[value])
)s;
* Requires a fully-supported version of SQL Server (OPENJSON needs 2016+, STRING_AGG needs 2017+).
Build a function to do the cleaning of one sentence, then call that function from your query, something like this: SELECT Col1, dbo.fn_ReplaceValue(Col1) AS cleanValue, * FROM MySentencesTable. Your fn_ReplaceValue will be something like the code below. You could also create the table variable outside the function and pass it as a parameter to speed up the process, but this way it is all self-contained.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION fn_ReplaceValue(@sentence VARCHAR(500))
RETURNS VARCHAR(500)
AS
BEGIN
    DECLARE @ResultVar VARCHAR(500)
    DECLARE @allValues TABLE (rowID int, sValues VARCHAR(15))
    DECLARE @id INT = 0
    DECLARE @ReplaceVal VARCHAR(10)
    DECLARE @numberOfValues INT = (SELECT COUNT(*) FROM MyValuesTable)
    --Populate table variable with all values
    INSERT @allValues
    SELECT ROW_NUMBER() OVER(ORDER BY MyValuesCol) AS rowID, MyValuesCol
    FROM MyValuesTable
    SET @ResultVar = @sentence
    --Strictly less-than, so the last iteration does not read past the end
    --of the table variable (REPLACE with a NULL pattern would return NULL)
    WHILE (@id < @numberOfValues)
    BEGIN
        SET @id = @id + 1
        SET @ReplaceVal = (SELECT sValues FROM @allValues WHERE rowID = @id)
        SET @ResultVar = REPLACE(@ResultVar, @ReplaceVal, SPACE(0))
    END
    RETURN @ResultVar
END
GO
I suggest creating a table (either temporary or permanent), and loading these 230 string values into this table. Then use it in the following delete:
DELETE
FROM yourTable
WHERE col IN (SELECT col FROM tempTable);
If you just want to view your data sans these values, then use:
SELECT *
FROM yourTable
WHERE col NOT IN (SELECT col FROM tempTable);

How can I match all the rows from one table to a column in another table in t-sql?

I don't know if the method I have in mind (which the question relates to) is the best way to arrive at the desired result, so if anyone has a better idea I'm all ears. Here's what I'm trying to accomplish. Given a space-delimited set of search terms, I want to search a particular column, with consistent results regardless of the order of the terms. So e.g. "brown dog" and "dog brown" should return similar results. I have a table-valued function that takes a string and a delimiter and returns the split string as a single-column table.
CREATE FUNCTION dbo.Split
(
    @char_array varchar(500), @delimiter char(1)
)
RETURNS
@parsed_array table
(
    Parsed varchar(50)
)
AS
BEGIN
    DECLARE @parsed varchar(50), @pos int
    SET @char_array = LTRIM(RTRIM(@char_array)) + @delimiter
    SET @pos = CHARINDEX(@delimiter, @char_array, 1)
    IF REPLACE(@char_array, @delimiter, '') <> ''
    BEGIN
        WHILE @pos > 0
        BEGIN
            SET @parsed = LTRIM(RTRIM(LEFT(@char_array, @pos - 1)))
            IF @parsed <> ''
            BEGIN
                INSERT INTO @parsed_array (Parsed)
                VALUES (@parsed)
            END
            SET @char_array = RIGHT(@char_array, LEN(@char_array) - @pos)
            SET @pos = CHARINDEX(@delimiter, @char_array, 1)
        END
    END
    RETURN
END
GO
The column I want to search is a varchar(2000). Right now I'm searching it like so:
SELECT ID FROM [Table] WHERE Description LIKE '%' + @searchTerm + '%'
But that doesn't return consistent results given the scenario I described above where multiple terms are in a different order. Is it possible to somehow join the table from the function to the column and get the desired result? Note the results should contain all rows where the Description contains all search terms in any order.
To answer your question just from a SQL perspective: yes, you can join the function to the table (I've gone via a temp table, though) and use a HAVING clause to make sure all search terms are included. Something like this:
SELECT
    f.Parsed
INTO
    #s
FROM
    dbo.Split(@terms, ' ') f

DECLARE @c INT = (SELECT COUNT(*) FROM #s)

SELECT
    t.ID
FROM
    [Table] t
    INNER JOIN #s s ON t.Description LIKE '%' + s.Parsed + '%'
GROUP BY
    t.ID
HAVING
    COUNT(*) = @c
However, as comments on the question allude, this approach to searching leaves a lot of open questions. If you can, take a look at SQL Server Full-Text Search: more effort to set up, but much better results.
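For a flavor of that, a sketch (it assumes a full-text index has already been created on Description):
SELECT ID
FROM [Table]
WHERE CONTAINS(Description, '"brown" AND "dog"');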

How can I complete this Excel function in SQL Server?

I have approximately 30,000 records where I need to split the Description field and so far I can only seem to achieve this in Excel. An example Description would be:
1USBCP 2RJ45C6 1DVI 1DP 3MD 3MLP HANDS
Below is my Excel function:
=TRIM(MID(SUBSTITUTE($G309," ",REPT(" ",LEN($G309))),((COLUMNS($G309:G309)-1)*LEN($G309))+1,LEN($G309)))
This is then dragged across ten Excel columns, and splits the description field at each space.
I have seen many questions asked about splitting a string in SQL but they only seem to cover one space, not multiple spaces.
There is no easy function in SQL Server to split strings; at least, not one I know of. I usually use a trick that I found somewhere on the Internet some time ago, modified here to fit your example.
The trick is that first we figure out how many columns we need. We can do that by counting the delimiters in the string: the length of the string minus the length of the string with the delimiters removed.
After that, for each string we find the start and end position of each word. At the end we simply cut the string by start and end position and assign the pieces to columns. The details are in the query. Have fun!
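To illustrate the counting step on its own (a minimal sketch):
-- pieces = number of delimiters + 1; here the delimiter is a space
SELECT LEN('1USBCP 2RJ45C6 1DVI') - LEN(REPLACE('1USBCP 2RJ45C6 1DVI', ' ', '')) + 1 AS pieces; -- 3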
CREATE TABLE test(id int, data varchar(100))
INSERT INTO test VALUES (1,'1USBCP 2RJ45C6 1DVI 1DP 3MD 3MLP HANDS')
INSERT INTO test VALUES (2,'Shorter one')

DECLARE @pivot varchar(8000)
DECLARE @select varchar(8000)

-- Build the column list [col1],[col2],... from the maximum number of spaces
SELECT
    @pivot = coalesce(@pivot+',','')+'[col'+cast(number+1 as varchar(10))+']'
FROM
    master..spt_values where type='p' and
    number <= (SELECT max(len(data)-len(replace(data,' ',''))) FROM test)

SELECT
    @select='
select p.*
from (
    select
        id, substring(data, start+2, endPos-start-2) as token,
        ''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n
    from (
        select
            id, data, n as start, charindex('' '',data,n+2) endPos
        from (select number as n from master..spt_values where type=''p'') num
        cross join
        (
            select
                id, '' '' + data + '' '' as data
            from
                test
        ) m
        where n < len(data)-1
        and substring(data,n+1,1) = '' '') as data
) pvt
Pivot ( max(token) for n in ('+@pivot+'))p'
EXEC(@select)
Here you can find an example in SQL Fiddle.
I didn't notice that you want to get rid of multiple blank spaces.
To do that, first create a function that prepares your data:
CREATE FUNCTION dbo.[fnRemoveExtraSpaces] (@Number AS varchar(1000))
Returns Varchar(1000)
As
Begin
    Declare @n int -- position counter
    Declare @old char(1)
    Set @n = 1
    --Loop over the value, removing any space that follows another space
    While @n <= Len(@Number)
    BEGIN
        If Substring(@Number, @n, 1) = ' ' AND @old = ' '
        BEGIN
            Select @Number = Stuff(@Number, @n, 1, '')
        END
        Else
        BEGIN
            SET @old = Substring(@Number, @n, 1)
            Set @n = @n + 1
        END
    END
    Return @Number
END
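A quick sanity check (a sketch):
SELECT dbo.fnRemoveExtraSpaces('1USBCP  2RJ45C6   1DVI'); -- '1USBCP 2RJ45C6 1DVI'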
After that, use this new version of the query, which strips the extra spaces first:
DECLARE @pivot varchar(8000)
DECLARE @select varchar(8000)

SELECT
    @pivot = coalesce(@pivot+',','')+'[col'+cast(number+1 as varchar(10))+']'
FROM
    master..spt_values where type='p' and
    number <= (SELECT max(len(dbo.fnRemoveExtraSpaces(data))-len(replace(dbo.fnRemoveExtraSpaces(data),' ',''))) FROM test)

SELECT
    @select='
select p.*
from (
    select
        id, substring(data, start+2, endPos-start-2) as token,
        ''col''+cast(row_number() over(partition by id order by start) as varchar(10)) as n
    from (
        select
            id, data, n as start, charindex('' '',data,n+2) endPos
        from (select number as n from master..spt_values where type=''p'') num
        cross join
        (
            select
                id, '' '' + dbo.fnRemoveExtraSpaces(data) + '' '' as data
            from
                test
        ) m
        where n < len(data)-1
        and substring(data,n+1,1) = '' '') as data
) pvt
Pivot ( max(token) for n in ('+@pivot+'))p'
EXEC(@select)
I am probably not understanding your question, but everything you are doing in that formula can be done almost exactly the same way in SQL. I see someone has already answered, but to my mind, why do all that when you can do this? I might be wrong, but here goes.
declare @test as varchar(100)
set @test='abcd1234567'
select right(@test,2)
    , left(@test,2)
    , len(@test)
    , case when len(@test)%2>0
        then left(right(@test,round(len(@test)/2,0)+1),1)
        else left(right(@test,round(len(@test)/2,0)+1),2) end
Results
67 ab 11 2
So right, left, length and mid can all be achieved.
If the spaces are the "substring" dividers, then: I don't remember the actual syntax for do-while loops inside SQL selects, nor have I actually done that per se, but I don't see why it should not be possible. If it doesn't work, you need a temporary table, and if that does not work, you need a cursor. The cursor would be an external loop around this one to fetch and process a single string at a time. Or you can do something more clever; I am just a novice.
declare @x varchar(1)
declare @n integer
declare @i integer
declare @str varchar(100) -- this is your description; fetch it and assign it (if in a cursor, just use the column name)
set @x = null
set @n = 0
set @i = 0
while @n < len(@str)
begin
    while NOT @x = ' '
    begin
        set @x = left(right(@str, @n), 1)
        set @n = @n + 1
    end
    --insert into or update #temptable blablabla here.
Use @i and @n to locate the substring and then left(right()) it out, or you can SELECT it, but that gets messy if the number of substrings is large. Continue with:
    set @i = @n
    set @str = right(@str, @i) -- this includes the ' '; left() it out at will
end
Now, a final comment: there should perhaps be a third loop checking whether you are at the last "substring", because I see now that this code will throw an error when it gets to the end. Or "add" an empty space at the end of @str; that will also work. But my time is up. This is a suggestion at least.
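For what it's worth, here is a minimal runnable version of the same idea, sketched with CHARINDEX instead of nested character loops (consecutive spaces would yield empty tokens, so strip those first as above):
DECLARE @str varchar(100) = '1USBCP 2RJ45C6 1DVI 1DP';
DECLARE @pos int = CHARINDEX(' ', @str);
-- Peel off one token per iteration until no spaces remain
WHILE @pos > 0
BEGIN
    PRINT LEFT(@str, @pos - 1); -- the token; insert into a temp table in real use
    SET @str = SUBSTRING(@str, @pos + 1, LEN(@str));
    SET @pos = CHARINDEX(' ', @str);
END
PRINT @str; -- the last token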

TSQL - A Better INT Conversion Function

I'm wondering if there is a better way to 'parse' a Varchar to an Int in TSQL / SQL Server. I say 'parse' because I need something more robust than the CAST/CONVERT system funcs; it's particularly useful to return NULL when the parse fails, or even a 'default' value.
So here's the function I'm using now, originally obtained from someone's SQL blog (can't even remember specifically who)...
ALTER FUNCTION [dbo].[udf_ToNumber]
(
    @Str varchar(max)
)
RETURNS int
AS
BEGIN
    DECLARE @Result int
    SET @Str = LTRIM(RTRIM(@Str))
    IF (@Str = '' OR @Str IS NULL
        OR ISNUMERIC(@Str) = 0
        OR @Str LIKE '%[^-+ 0-9]%'
        OR @Str IN ('.', '-', '+', '^')
        )
        SET @Result = NULL
    ELSE
        IF (CAST(@Str AS NUMERIC(38,0)) NOT BETWEEN -2147483648. AND 2147483647.)
            SET @Result = NULL
        ELSE
            SET @Result = CAST(@Str AS int)
    RETURN @Result
END
(And you could add a line before the end, like "IF @Result IS NULL SET @Result = ...", or something like that.)
It's not very efficient: using it in a JOIN or WHERE-IN-SELECT (where, say, the left column is INT and the right is VARCHAR, and I try to parse the right) on any significantly large data set takes a lot longer than if I CAST the left (INT) column to VARCHAR first and then do the JOIN.
Anyway, I know 'ideally' that I shouldn't need to do this kind of thing in the first place if my tables/data-types are created & populated appropriately, but we all know the ideal world is very far from reality sometimes, so humor me. Thanks!
EDIT: SQL Server versions 2005 & 2008; boxes running 2005 will be upgraded soon so 2008-specific answers are fine.
In my experience, scalar UDFs don't perform well on larger data sets; as a workaround you can try one of two options (and I'm not sure either of them will work particularly well):
Embed the logic of the function in the join itself, like so:
SELECT columnlist
FROM a JOIN b ON a.[int] = (SELECT CASE WHEN ( b.[varchar] = ''
                                            OR b.[varchar] IS NULL
                                            OR ISNUMERIC(b.[varchar]) = 0
                                            OR b.[varchar] LIKE '%[^-+ 0-9]%'
                                            OR b.[varchar] IN ( '.', '-', '+', '^' )
                                            ) THEN NULL
                                       WHEN CAST(b.[varchar] AS NUMERIC(38, 0)) NOT BETWEEN -2147483648.
                                            AND 2147483647.
                                            THEN NULL
                                       ELSE CAST(b.[varchar] AS INT)
                                  END)
Change your user-defined function to be an inline table-valued function and use the CROSS APPLY syntax:
CREATE FUNCTION udf_ToInt
(
    @Str VARCHAR(MAX)
)
RETURNS TABLE
AS
RETURN
(
    SELECT CASE WHEN ( @Str = ''
                       OR @Str IS NULL
                       OR ISNUMERIC(@Str) = 0
                       OR @Str LIKE '%[^-+ 0-9]%'
                       OR @Str IN ( '.', '-', '+', '^' )
                     ) THEN NULL
                WHEN CAST(@Str AS NUMERIC(38, 0)) NOT BETWEEN -2147483648.
                     AND 2147483647.
                     THEN NULL
                ELSE CAST(@Str AS INT)
           END AS IntVal
)
GO
SELECT columnlist
FROM b
CROSS APPLY udf_ToInt(b.[varchar]) t
JOIN a ON t.IntVal = a.[int]
Probably easier to just convert to VARCHAR and compare :)
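As an aside: on SQL Server 2012 and later (newer than the versions in the question), the built-in TRY_CONVERT does the parse-or-NULL part natively:
SELECT TRY_CONVERT(int, '123');  -- 123
SELECT TRY_CONVERT(int, '123x'); -- NULL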

User defined function replacing WHERE col IN(...)

I have created a user-defined function to gain performance with queries containing 'WHERE col IN (...)', as in this case:
SELECT myCol1, myCol2
FROM myTable
WHERE myCol3 IN (100, 200, 300, ..., 4900, 5000);
The queries are generated from a web application and are in some cases much more complex.
The function definition looks like this:
CREATE FUNCTION [dbo].[udf_CSVtoIntTable]
(
    @CSV VARCHAR(MAX),
    @Delimiter CHAR(1) = ','
)
RETURNS
@Result TABLE
(
    [Value] INT
)
AS
BEGIN
    DECLARE @CurrStartPos SMALLINT;
    SET @CurrStartPos = 1;
    DECLARE @CurrEndPos SMALLINT;
    SET @CurrEndPos = 1;
    DECLARE @TotalLength SMALLINT;
    -- Remove space, tab, linefeed, carriage return
    SET @CSV = REPLACE(@CSV, ' ', '');
    SET @CSV = REPLACE(@CSV, CHAR(9), '');
    SET @CSV = REPLACE(@CSV, CHAR(10), '');
    SET @CSV = REPLACE(@CSV, CHAR(13), '');
    -- Add extra delimiter if needed
    IF NOT RIGHT(@CSV, 1) = @Delimiter
        SET @CSV = @CSV + @Delimiter;
    -- Get total string length
    SET @TotalLength = LEN(@CSV);
    WHILE @CurrStartPos < @TotalLength
    BEGIN
        SET @CurrEndPos = CHARINDEX(@Delimiter, @CSV, @CurrStartPos);
        INSERT INTO @Result
        VALUES (CAST(SUBSTRING(@CSV, @CurrStartPos, @CurrEndPos - @CurrStartPos) AS INT));
        SET @CurrStartPos = @CurrEndPos + 1;
    END
    RETURN
END
The function is intended to be used like this (or as an INNER JOIN):
SELECT myCol1, myCol2
FROM myTable
WHERE myCol3 IN (
    SELECT [Value]
    FROM dbo.udf_CSVtoIntTable('100, 200, 300, ..., 4900, 5000', ',')
);
Does anyone have any optimization ideas for my function, or other ways to improve performance in my case?
Are there any drawbacks that I have missed?
I am using MS SQL Server 2005 Std and .NET 2.0 framework.
I'm not sure of the performance increase, but I would use it as an inner join and get away from the inner select statement.
Using a UDF in a WHERE clause or (worse) a subquery is asking for trouble. The optimizer sometimes gets it right, but often gets it wrong and evaluates the function once for every row in your query, which you don't want.
If your parameters are static (they appear to be) and you can issue a multistatement batch, I'd load the results of your UDF into a table variable, then use a join against the table variable to do your filtering. This should work more reliably.
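A minimal sketch of that pattern, reusing the function from the question:
DECLARE @ids TABLE ([Value] INT PRIMARY KEY);

-- Evaluate the UDF once, up front
INSERT INTO @ids ([Value])
SELECT [Value] FROM dbo.udf_CSVtoIntTable('100,200,300', ',');

-- Then filter with a plain join
SELECT t.myCol1, t.myCol2
FROM myTable t
JOIN @ids i ON i.[Value] = t.myCol3;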
That loop will kill performance!
Create a table like this, with rows holding the values 1 to 8000 or so:
CREATE TABLE Numbers
(
Number int not null primary key
)
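One way to populate it (a sketch; any source of enough rows will do):
-- Fill Numbers with 1..8000, using spt_values merely as a row source
INSERT INTO Numbers (Number)
SELECT TOP (8000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM master..spt_values a CROSS JOIN master..spt_values b;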
Then use this function:
CREATE FUNCTION [dbo].[FN_ListAllToNumberTable]
(
    @SplitOn char(1) --REQUIRED, the character to split the @List string on
    ,@List varchar(8000) --REQUIRED, the list to split apart
)
RETURNS
@ParsedList table
(
    RowNumber int
    ,ListValue varchar(500)
)
AS
BEGIN
/*
DESCRIPTION: Takes the given @List string and splits it apart based on the given @SplitOn character.
A table is returned, one row per split item, with columns named "RowNumber" and "ListValue".
This function works for fixed or variable length items.
Empty and null items will be included in the result set.
PARAMETERS:
@List varchar(8000) --REQUIRED, the list to split apart
@SplitOn char(1) --REQUIRED, the character to split the @List string on
RETURN VALUES:
a table, one row per item in the list, with columns named "RowNumber" and "ListValue"
TEST WITH:
----------
SELECT * FROM dbo.FN_ListAllToNumberTable(',','1,12,123,1234,54321,6,A,*,|||,,,,B')

DECLARE @InputList varchar(200)
SET @InputList='17;184;75;495'
SELECT
    'well formed list',LEFT(@InputList,40) AS InputList,h.Name
FROM Employee h
    INNER JOIN dbo.FN_ListAllToNumberTable(';',@InputList) dt ON h.EmployeeID=dt.ListValue
WHERE dt.ListValue IS NOT NULL

SET @InputList='17;;;184;75;495;;;'
SELECT
    'poorly formed list join',LEFT(@InputList,40) AS InputList,h.Name
FROM Employee h
    INNER JOIN dbo.FN_ListAllToNumberTable(';',@InputList) dt ON h.EmployeeID=dt.ListValue

SELECT
    'poorly formed list',LEFT(@InputList,40) AS InputList, ListValue
FROM dbo.FN_ListAllToNumberTable(';',@InputList)
*/
/*this will return empty rows, and row numbers*/
INSERT INTO @ParsedList
    (RowNumber,ListValue)
SELECT
    ROW_NUMBER() OVER(ORDER BY number) AS RowNumber
    ,LTRIM(RTRIM(SUBSTRING(ListValue, number+1, CHARINDEX(@SplitOn, ListValue, number+1)-number-1))) AS ListValue
FROM (
    SELECT @SplitOn + @List + @SplitOn AS ListValue
) AS InnerQuery
    INNER JOIN Numbers n ON n.Number < LEN(InnerQuery.ListValue)
WHERE SUBSTRING(ListValue, number, 1) = @SplitOn
RETURN
END /*Function FN_ListAllToNumberTable*/
I have other versions that do not return empty or null rows, ones that return just the item and not the row number, etc. Look in the header comment to see how to use this as part of a JOIN, which is much faster than in a WHERE clause.
The CLR solution did not give me good performance, so I will use a recursive query instead. Here is the definition of the function I will use (mostly based on Erland Sommarskog's examples):
CREATE FUNCTION [dbo].[priudf_CSVtoIntTable]
(
    @CSV VARCHAR(MAX),
    @Delimiter CHAR(1) = ','
)
RETURNS
@Result TABLE
(
    [Value] INT
)
AS
BEGIN
    -- Remove space, tab, linefeed, carriage return
    SET @CSV = REPLACE(@CSV, ' ', '');
    SET @CSV = REPLACE(@CSV, CHAR(9), '');
    SET @CSV = REPLACE(@CSV, CHAR(10), '');
    SET @CSV = REPLACE(@CSV, CHAR(13), '');
    -- Walk the string recursively, one delimiter position per level
    WITH csvtbl(start, stop) AS
    (
        SELECT start = CONVERT(BIGINT, 1),
               stop = CHARINDEX(@Delimiter, @CSV + @Delimiter)
        UNION ALL
        SELECT start = stop + 1,
               stop = CHARINDEX(@Delimiter, @CSV + @Delimiter, stop + 1)
        FROM csvtbl
        WHERE stop > 0
    )
    INSERT INTO @Result
    SELECT CAST(SUBSTRING(@CSV, start, CASE WHEN stop > 0 THEN stop - start ELSE 0 END) AS INT) AS [Value]
    FROM csvtbl
    WHERE stop > 0
    OPTION (MAXRECURSION 1000)
    RETURN
END
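It is used the same way as before, for example:
SELECT myCol1, myCol2
FROM myTable
WHERE myCol3 IN (
    SELECT [Value]
    FROM dbo.priudf_CSVtoIntTable('100,200,300', ',')
);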
Thanks for the input. I have to admit that I did some poor research before I started my work. I found that Erland Sommarskog has written a lot about this problem on his webpage; after your responses and after reading his page, I decided I would try a CLR function to solve this.
I tried a recursive query; this resulted in good performance, but I will try a CLR function anyway.
