Using SELECT result as string_pattern in REPLACE function - sql-server

On SQL Server 2019, I would like to write a Scalar-value Function returning a string stripped from a series of possible substrings.
Assuming all of the substrings I want to replace with '' are in table MY_TABLE, in column MY_SUBSTRINGS, what would be the most efficient way to use the results of SELECT MY_SUBSTRINGS FROM MY_TABLE as string patterns in a REPLACE function to strip #MyString from the substrings.
Should I store the result of the SELECT in a table variable and loop through each of the substrings and call REPLACE for each substring possibilities or is there a better way to do this?

If I understand your question:
Here is a little demonstration how you can replace multiple values
Let's pretend the table variable #Map is an actual table
Example
Declare #Map Table (sFrom varchar(100))
Insert Into #Map values
('some')
,('begin')
,('curDateTime')
Declare #S varchar(max) = 'This is some string to begin testing [curDateTime]'
Select #S=replace(#S,sFrom,'')
From (Select top 1000 * From #Map Order By len(sFrom) Desc) A
Select ltrim(rtrim(replace(replace(replace(#S,' ','†‡'),'‡†',''),'†‡',' ')))
Returns
This is string to testing []
Note: The final Select ltrim(rtrim(...)) strips any number of duplicating spaces. This is optional

Related

REPLACE function does not replace desired character string

I want to replace all occurrences of a particular single character string (eg.:'^'or ',') when creating a view that is based on a single table. But id does not replace the desired single character in all the the data rows. I know it when I query the newly created view. All fields have varchar datatype.
This is a specific a example where the desire string does not get replaced MAINTENANCE¿ENHANCED
I tried the following and none worked.
SELECT REPLACE('MAINTENANCE¿ENHANCED',',','')
SELECT REPLACE('MAINTENANCE¿ENHANCED',char(33),'')
SELECT REPLACE(N'MAINTENANCE¿ENHANCED',',','')
SELECT REPLACE('MAINTENANCE¿ENHANCED',N',','')
SELECT REPLACE(CAST('MAINTENANCE¿ENHANCED' as NVARCHAR(50)),N',','')
SELECT REPLACE(CAST('MAINTENANCE¿ENHANCED' as VARCHAR(50)),N',','')
SELECT REPLACE(TRY_CAST('MAINTENANCE¿ENHANCED' as VARCHAR(50)),N',','')
SELECT REPLACE(CONVERT(VARCHAR(50),'MAINTENANCE¿ENHANCED'), N',','')
Also I performed simple test I copied the comma from the string from where I need it to be replaced (see my code below)
if ',' = '‚' print 1 -- DOES NOT return TRUE. 1st comma is the one I typed in the second argument of the REPLACE function, the 2nd comma is the one copied from the string above.
if ',' = ',' print 1 -- RETURNs TRUE. Both of the commas that I typed in the second argument of the REPLACE function.
Apparently the issue is with my comma in the data source which is being treated as equally. Though the functions below shows that both are varchar. ( https://blog.sqlauthority.com/2013/12/15/sql-server-how-to-identify-datatypes-and-properties-of-variable )
**-- comma from the string**
DECLARE #myVar VARCHAR(100)
SET #myVar = '‚'
SELECT SQL_VARIANT_PROPERTY(#myVar,'BaseType') BaseType,
SQL_VARIANT_PROPERTY(#myVar,'Precision') Precisions,
SQL_VARIANT_PROPERTY(#myVar,'Scale') Scale,
SQL_VARIANT_PROPERTY(#myVar,'TotalBytes') TotalBytes,
SQL_VARIANT_PROPERTY(#myVar,'Collation') Collation,
SQL_VARIANT_PROPERTY(#myVar,'MaxLength') MaxLengths
--**regular comma**
SET #myVar = ','
SELECT SQL_VARIANT_PROPERTY(#myVar,'BaseType') BaseType,
SQL_VARIANT_PROPERTY(#myVar,'Precision') Precisions,
SQL_VARIANT_PROPERTY(#myVar,'Scale') Scale,
SQL_VARIANT_PROPERTY(#myVar,'TotalBytes') TotalBytes,
SQL_VARIANT_PROPERTY(#myVar,'Collation') Collation,
SQL_VARIANT_PROPERTY(#myVar,'MaxLength') MaxLengths
Partially this can be resolved using this code below
select Stuff('MAINTENANCE¿ENHANCED', PatIndex('%[^a-z0-9]%', 'MAINTENANCE¿ENHANCED'), 1, '')
OUTPUT
-- the comma is replaced. That is what is expected.
MAINTENANCEÿENHANCED
But it does not work in I have more than 1 comma regardless if I copy it from the data source or type it in myself.
('‚MAINTENANCE¿ENHANCED')
select Stuff('‚MAINTENANCE¿ENHANCED', PatIndex('%[^a-z0-9]%', '‚MAINTENANCE¿ENHANCED'), 1, '')
select REPLACE(Stuff('‚MAINTENANCE¿ENHANCED', PatIndex('%[^a-z0-9]%', '‚MAINTENANCE¿ENHANCED'), 1, ''),',','')
OUTPUT
-- the comma is back again. The is the Issues. Only one (first) comma is replaced.
AINTENANCE¿ENHANCED
P.S.
Please refer to my answer below where I resolved all the above described issues except that I need to figure out how to preserver from removal special characters like question marks, parenthetic, etc.
PARTIALLY this can be resolved using this code below that I got from here
use MyDB;
go
drop function if exists [dbo].[RemoveNonAlphaCharacters]
go
Create Function [dbo].[RemoveNonAlphaCharacters](#Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare #KeepValues as varchar(50)
Set #KeepValues = '%[^ a-z0-9]%'
While PatIndex(#KeepValues, #Temp) > 0
Set #Temp = Stuff(#Temp, PatIndex(#KeepValues, #Temp), 1, '')
Return #Temp
End
SELECT MyDB.dbo.RemoveNonAlphaCharacters(', (/!:\£&^?-:;|\)?%$"é觰àçò*MAIN,2TENANCE¿ENHANCED 123 asds %[ ..')
I got this from
How to strip all non-alphabetic characters from string in SQL Server?
OUTPUT
éèàçòMAIN2TENANCEÃÂENHANCED 123 asds
The issues here that it removes all non-alphabetic string characters such as &^?-:;|)? ]% ;:_|!"
I could not fugure out how to pass regular expression to preserver all (except for comma which needs to be replaced) characters in the printable section of the ASCII table (see example above and follow the links below)
https://www.rexegg.com/regex-quickstart.html
http://www.asciitable.com/

T-SQL: Splitting character values between non-consistent instances of delimiters

There is a string accompanying a value that I need to extract from a column. I can extract the value from most of the rows, but there are a few cases where the value has different properties. This is a simplified example of the problem;
IF OBJECT_ID('TEMPDB..#TABLE') IS NOT NULL
DROP TABLE #TABLE
CREATE TABLE #TABLE(
colSTRING NVARCHAR(MAX) NULL
);
INSERT INTO #TABLE (colSTRING)
VALUES (',SHOULD NOT BE STORED THIS WAY:22.67')
,(',SHOULD NOT BE STORED THIS WAY:46.32')
,(',SHOULD NOT BE STORED THIS WAY:23.45')
,(',SHOULD NOT BE STORED THIS WAY:66.67')
,(',SHOULD NOT BE STORED THIS WAY:22.35,ANOTHER BAD THING:OK')
;
SELECT * FROM #TABLE
Output:
Notice that there is a number at the end of the string to the right of the ':'. This is the number I need to extract.
The bottom row however shows that there is a second string entry in the same cell. I need to extract 22.35 from this cell while omitting the rest of the string.
This is what I have so far;
SELECT
(RIGHT(colSTRING,CHARINDEX(':',REVERSE(colSTRING))-1)) [STRING NUMBER]
FROM #TABLE
output:
This works for the other values in the table, but the bottom row does not extract the correct value. It takes the string to the right of the ':' of the second string entry.
Is there some way to use this logic on only the first occurrence of the ':'?
So this is how I solved the problem, thanks to #MartinSmith 's idea. I adjusted the example a bit to show how this interacts with a number with more than 2 digits (>=100.00).
IF OBJECT_ID('TEMPDB..#TABLE') IS NOT NULL
DROP TABLE #TABLE
CREATE TABLE #TABLE(
colSTRING NVARCHAR(MAX) NULL
);
INSERT INTO #TABLE (colSTRING)
VALUES (',SHOULD NOT BE STORED THIS WAY:22.67')
,(',SHOULD NOT BE STORED THIS WAY:46.32')
,(',SHOULD NOT BE STORED THIS WAY:23.45')
,(',SHOULD NOT BE STORED THIS WAY:766.67')
,(',SHOULD NOT BE STORED THIS WAY:22.35,ANOTHER BAD THING:OK')
;
SELECT * FROM #TABLE
Solution: In this case, every string entry always starts with a comma. I can use that information in a CASE statement. I make a column with populated entries for each case when there are numbers <100.00 or >=100.00
SELECT ISNULL(CASE WHEN [2DIGITS] LIKE ',%' THEN NULL ELSE [2DIGITS] END,[3DIGITS]) [FIXED]
FROM(
SELECT
(RIGHT(colSTRING,CHARINDEX(':',REVERSE(colSTRING))-1)) [STRING NUMBER]
,SUBSTRING(colSTRING,1 + PATINDEX('%:[0-9][0-9].[0-9][0-9]%', colSTRING),5) [2DIGITS]
,SUBSTRING(colSTRING,1 + PATINDEX('%:[0-9][0-9][0-9].[0-9][0-9]%', colSTRING),6) [3DIGITS]
FROM #TABLE
)A

"Create sql function , select english characters?"

I am looking for a function that selects English numbers and letters only:
Example:
TEKA תנור ביל דין in HLB-840 P-WH לבן
I want to run a function and get the following result:
TEKA HLB-840 P-WH
I'm using MS SQL Server 2012
What you really need here is regex replacement, which SQL Server does not support. Broadly speaking, you would want to find [^A-Za-z0-9 -]+\s* and then replace with empty string. Here is a demo showing that this works as expected:
Demo
This would output TEKA in HLB-840 P-WH for the input you provided. You might be able to do this in SQL Server using a regex package or UDF. Or, you could do this replacement outside of SQL using any number of tools which support regex (e.g. C#).
SQL-Server is not the right tool for this.
The following might work for you, but there is no guarantee:
declare #yourString NVARCHAR(MAX)=N'TEKA תנור ביל דין in HLB-840 P-WH לבן';
SELECT REPLACE(REPLACE(REPLACE(REPLACE(CAST(#yourString AS VARCHAR(MAX)),'?',''),' ','|~'),'~|',''),'|~',' ');
The idea in short:
A cast of NVARCHAR to VARCHAR will return all characters in your string, which are not known in the given collation, as question marks. The rest is replacements of question marks and multi-blanks.
If your string can include a questionmark, you can replace it first to a non-used character, which you re-replace at the end.
If you string might include either | or ~ you should use other characters for the replacements of multi-blanks.
You can influence this approach by specifying a specific collation, if some characters pass by...
there is no build in function for such purpose, but you can create your own function, should be something like this:
--create function (split string, and concatenate required)
CREATE FUNCTION dbo.CleanStringZZZ ( #string VARCHAR(100))
RETURNS VARCHAR(100)
BEGIN
DECLARE #B VARCHAR(100) = '';
WITH t --recursive part to create sequence 1,2,3... but will better to use existing table with index
AS
(
SELECT n = 1
UNION ALL
SELECT n = n+1 --
FROM t
WHERE n <= LEN(#string)
)
SELECT #B = #B+SUBSTRING(#string, t.n, 1)
FROM t
WHERE SUBSTRING(#string, t.n, 1) != '?' --this is just an example...
--WHERE ASCII(SUBSTRING(#string, t.n, 1)) BETWEEN 32 AND 127 --you can use something like this
ORDER BY t.n;
RETURN #B;
END;
and then you can use this function in your select statement:
SELECT dbo.CleanStringZZZ('TEKA תנור ביל דין in HLB-840 P-WH לבן');
create function dbo.AlphaNumericOnly(#string varchar(max))
returns varchar(max)
begin
While PatIndex('%[^a-z0-9]%', #string) > 0
Set #string = Stuff(#string, PatIndex('%[^a-z0-9]%', #string), 1, '')
return #string
end

Most effective way to check sub-string exists in comma-separated string in SQL Server

I have a comma-separated list column available which has values like
Product1, Product2, Product3
I need to search whether the given product name exists in this column.
I used this SQL and it is working fine.
Select *
from ProductsList
where productname like '%Product1%'
This query is working very slowly. Is there a more efficient way I can search for a product name in the comma-separated list to improve the performance of the query?
Please note I have to search comma separated list before performing any other select statements.
user defined functions for comma separation of the string
Create FUNCTION [dbo].[BreakStringIntoRows] (#CommadelimitedString varchar(max))
RETURNS #Result TABLE (Column1 VARCHAR(max))
AS
BEGIN
DECLARE #IntLocation INT
WHILE (CHARINDEX(',', #CommadelimitedString, 0) > 0)
BEGIN
SET #IntLocation = CHARINDEX(',', #CommadelimitedString, 0)
INSERT INTO #Result (Column1)
--LTRIM and RTRIM to ensure blank spaces are removed
SELECT RTRIM(LTRIM(SUBSTRING(#CommadelimitedString, 0, #IntLocation)))
SET #CommadelimitedString = STUFF(#CommadelimitedString, 1, #IntLocation, '')
END
INSERT INTO #Result (Column1)
SELECT RTRIM(LTRIM(#CommadelimitedString))--LTRIM and RTRIM to ensure blank spaces are removed
RETURN
END
Declare #productname Nvarchar(max)
set #productname='Product1,Product2,Product3'
select * from product where [productname] in(select * from [dbo].[![enter image description here][1]][1][BreakStringIntoRows](#productname))
Felix is right and the 'right answer' is to normalize your table. Although, maybe you have 500k lines of code that expect this column to exist as it is. So your next best (non-destructive) answer is:
Create a table to hold normalize data:
CREATE TABLE ProductsList2 (ProductId INT, ProductName VARCHAR)
Create a TRIGGER that on UPDATE/INSERT/DELETE maintains ProductList2 by splitting the string 'Product1,Product2,Product3' into three records.
Index your new table.
Query against your new table:
SELECT *
FROM ProductsList
WHERE ProductId IN (SELECT x.ProductId
FROM ProductsList2 x
WHERE x.ProductName = 'Product1')

What is the best way to concatonate three varchar fields, each of which is either null or contains a comma separated string?

We have a result set that has three fields and each of those fields is either null or contains a comma separated list of strings.
We need to combine all three into one comma separated list and eliminate duplicates.
What is the best way to do that?
I found a nice function that can split a string and return a table:
T-SQL split string
I tried to create a UDF that would take three varchar parameters and call that split string function three times, combine them into one table, and then use a FOR XML from there and return it as one comma separated string.
But SQL is complaining about having a SELECT in a function.
Here's an example using the SplitString function you referenced.
DECLARE
#X varchar(max) = 'A, C, F'
, #Y varchar(max) = null
, #Z varchar(max) = 'A, D, E, A'
;WITH SplitResults as
(
-- Note: the function does not remove leading spaces.
SELECT LTRIM([Name]) [Name] FROM SplitString(#X)
UNION
SELECT LTRIM([Name]) [Name] FROM SplitString(#Y)
UNION
SELECT LTRIM([Name]) [Name] FROM SplitString(#Z)
)
SELECT STUFF((
SELECT ', ' + [Name]
FROM SplitResults
FOR XML PATH(''), TYPE
-- Note: here we're pulling the value out in case any characters were escaped, ie. &
-- and then STUFF is removing the leading ,<space>
).value('.', 'nvarchar(max)'), 1, 2, '')
I would not store data as a comma separated string in a single field. Separate the string to a new table and combine it to a string again when you need to.
Finding duplicates and managing the data will also be much easier.
I've used this function before (I didn't write it, and unfortunately cannot remember where I found it) to split a string and add a key (in this case an int) to the data as a separate table, linking back to the original table's PK
CREATE FUNCTION SplitWithID (#id int, #sep VARCHAR(10), #s VARCHAR(MAX))
RETURNS #t TABLE
(
id int,
val VARCHAR(MAX)
)
AS
BEGIN
DECLARE #xml XML
SET #XML = N'<root><r>' + REPLACE(#s, #sep, '</r><r>') + '</r></root>'
INSERT INTO #t(id,val)
SELECT #id, r.value('.','VARCHAR(40)') as Item
FROM #xml.nodes('//root/r') AS RECORDS(r)
RETURN
END
GO
Once you have the data on separate rows you can use any duplicate removal technique to clean the data before applying a primary key to the table.

Resources