I would like to remove specific characters from a column using regexp.
As an example, I have this:
declare #a nvarchar(50) = '(djfhsd-kjfhksd'
select Replace(#a, Substring(#a, PatIndex('%[^0-9.-]%', #a), 1), '')
But I want to remove just parenthesis (), spaces and dashes -
I don't have experience on regexp but I would like to remove them in one shot.
declare #a nvarchar(50) = '(djfhsd-kjfhksd'
Select #a =Replace(#a,RemChar,'')
From (Values ('('),
(')'),
('-'),
(' ')
) B (RemChar)
Select #a
Returns
djfhsdkjfhksd
For something straight forward. Otherwise you will need a UDF or a Cross Apply
Replace(Replace(Replace(Replace(YourCol,'(',''),')',''),'-',''),' ','')
Related
I have a string called Dats which is either of the general appearence xxxx-nnnnn (where x is a character, and n is a number) or nnn-nnnnnn.
I want to return only the numbers.
For this I've tried:
SELECT Distinct dats,
Left(SubString(artikelnr, PatIndex('%[0-9.-]%', artikelnr), 8000), PatIndex('%[^0-9.-]%', SubString(artikelnr, PatIndex('%[0-9.-]%', artikelnr), 8000) + 'X')-1)
FROM ThatDatabase
It is almost what I want. It removes the regular characters x, but it does not remove the unicode character -. How can I remove this as well? And also, it seems rather ineffective to have two PatIndex functions for every row, is there a way to avoid this? (This will be used on a big database where the result of this Query will be used as keys).
EDIT: Updated as a new database sometimes contained additional -'s or . together with -.
DECLARE #T as table
(
dats nvarchar(10)
)
INSERT INTO #T VALUES
('111BWA30'),
('115-200-11')
('115-22.4-1')
('10.000.22')
('600F-FFF200')
I wasn't sure if you wanted the numbers before the - char as well, but if you do, here is one way to do it:
Create and populate sample table (Please save us this step in your future questions)
DECLARE #T as table
(
dats nvarchar(10)
)
INSERT INTO #T VALUES
('abcde-1234'),
('23-343')
The query:
SELECT dats,
case when patindex('%[^0-9]-[0-9]%', dats) > 0 then
right(dats, len(dats) - patindex('%-[0-9]%', dats))
else
stuff(dats, charindex('-', dats), 1, '')
end As NumbersOnly
FROM #T
Results:
dats NumbersOnly
abcde-1234 1234
23-343 23343
If you want the only the numbers to the right of the - char, it's simpler:
SELECT dats,
right(dats, len(dats) - patindex('%-[0-9]%', dats)) As RightNumbersOnly
FROM #T
Results:
dats RightNumbersOnly
abcde-1234 1234
23-343 343
If you know which characters you need to remove then use REPLACE function
DECLARE #T as table
(
dats nvarchar(100)
)
INSERT INTO #T
VALUES
('111BWA30'),
('115-200-11'),
('115-22.4-1'),
('10.000.22'),
('600F-FFF200')
SELECT REPLACE(REPLACE(dats, '.', ''), '-', '')
FROM #T
The problem:
I have text data imported into the db with a lot of unwanted characters. I need to keep only 4 capital letter strings within the imported text string. Example:
1447;#MIBD (This is a nice name);#2056;#LKRE (Very nice name indeed)
this could be in one column in one row of my table. What I need to extract from the string is:
MIBD and LKRE
And the result should preferably be the desired strings separated with semicolons.
It should be applied to the whole column and I cannot know how many of these 4 upper case letter strings might appear in one row.
Went through all sorts of function like PATINDEX etc. but really do not know how to approach it. thanks for any help!
try this, it assumes that the four char code is always preceded by ;# . As PATINDEX is case insensitive I have added additional check to verify that all the four character are capital.
DECLARE #MyTable Table( ID INT, MyString VARCHAR(8000))
INSERT INTO #MyTable
VALUES
(1, '1447;#MIBD (This is a nice name);#2056;#LKRE (Very nice name indeed)')
,(2, ';#DBCC (This is a nice name);#2056;#LLC (Very nice name indeed) ;#ABCD')
,(3, ';#AaaA;#OPQR;1234 (and) ;#WXYZ')
,(4, ';#abc this empty string without any code')
;WITH CTE AS
(
SELECT ID
,SUBSTRING(MyString, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+2, 4) AS NewString
,STUFF(MyString, 1, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+6, '') AS MyString
FROM #MyTable m
WHERE PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString) > 0
UNION ALL
SELECT ID
,SUBSTRING(MyString, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+2, 4) AS NewString
,STUFF(MyString, 1, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+6, '') AS MyString
FROM CTE c
WHERE PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString) > 0
)
SELECT c.ID,
STUFF(( SELECT '; ' + NewString
FROM CTE c1
WHERE c1.ID = c.ID
AND ASCII(SUBSTRING(NewString, 1, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- first char
AND ASCII(SUBSTRING(NewString, 2, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- second char
AND ASCII(SUBSTRING(NewString, 3, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- third char
AND ASCII(SUBSTRING(NewString, 4, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- fourth char
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)') -- use the value clause to hanlde xml character issue like, &,",>,<
,1,1,'') AS CodeList
FROM CTE c
GROUP BY ID
OPTION (MAXRECURSION 0);
I came to something like this so far:
ALTER FUNCTION CleanData
(
-- Parameters here
#Text AS VARCHAR(4000)
)
RETURNS VARCHAR(4000)
AS
BEGIN
WHILE PATINDEX('%[0-9#;()]%', #Text) > 0
BEGIN
SET #Text = STUFF(#Text, PATINDEX('%[0-9#;()]%', #Text), 1, '')
END
RETURN #Text
END
But what I get is the Initials and the characters in parantheses as the PATINDEX cannot differ between the upper and lower case. Maybe it might be helpful for somebody else
How I can select
"ALT1" if value is "W61N03D20V0-WHIH-ALT1"
"ALT2" if for "W61N03D20V0-WHIH-ALT2"
"SW" for "W61N03D20V0-WHIH-SW"
"Default" for "W61N26D1YA1-VICU" (without prefix)
"Defailt" for "W61N27D21V2-AZTD"
In other words I'm looking for a way extract last part after second suffix, but if I have't second suffix - then default
Thanks for advice
Try it like this:
First you "split" the string on its minus signs with the XML trick.
Then you read the third node from you XML - voila!
CREATE TABLE #tbl(content VARCHAR(100));
INSERT INTO #tbl VALUES('W61N03D20V0-WHIH-ALT1')
,('W61N03D20V0-WHIH-SW')
,('W61N26D1YA1-VICU');
WITH SplittedAsXml AS
(
SELECT CAST('<x>' + REPLACE(content,'-','</x><x>') + '</x>' AS XML) AS Content
FROM #tbl
)
SELECT ISNULL(Content.value('/x[3]','varchar(max)'),'default') AS TheThirdPart
FROM SplittedAsXml;
DROP TABLE #tbl;
The result
ALT1
SW
default
Going this ways would also give you the chance to get the other parts in one go just querying /x[1] and /x[2] too
I did it using the built-in substring() function:
declare #str VARCHAR(40) = 'W61N03D20V0-WHIH-ALT1' -- also works for the other examples
declare #sep VARCHAR(1) = '-'
declare #middleToEnd VARCHAR(40) = substring(#str, charindex(#sep, #str) + 1, len(#str))
declare #pos INT = charindex(#sep, #middleToEnd)
declare #lastPart VARCHAR(40) =
CASE WHEN #pos = 0
THEN 'Default'
ELSE substring(#middleToEnd, #pos + 1, len(#middleToEnd))
END
select #lastPart
For best performance, you can solve it with this one-liner(calculation is one line)
SELECT
COALESCE(STUFF(col,1,NULLIF(CHARINDEX('-',col, CHARINDEX('-',col)+1), 0),''),'Default')
FROM (values
('W61N03D20V0-WHIH-ALT1'),('W61N03D20V0-WHIH-ALT2'),
('W61N03D20V0-WHIH-SW'),('W61N26D1YA1-VICU'),
('W61N27D21V2-AZTD')) x(col)
Result:
ALT1
ALT2
SW
Default
Default
If I understand what you are asking for, the following does what you need:
-- fake table
WITH SomeTable AS (
SELECT 'W61N03D20V0-WHIH-ALT1' AS Field1
UNION ALL
SELECT 'W61N03D20V0-WHIH-SW'
UNION ALL
SELECT 'W61N26D1YA1-VICU'
)
-- select
SELECT
CASE CHARINDEX('-WHIH-', Field1)
WHEN 0 THEN 'Default'
ELSE SUBSTRING(Field1, CHARINDEX('-WHIH-', Field1) + 6, LEN(Field1) - (CHARINDEX('-WHIH-', Field1) + 5))
END
FROM SomeTable
Use can use a CASE expression to check whether the string starts with W61N03D20V0-WHIH.
If it starts with it use a combination of RIGHT, REVERSE and CHARINDEX functions to get last part from the string, else Default.
Query
select case when [your_column_name] like 'W61N03D20V0-WHIH%'
then right([your_column_name], charindex('-', reverse([your_column_name]), 1) - 1)
else 'Default' end as new_column_name
from your_table_name;
SQl Fiddle demo
I have two type of strings in a column.
DECLARE #t table(parameter varchar(100))
INSERT #t values
('It contains eact01' ),
('It contains preact01')
I'm trying to get the strings that contain the word 'eact01'.
My problem is that using the following SELECT, I get also the variables that contain 'preact01', because it contain 'eact01'.
SELECT * FROM #t WHERE parameter LIKE '%eact01%'
How could I get only the row containing 'eact01'?
This should find all combinations, any character not being a letter or a number considerer this as a spit character or a new word.
SELECT *
FROM #t
WHERE
parameter like '%[^0-9a-z]eact01'
or parameter like '%[^0-9a-z]eact01[^0-9a-z]%'
or parameter like 'eact01[^0-9a-z]%'
or parameter = 'eact01'
Try this-
select *
from #t
where
parameter='eact01'
OR parameter like '%[^0-9a-z]eact01%'
OR parameter like 'eact01[^0-9a-z]%'
OR parameter like '%[^0-9a-z]eact01[^0-9a-z]%'
The easiest way is just add space:
SELECT * FROM #t WHERE parameter LIKE '% eact01%' or parameter LIKE 'eact01%'
You need a string splitter for this. Here is one taken from Aaron Bertrand's article:
CREATE FUNCTION dbo.SplitStrings_XML
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(#List, #Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
Then, you can use EXISTS:
SELECT *
FROM #t
WHERE EXISTS(
SELECT 1
FROM dbo.SplitStrings_XML(parameter, ' ') s
WHERE s.Item = 'eact01'
)
I have T-SQL code and am researching how to split
Aruba\abc
Spain\defg
New Zealand\qwerty
Antartica\sporty
Such that the column outputs
abc
defg
qwerty
sporty
So far, I found something like this,
http://www.aspsnippets.com/Articles/Split-function-in-SQL-Server-Example-Function-to-Split-Comma-separated-Delimited-string-in-SQL-Server-2005-2008-and-2012.aspx
But it splits column based on delimiters into new columns.
I wish to keep the information AFTER the delimiter \
Please advise
SELECT RIGHT(ColName , LEN(ColName) - CHARINDEX('\', ColName) )
FROM TABLEName
OR
SELECT PARSENAME(REPLACE(ColName , '\' , '.'),1)
FROM TableName
If you have it as a variable example:
DECLARE #str VARCHAR(50) = 'aruba\abc'
SELECT SUBSTRING(#str,CHARINDEX('\', #str)+1, LEN(#str) - CHARINDEX('\', #str) )
If you have it in a table example:
SELECT SUBSTRING(column1,CHARINDEX('\', column1)+1, LEN(column1) - CHARINDEX('\', column1) )
FROM table1
Here's a sqlfiddle of it working : http://sqlfiddle.com/#!6/85de5/1