Stripping strings between commas in SQL Server

I know how to do part of this but not all of it. Say my table is named REV, the column is named DESCR, and it holds a value like
R&B , Semiprivate 2 Beds , Medical/Surgical/GYN
I use
SELECT DESCR, LEFT(DESCR, Charindex(',', DESCR)), SUBSTRING(DESCR, CHARINDEX(',', DESCR) + 1, LEN(DESCR)) from REV
With that select statement I get 'R&B ,' in one column and 'Semiprivate 2 Beds , Medical/Surgical/GYN' in another, but I don't know how to select the strings from the second comma onwards.
What I would like to return is 'R&B' in one column (without the comma), 'Semiprivate 2 Beds' in another column, 'Medical/Surgical/GYN' in a third, and so on.
Basically, select the text between the commas, and when there is no comma the result should be blank.

This should work:
SELECT
LEFT(DESCR, CHARINDEX(',', DESCR)-1),
SUBSTRING(DESCR, CHARINDEX(',', DESCR)+1, CHARINDEX(',', DESCR, CHARINDEX(',', DESCR)+1) - CHARINDEX(',', DESCR) -1 ),
RIGHT(DESCR, CHARINDEX(',', REVERSE(DESCR))-1)
FROM REV

This should work:
SELECT
LEFT(DESCR, CHARINDEX(',', DESCR)-1),
SUBSTRING(DESCR, CHARINDEX(',', DESCR)+1, LEN(DESCR)-CHARINDEX(',', DESCR)-CHARINDEX(',',REVERSE(DESCR ))),
RIGHT(DESCR, CHARINDEX(',', REVERSE(DESCR))-1)
FROM REV
Sample SQL Fiddle
This will split the string but leave blanks at the beginning and end of the parts; you can use LTRIM and RTRIM to trim them away.
There might be better ways to do this though; see the article Split strings the right way – or the next best way by Aaron Bertrand (which Andrew mentioned in a comment).
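Both queries above assume every row contains at least two commas; with fewer, the -1 arithmetic hands LEFT or SUBSTRING a negative length. A minimal sketch of one way to guard against that, assuming at most three parts are wanted, that a value without commas belongs in the first column, and that missing parts should come back as empty strings. The table variable @REV and the column aliases are illustrative only:

```sql
-- Hypothetical sample data standing in for the REV table.
DECLARE @REV TABLE (DESCR VARCHAR(200));
INSERT @REV VALUES
('R&B , Semiprivate 2 Beds , Medical/Surgical/GYN'),
('R&B , Semiprivate 2 Beds'),
('R&B');

-- Appending three commas guarantees that CHARINDEX always finds a delimiter,
-- so no length passed to SUBSTRING can go negative.
SELECT r.DESCR,
       LTRIM(RTRIM(SUBSTRING(p.s, 1, p1.pos - 1)))                   AS Part1,
       LTRIM(RTRIM(SUBSTRING(p.s, p1.pos + 1, p2.pos - p1.pos - 1))) AS Part2,
       LTRIM(RTRIM(SUBSTRING(p.s, p2.pos + 1, p3.pos - p2.pos - 1))) AS Part3
FROM @REV AS r
CROSS APPLY (SELECT r.DESCR + ',,,' AS s) AS p                        -- padded copy
CROSS APPLY (SELECT CHARINDEX(',', p.s) AS pos) AS p1                 -- first comma
CROSS APPLY (SELECT CHARINDEX(',', p.s, p1.pos + 1) AS pos) AS p2     -- second comma
CROSS APPLY (SELECT CHARINDEX(',', p.s, p2.pos + 1) AS pos) AS p3;    -- third comma
```

Anything after a third comma is ignored by this sketch, and a NULL DESCR yields NULL in every part.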

Related

Keep only desired characters and separate with semicolon in T-SQL

The problem:
I have text data imported into the db with a lot of unwanted characters. I need to keep only 4 capital letter strings within the imported text string. Example:
1447;#MIBD (This is a nice name);#2056;#LKRE (Very nice name indeed)
this could be in one column in one row of my table. What I need to extract from the string is:
MIBD and LKRE
And the result should preferably be the desired strings separated with semicolons.
It should be applied to the whole column and I cannot know how many of these 4 upper case letter strings might appear in one row.
I went through all sorts of functions like PATINDEX, but I really do not know how to approach it. Thanks for any help!
Try this. It assumes that the four-character code is always preceded by ;# . Because PATINDEX is case-insensitive here (under a case-insensitive collation), I have added an additional check to verify that all four characters are capitals.
DECLARE @MyTable TABLE (ID INT, MyString VARCHAR(8000))
INSERT INTO @MyTable
VALUES
(1, '1447;#MIBD (This is a nice name);#2056;#LKRE (Very nice name indeed)')
,(2, ';#DBCC (This is a nice name);#2056;#LLC (Very nice name indeed) ;#ABCD')
,(3, ';#AaaA;#OPQR;1234 (and) ;#WXYZ')
,(4, ';#abc this empty string without any code')
;WITH CTE AS
(
-- anchor member: the first ;#XXXX candidate in each row, plus the rest of the string
SELECT ID
,SUBSTRING(MyString, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+2, 4) AS NewString
-- +5 removes everything up to the end of the code, so a code that follows immediately is still found
,STUFF(MyString, 1, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+5, '') AS MyString
FROM @MyTable m
WHERE PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString) > 0
UNION ALL
-- recursive member: repeat on the remainder until no candidate is left
SELECT ID
,SUBSTRING(MyString, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+2, 4) AS NewString
,STUFF(MyString, 1, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString)+5, '') AS MyString
FROM CTE c
WHERE PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%',MyString) > 0
)
SELECT c.ID,
STUFF(( SELECT '; ' + NewString
FROM CTE c1
WHERE c1.ID = c.ID
AND ASCII(SUBSTRING(NewString, 1, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- first char
AND ASCII(SUBSTRING(NewString, 2, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- second char
AND ASCII(SUBSTRING(NewString, 3, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- third char
AND ASCII(SUBSTRING(NewString, 4, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- fourth char
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)') -- use the value clause to handle XML character issues like &, ", >, <
,1,2,'') AS CodeList -- strip the leading '; '
FROM CTE c
GROUP BY ID
OPTION (MAXRECURSION 0);
I came to something like this so far:
ALTER FUNCTION CleanData
(
-- Parameters here
@Text AS VARCHAR(4000)
)
RETURNS VARCHAR(4000)
AS
BEGIN
WHILE PATINDEX('%[0-9#;()]%', @Text) > 0
BEGIN
SET @Text = STUFF(@Text, PATINDEX('%[0-9#;()]%', @Text), 1, '')
END
RETURN @Text
END
But what I get is the initials plus the characters inside the parentheses, because PATINDEX cannot distinguish between upper and lower case here. Maybe it will be helpful for somebody else.
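One possible way around the case-insensitivity mentioned in this thread is to force a binary collation for the comparison, so that the range [A-Z] covers uppercase letters only. A small sketch, assuming Latin1_General_BIN is an acceptable collation for the data; the variable @s is a made-up sample value:

```sql
-- Hypothetical sample: the first candidate has lowercase letters, the second is all capitals.
DECLARE @s VARCHAR(100) = ';#AaaA;#OPQR';

SELECT
    -- Uses the collation of the current database, so the result may be 1 (case-insensitive match).
    PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', @s) AS DefaultCollationPos,
    -- Binary collation: [A-Z] is compared by code point, so ';#AaaA' is skipped.
    -- BinaryCollationPos is 7 here, the start of ';#OPQR'.
    PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', @s COLLATE Latin1_General_BIN) AS BinaryCollationPos;
```

With the collation applied inside PATINDEX, a separate ASCII check on each character should no longer be necessary.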

sql to check string contains with where clause

In SQL Server I have the following strings:
DECLARE @str nvarchar(max);
set @str = 'Hello how are you doing today,Its Monday and 5 waiting days';
DECLARE @srch nvarchar(max);
set @srch = ' how,doing,monday,waiting';
Now I want to check whether @str contains any of the comma-separated strings in @srch.
I want to do it in SQL Server only.
Is it possible to write a query with an IN clause, something like
select from @str where _____ in (select * from CommaSplit(@srch))
where the CommaSplit function returns the comma-separated values of @srch as rows?
I don't want to use a cursor or any kind of loop, as the @srch value can be very long.
Thanks
You can use the same function to get the first string as rows:
select string from CommaSplit(@srch,'') where string in (select * from CommaSplit(@srch))
You can use the following common table expression query to split your string into parts. The cte will contain one record per phrase in @srch. In my example below, I show where in @str each search phrase is located. It returns 0 if it cannot locate a search phrase.
Note 1: it won't show the location twice if your search phrase is duplicated - you would need another CTE for that.
Note 2: I have to add a comma at the end of @srch to make my CTE work. You can do that inside the CTE if you prefer not to change the search string.
DECLARE @str nvarchar(max);
set @str = 'Hello how are you doing today,Its Monday and 5 waiting days';
DECLARE @srch nvarchar(max);
set @srch = 'how,doing,monday,waiting';
set @srch = @srch + ','
-- split the search string into one phrase per row
;with cte
as
(
select substring(@srch, 1, CHARINDEX(',', @srch, 1) - 1) as Phrase, CHARINDEX(',', @srch, 1) as Idx
union all
select substring(@srch, cte.Idx + 1, CHARINDEX(',', @srch, cte.Idx + 1) - cte.Idx - 1) as Phrase, CHARINDEX(',', @srch, cte.Idx + 1) as Idx
from cte
where cte.Idx < CHARINDEX(',', @srch, cte.Idx + 1)
)
select charindex(cte.Phrase, @str, 1) from cte
I don't think that the IN clause is what you need. Instead, you can use the LIKE construction as follows:
if (select count(*) from CommaSplit(@srch) where @str like '%' + val + '%') > 0
select 'true'
else
select 'false'
In this case you will receive 'true' when at least one result of the CommaSplit function exists in the @str text. But you will also receive 'true' when a result of the CommaSplit function is only part of a word in the @str string.
If you need a more accurate solution, it can be achieved as follows: split @str into words (replacing punctuation with spaces beforehand). After this, the intersection of CommaSplit(@srch) and SpaceSplit(@str) is the answer to the question. This also lets you check which words match between the two strings.
The overhead of this method is having to create a SpaceSplit function, which is a copy of CommaSplit with another separator. Alternatively, CommaSplit can be modified to receive the separator as a parameter.
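As an illustration of the two approaches just described, here is a sketch that swaps the user-defined CommaSplit and SpaceSplit functions for the built-in STRING_SPLIT. This assumes SQL Server 2016 or later with database compatibility level 130 or higher; on older versions a custom split function is still needed.

```sql
-- Assumption: STRING_SPLIT is available (SQL Server 2016+, compatibility level 130+).
DECLARE @str  NVARCHAR(MAX) = N'Hello how are you doing today,Its Monday and 5 waiting days';
DECLARE @srch NVARCHAR(MAX) = N'how,doing,monday,waiting';

-- Substring match: a phrase is returned even when it is only part of a word in @str.
SELECT s.value AS Phrase
FROM STRING_SPLIT(@srch, ',') AS s
WHERE @str LIKE N'%' + s.value + N'%';

-- Whole-word match: replace commas in @str with spaces, split it into words,
-- then intersect the words with the search phrases.
SELECT LTRIM(RTRIM(value)) AS Word
FROM STRING_SPLIT(@srch, ',')
INTERSECT
SELECT value
FROM STRING_SPLIT(REPLACE(@str, ',', ' '), ' ');
```

Whether 'monday' matches 'Monday' depends on the collation in effect, and a search phrase containing % or _ would be treated as a wildcard by LIKE.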

Parsing through a column to flip names using patindex

So I have a database of customers. I run SELECT * FROM MyTable it gives me back several columns, one of which is the name. Looks like this:
"Doe, John"
"Jones, Bill"
"Smith, Mike"
"Johnson, Bob"
"Harry Smith"
"Black, Linda"
"White, Laura"
etc. Some are last name, first name. Others are first name last name.
My boss wants me to flip the names so they are all first then last.
So I ran this:
SELECT SUBSTRING(Column_Name, CHARINDEX(',', Column_Name) + 2, LEN(Column_Name) - CHARINDEX(',', Column_Name) + 1) + ' ' + SUBSTRING(Column_Name, 1, CHARINDEX(',', Column_Name) - 1) FROM MyTable
The problem is that when I run that, it only runs the names until it finds one it doesn't need to flip. So in the example above, it would only give me the first four names, not all of them.
It was suggested to me that I could use the PATINDEX() to pull out all of the names. I don't know how to use this and was hoping I could get some help with it.
I suspect your code has TRY/CATCH or you are otherwise swallowing/suppressing/ignoring errors. You should get 4 rows back and then a big ugly error message:
Msg 537, Level 16, State 2
Invalid length parameter passed to the LEFT or SUBSTRING function.
The problem is that your expression assumes that , always exists. You need to cater for that either by filtering out the rows that don't contain a , (though this is not very dependable, since the expression could be attempted before the filter), or the following way, where you make different decisions about how to reassemble the string based on whether a , is found or not:
DECLARE @x TABLE(y VARCHAR(255));
INSERT @x VALUES
('Doe, John'),
('Jones, Bill'),
('Smith, Mike'),
('Johnson, Bob'),
('Harry Smith'),
('Black, Linda'),
('White, Laura');
SELECT LTRIM(SUBSTRING(y, COALESCE(NULLIF(CHARINDEX(',',y)+2,2),1),255))
+ RTRIM(' ' + LEFT(y, COALESCE(NULLIF(CHARINDEX(',' ,y)-1,-1),0)))
FROM @x;
Results:
John Doe
Bill Jones
Mike Smith
Bob Johnson
Harry Smith
Linda Black
Laura White
You don't need PATINDEX in this case, although it could be used. I'd take your expression to flip the names and put it in a CASE expression.
DECLARE @MyTable TABLE
(
Name VARCHAR(64) NOT NULL
);
INSERT @MyTable(Name)
VALUES
('Doe, John'),
('Jones, Bill'),
('Smith, Mike'),
('Johnson, Bob'),
('Harry Smith'),
('Black, Linda'),
('White, Laura');
SELECT
CASE
WHEN CHARINDEX(',', Name, 1) = 0 THEN Name
ELSE SUBSTRING(Name, CHARINDEX(',', Name) + 2, LEN(Name) - CHARINDEX(',', Name) + 1)
+ ' ' + SUBSTRING(Name, 1, CHARINDEX(',', Name) - 1)
END AS [Name]
FROM @MyTable;
The first condition simply returns the original value if no comma was used.

select data up to a space?

I have an MSSQL database field that looks like the examples below:
u129 james
u300 chris
u300a jim
u202 jane
u5 brian
u5z brian2
Is there a way to select the first set of characters? Basically, select all the characters up to the first space?
I tried messing around with LEFT, RIGHT, LEN, but couldn't figure out a way to do it with variable string lengths like in my example.
Thanks!
You can use a combination of LEFT and CHARINDEX to find the index of the first space, and then grab everything to the left of that.
SELECT LEFT(YourColumn, charindex(' ', YourColumn) - 1)
And in case any of your columns don't have a space in them:
SELECT LEFT(YourColumn, CASE WHEN charindex(' ', YourColumn) = 0 THEN
LEN(YourColumn) ELSE charindex(' ', YourColumn) - 1 END)
select left(col, charindex(' ', col) - 1)
If the first column is always the same size (including the spaces), then you can just take those characters (via LEFT) and clean up the spaces (with RTRIM):
SELECT RTRIM(LEFT(YourColumn, YourColumnSize))
Alternatively, you can extract the second (or third, etc.) column (using SUBSTRING):
SELECT RTRIM(SUBSTRING(YourColumn, PreviousColumnSizes, YourColumnSize))
One benefit of this approach (especially if YourColumn is the result of a computation) is that YourColumn is only specified once.
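An alternative if some values have no space and you do not want to use a CASE expression (note that it cuts at the last space, so it gives the same result only when a value contains at most one space):
select REVERSE(RIGHT(REVERSE(YourColumn), LEN(YourColumn) - CHARINDEX(' ', REVERSE(YourColumn))))
This works in SQL Server. MySQL has REVERSE and RIGHT as well, but it uses CHAR_LENGTH and LOCATE where SQL Server uses LEN and CHARINDEX.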
If space is missing, you can add one
SELECT LEFT('YourTextOrColumn',
charindex(' ',
'YourTextOrColumn' + ' ') - 1 )
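To see the append-a-space trick applied to a column rather than a literal, here is a short sketch using values from the question plus one value without a space. The table variable @t and the alias FirstToken are illustrative names, not from the thread:

```sql
-- Hypothetical sample data; 'nospace' has no space at all.
DECLARE @t TABLE (YourColumn VARCHAR(50));
INSERT @t VALUES ('u129 james'), ('u300a jim'), ('u5z brian2'), ('nospace');

-- Appending a space guarantees that CHARINDEX finds one,
-- so LEFT never receives a negative length.
SELECT YourColumn,
       LEFT(YourColumn, CHARINDEX(' ', YourColumn + ' ') - 1) AS FirstToken
FROM @t;
```

Rows without a space come back unchanged, which avoids the CASE expression shown earlier at the cost of one string concatenation per row.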

Get substring in SQL Server

I want to get a substring in SQL Server from last sequence of a split on dot (.).
I have a column which contains file names such as hello.exe, and I want to find the extension of the file exactly as Path.GetExtension("filename") does in C#.
You can use reverse along with substring and charindex to get what you're looking for:
select
reverse(substring(reverse(filename), 1,
charindex('.', reverse(filename))-1)) as FileExt
from
mytable
This holds up even if you have multiple . characters in your file name (e.g. hello.world.exe will return exe).
So I was playing around a bit with this, and this is another way (only one call to reverse):
select
SUBSTRING(filename,
LEN(filename)-(CHARINDEX('.', reverse(filename))-2), 8000) as FileExt
from
mytable
This calculates 10,000,000 rows in 25 seconds versus 29 seconds for the former method.
DECLARE @originalstring VARCHAR(100)
SET @originalstring = 'hello.exe'
DECLARE @extension VARCHAR(50)
SET @extension = SUBSTRING(@originalstring, CHARINDEX('.', @originalstring) + 1, 999)
SELECT @extension
That should do it, I hope! This works as long as you only have a single '.' in your file name - separating the file name from the extension.
Marc
Try this
SELECT RIGHT(
'C:\SomeRandomFile\Filename.dat',
CHARINDEX(
'.',
REVERSE(
'C:\SomeRandomFile\Filename.dat'
),
0)
-1)
Same as the accepted answer, but I've added a condition to avoid an error when filename is null or when filename has no extension (no dot):
select
reverse(substring(reverse(filename), 1,
charindex('.', reverse(filename))-1)) as FileExt
from
mytable
where
filename is not null
and charindex('.',filename) > 0
The following SQL query addressed most of the edge cases in my weird database, where many files didn't have extensions.
select distinct reverse(left(reverse(fileNameWithExtension), charindex('.', reverse(fileNameWithExtension)) - 1))
from myTable
where charindex('.', reverse(fileNameWithExtension)) - 1 > 0 and charindex('.', reverse(fileNameWithExtension)) - 1 < 7 and fileNameWithExtension is not null
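The WHERE-based answers above drop rows that have no extension. If those rows should stay in the result, one option is to move the check into a CASE expression. The sketch below assumes that an empty string is an acceptable result for names without a dot and for NULL names; the table variable @files is a hypothetical stand-in for mytable:

```sql
-- Hypothetical sample data covering the edge cases discussed above.
DECLARE @files TABLE ([filename] VARCHAR(260));
INSERT @files VALUES ('hello.exe'), ('hello.world.exe'), ('README'), (NULL);

SELECT [filename],
       CASE
           -- Only compute the extension when a dot is present.
           WHEN CHARINDEX('.', [filename]) > 0
               THEN RIGHT([filename], CHARINDEX('.', REVERSE([filename])) - 1)
           -- No dot, or NULL (the comparison is then unknown): return an empty string.
           ELSE ''
       END AS FileExt
FROM @files;
```

A name that ends in a dot also yields an empty string, because the last dot is then the final character.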
