I have these special characters " ||~|| " at the end of each value in column X. I need to remove these special characters.
Right now I am using this syntax, but it doesn't seem to accomplish the task for all rows.
set [Customer Num] = substring(ltrim(rtrim([Customer Num])),3,len([Customer Num]))
Try these options:
Declare @myStr varchar(50) = 'amol~'
-- To remove the character ~ at any position
Select REPLACE(@myStr,'~','')
Set @myStr = '~amol~'
Select REPLACE(@myStr,'~','')
Set @myStr = '~am~ol~'
Select REPLACE(@myStr,'~','')
-- To remove the character ~ only at the last position, when its presence is inconsistent
Set @myStr ='amol~'
Select Case When RIGHT(@myStr,1) = '~'
Then LEFT(@myStr,len(@myStr) - 1)
Else @myStr
End
If you are looking to remove ||~||, try this:
Declare @myStr varchar(50) = 'amol ||~|| '
-- To remove the string ||~|| at any position
Select REPLACE(@myStr,'||~||','')
Set @myStr = '||~||amol||~||'
Select REPLACE(@myStr,'||~||','')
Set @myStr = '||~||am||~||ol||~||'
Select REPLACE(@myStr,'||~||','')
-- To remove the string ||~|| only at the last position, when its presence is inconsistent
Set @myStr ='amol||~||'
Select Case When RIGHT(@myStr,5) = '||~||'
Then LEFT(@myStr,len(@myStr) - 5)
Else @myStr
End
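To make the difference between the two options concrete, here is a minimal Python sketch of the same logic (not T-SQL; the function names are hypothetical and only model REPLACE versus the CASE/RIGHT/LEFT guard):

```python
def remove_marker_everywhere(value: str, marker: str = "||~||") -> str:
    """Analogue of REPLACE(value, marker, ''): drops every occurrence."""
    return value.replace(marker, "")


def strip_trailing_marker(value: str, marker: str = "||~||") -> str:
    """Analogue of the CASE/RIGHT/LEFT guard: drops the marker only at the end."""
    if marker and value.endswith(marker):
        return value[: len(value) - len(marker)]
    return value


if __name__ == "__main__":
    for sample in ["amol||~||", "||~||am||~||ol||~||", "amol"]:
        print(repr(remove_marker_everywhere(sample)), repr(strip_trailing_marker(sample)))
```

The guarded version leaves values untouched when the marker is missing or appears only in the middle, which is usually what you want for an UPDATE over a whole column.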
If you know for sure that your values end with the special string, try:
substring ( [Customer Num], 1, length([Customer Num]) - length(' ||~|| ') )
It's better, however, to safeguard against accidental deletions:
substring (
[Customer Num]
, 1
, Case coalesce(substr( [Customer Num], length([Customer Num]) - length(' ||~|| ') + 1 ), '_')
When ' ||~|| ' then length([Customer Num]) - length(' ||~|| ')
Else length([Customer Num])
End
)
If your RDBMS supports regular expressions, this simplifies to (using Oracle syntax):
Regexp_replace ( [Customer Num], ' \|\|~\|\| $', '')
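The key to the regex version is the `$` anchor: the marker is removed only when it sits at the very end of the value. A small Python sketch with the same pattern (Python's `re` module instead of Oracle's Regexp_replace; the function name is hypothetical):

```python
import re

# Same pattern as above: the pipes are escaped, and $ anchors the match
# to the end of the value.
TRAILING_MARKER = re.compile(r" \|\|~\|\| $")


def strip_marker_regex(value: str) -> str:
    """Remove ' ||~|| ' only when it terminates the value."""
    return TRAILING_MARKER.sub("", value)


if __name__ == "__main__":
    print(repr(strip_marker_regex("12345 ||~|| ")))
    print(repr(strip_marker_regex("12 ||~|| 345")))
```

A marker in the middle of a value is left alone, so no CASE guard is needed.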
Assuming you have to remove the last 3 characters of ColumnX (note that SUBSTRING positions start at 1, not 0):
set [ColumnX] = substring(ltrim(rtrim([ColumnX])),1,len(ltrim(rtrim([ColumnX]))) - 3)
This works
update [Table] set [Customer Num] = (substring(ltrim(rtrim([Customer Num])),0,len([Customer Num]) - 3))
where [Customer Num] like '%only text containing this string%'
I need to split a variable, for example:
declare @testString varchar(100)
set @testString = ' Agency=100|Org=2112|RepOrg=2112|SubOrg= |Fund=0137|Approp=6755|Object= |SubObject= |Activity= |Function= |Job= |ReportingCat= '
select
y.items
from
dbo.Split(@testString, '|') x
cross apply
dbo.Split(x.items, '=') y
Leads to error :
Msg 102, Level 15, State 1, Line 7
Incorrect syntax near '.'.
Not sure where I'm going wrong.
Maybe you need something like this:
DECLARE @testString VARCHAR(250) -- VARCHAR(100) would silently truncate this string
SET @testString =
' Agency=100|Org=2112|RepOrg=2112|SubOrg= |Fund=0137|Approp=6755|Object= |SubObject= |Activity= |Function= |Job= |ReportingCat= '
SELECT X.VALUE AS ACTUALVALUE,
SUBSTRING(
X.VALUE,
1,
CASE
WHEN CHARINDEX('=', X.VALUE) = 0 THEN LEN(X.VALUE)
ELSE CHARINDEX('=', X.VALUE) -1
END
) AS FIELD,
SUBSTRING(X.VALUE, CHARINDEX('=', X.VALUE) + 1, 10) AS VALUE
FROM string_split(@testString, '|') x
I used the same function you used, dbo.split. To get the output (Agency in one column and the code in another), you can use SUBSTRING together with CHARINDEX to split each item into two columns.
I made a few changes to your script:
Changed the length from 100 to 250, as it was truncating the string, and
removed the second cross apply, as it was creating duplicates.
declare @testString varchar(250)
set @testString = 'Agency=100|Org=2112|RepOrg=2112|SubOrg=|Fund=0137|Approp=6755|Object= |SubObject= |Activity= |Function= |Job= |ReportingCat='
select substring(x.items, 1,
case when CHARINDEX('=', x.items) = 0 then LEN(x.items)
else CHARINDEX('=', x.items) - 1 end) as Agency,
substring(x.items,
case when CHARINDEX('=', x.items) = 0 then LEN(x.items)
else CHARINDEX('=', x.items) + 1 end,
len(x.items) -
case when CHARINDEX('=', x.items) = 0 then LEN(x.items)
else CHARINDEX('=', x.items) - 1 end) as Code
from dbo.split(@testString, '|') x
It ran without error, and that function is here as Ben mentioned.
https://social.msdn.microsoft.com/Forums/en-US/bb2b2421-6587-4956-aff0-a7df9c91a84a/what-is-dbosplit?forum=transactsql
Output which I get:
Agency Code
Agency 100
Org 2112
RepOrg 2112
SubOrg
Fund 0137
Approp 6755
Object
SubObject
Activity
Function
Job
ReportingCat
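For comparison, the same two-step split (first on '|', then on the first '=') can be sketched in Python. This only models the logic; `split_pairs` is a hypothetical name, and trimming the whitespace around each part is my own assumption, not something the T-SQL above does:

```python
def split_pairs(text: str, pair_sep: str = "|", kv_sep: str = "=") -> list[tuple[str, str]]:
    """Split 'A=1|B=2' style text into (field, value) tuples."""
    pairs = []
    for item in text.split(pair_sep):
        # partition() splits on the first '=' only; without one, value is ''.
        field, _, value = item.partition(kv_sep)
        pairs.append((field.strip(), value.strip()))
    return pairs


if __name__ == "__main__":
    sample = " Agency=100|Org=2112|RepOrg=2112|SubOrg= |Fund=0137"
    for field, value in split_pairs(sample):
        print(field, value)
```

Each item yields exactly one row, which is why the second cross apply is not needed: it would produce one row per piece rather than one row per pair.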
I have a string like this:
Apple
I want to include a separator after each character so the end result will turn out like this:
A,p,p,l,e
In C#, we have a one-liner to achieve this with Regex.Replace("Apple", ".{1}", "$0,");
I can only think of looping over each character with charindex to append the separator, but that seems a little complicated. Is there a simpler, more elegant way to achieve this?
Thanks HABO for the suggestions. I was able to generate the result I want using the code, but it took a little time to really understand how it works.
After some searching, I managed to find a useful article on inserting spaces between each character, which was easier for me to understand.
I modified the code a little to define the desired separator instead of fixing it to a space:
DECLARE @pos INT = 2 -- position where we want the first separator
DECLARE @result VARCHAR(100) = 'Apple'
DECLARE @separator nvarchar(5) = ','
WHILE @pos < LEN(@result)+1
BEGIN
SET @result = STUFF(@result, @pos, 0, @separator);
SET @pos = @pos+2; -- assumes a one-character separator
END
select @result; -- Output: A,p,p,l,e
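To see why the position advances by 2 each time, here is a rough Python model of the loop (not T-SQL). `stuff` is a hypothetical stand-in for STUFF with a 1-based start, and the step `1 + len(separator)` is my generalisation so that longer separators also work:

```python
def stuff(text: str, start: int, length: int, insert: str) -> str:
    """Rough analogue of T-SQL STUFF with a 1-based start position."""
    return text[: start - 1] + insert + text[start - 1 + length:]


def separate_chars(text: str, separator: str = ",") -> str:
    pos = 2  # 1-based position of the first insertion
    result = text
    while pos < len(result) + 1:
        result = stuff(result, pos, 0, separator)
        # Skip the separator just inserted plus the next original character.
        pos += 1 + len(separator)
    return result


if __name__ == "__main__":
    print(separate_chars("Apple"))  # A,p,p,l,e
```

With a one-character separator the step is 2, matching the script above; the string grows on every pass, so the loop condition has to re-read its length each time.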
Reference
In the following SQL scripts, I get each character with the SUBSTRING() function and a numbers table (I used the spt_values view here for simplicity), and then concatenate them using two different methods; you can choose either.
If you are using SQL Server 2017, there is a new string aggregation function, string_agg, which the first script uses:
declare @str nvarchar(max) = 'Apple'
SELECT
string_agg( substring(@str,number,1) , ',') Within Group (Order By number)
FROM master..spt_values n
WHERE
Type = 'P' and
Number between 1 and len(@str)
If you are working with a previous version, you can concatenate the characters using FOR XML PATH and the STUFF function as follows:
declare @str nvarchar(max) = 'Apple'
; with cte as (
SELECT
number,
substring(@str,number,1) as L
FROM master..spt_values n
WHERE
Type = 'P' and
Number between 1 and len(@str)
)
SELECT
STUFF(
(
SELECT
',' + L
FROM cte
order by number
FOR XML PATH('')
), 1, 1, ''
)
Both solutions yield the same result. I hope it helps.
If you have SQL Server 2017 and a copy of ngrams8k, it's ultra simple:
declare @word varchar(100) = 'apple';
select newString = string_agg(token, ',') within group (order by position)
from dbo.ngrams8k(@word,1);
For pre-2017 systems it's almost as simple:
declare @word varchar(100) = 'apple';
select newstring =
( select token + case len(@word)+1-position when 1 then '' else ',' end
from dbo.ngrams8k(@word,1)
order by position
for xml path(''))
One ugly way to do it is to split the string into characters, ideally using a numbers table, and reassemble it with the desired separator.
A less efficient implementation uses recursion in a CTE to split the characters and insert the separator between pairs of characters as it goes:
declare @Sample as VarChar(20) = 'Apple';
declare @Separator as Char = ',';
with Characters as (
select 1 as Position, Substring( @Sample, 1, 1 ) as Character
union all
select Position + 1,
case when Position & 1 = 1 then @Separator else Substring( @Sample, Position / 2 + 1, 1 ) end
from Characters
where Position < 2 * Len( @Sample ) - 1 )
select Stuff( ( select Character + '' from Characters order by Position for XML Path( '' ) ), 1, 0, '' ) as Result;
You can replace the select Stuff... line with select * from Characters; to see what's going on.
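The position arithmetic in the CTE is easier to follow outside SQL. This Python sketch (hypothetical function name, same numbering as the CTE) builds rows 1 to 2n-1, where odd positions carry a character and even positions carry the separator:

```python
def interleave(sample: str, separator: str = ",") -> str:
    pieces = []
    for position in range(1, 2 * len(sample)):  # positions 1 .. 2n-1
        if position % 2 == 1:
            # Odd position p holds character number (p + 1) / 2 (1-based).
            pieces.append(sample[(position - 1) // 2])
        else:
            pieces.append(separator)
    return "".join(pieces)


if __name__ == "__main__":
    print(interleave("Apple"))  # A,p,p,l,e
```

Because the sequence stops at 2n-1, no trailing separator has to be trimmed afterwards.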
Try this:
declare @var varchar(50) ='Apple'
;WITH CTE
AS
(
SELECT
SeqNo = 1,
MyStr = @var,
OpStr = CAST('' AS VARCHAR(50))
UNION ALL
SELECT
SeqNo = SeqNo+1,
MyStr = MyStr,
OpStr = CAST(ISNULL(OpStr,'')+SUBSTRING(MyStr,SeqNo,1)+',' AS VARCHAR(50))
FROM CTE
WHERE SeqNo <= LEN(@var)
)
SELECT
OpStr = LEFT(OpStr,LEN(OpStr)-1)
FROM CTE
WHERE SeqNo = LEN(@var)+1
The situation is as follows:
We have action logs in our database, triggered by user events, that save the events as varchar but in XML format. In some cases the attribute names contain spaces, like this one:
<UNITDETAILUPDATE NEWUNIT TYPE="DUW 30 01" OLDFAULT_CIRC="HWS" NEWFAULT_CIRC="HWS" OLDOUTGOING R-STATE="R3C" />
I would like to eliminate the spaces from the attribute names before parsing to XML (because as it stands, parsing is of course not possible :))
As you can see, there are multiple occurrences in the string. A great solution would be something like replacing only the spaces that have no " character before them, but I have no idea how to achieve this.
Any ideas?
Thank you :)
For a high-performing set-based solution you can grab a copy of ngrams8k and do this:
DECLARE @string varchar(1000) = '<UNITDETAILUPDATE NEWUNIT TYPE="DUW 30 01" OLDFAULT_CIRC="HWS" NEWFAULT_CIRC="HWS" OLDOUTGOING R-STATE="R3C" />';
select newString =
(
select
case when token = ' ' and position > space1 and isQuoted = 0 and p.c <> '"'
then '' else token end
from
(
select ng.*, sum(case when token = '"' then 1 else 0 end) over (order by position)%2
from dbo.ngrams8k(@string, 1) ng
) x(position, token, isQuoted)
cross join (values (charindex(' ', @string))) v(space1)
cross apply (values (substring(@string, position-1,1))) p(c)
order by position
for xml path(''), type
).value('(text())[1]', 'varchar(8000)');
Results
<UNITDETAILUPDATE NEWUNITTYPE="DUW 30 01" OLDFAULT_CIRC="HWS" NEWFAULT_CIRC="HWS" OLDOUTGOINGR-STATE="R3C" />
If you have SQL Server 2017 you can use string_agg with ngrams8k like this (WITHIN GROUP keeps the characters in their original order):
select newString = string_agg(
case when token = ' ' and position > space1 and isQuoted = 0
and p.c <> '"' then '' else token end,'') within group (order by position)
from
(
select ng.*, sum(case when token = '"' then 1 else 0 end) over (order by position)%2
from dbo.ngrams8k(@string, 1) ng
) x(position, token, isQuoted)
cross join (values (charindex(' ', @string))) v(space1)
cross apply (values (substring(@string, position-1,1))) p(c);
You could search for the good spaces and save them with a placeholder:
Declare @var varchar(200) = '<UNITDETAILUPDATE NEWUNIT TYPE="DUW 30 01" OLDFAULT_CIRC="HWS" NEWFAULT_CIRC="HWS" OLDOUTGOING R-STATE="R3C" />' -- the sample is 111 characters, so varchar(100) would truncate it
Select @var = replace(@var,'" ','"|')
Then replace the remaining spaces:
Select @var = replace(@var,' ','_')
Then put the good spaces back:
Select @var = replace(replace(@var,'|',' '),'UNITDETAILUPDATE_','UNITDETAILUPDATE ')
This could be combined into one ugly nested replace so that it can be selected across a table. As written it also changes the spaces inside the quoted values, so you would probably need a placeholder for those as well. The SQL Server versions discussed here have no built-in regex replace, but LIKE patterns can sometimes help.
This "Xml" is awfully bad...
The following approach won't be fast. If you need this more often, you might use another language or tool.
This solution uses a recursive CTE, which is a hidden RBAR, to build up the string again, character by character, checking for "within quotes":
DECLARE @BadXml NVARCHAR(MAX)='<UNITDETAILUPDATE NEWUNIT TYPE="DUW 30 01" OLDFAULT_CIRC="HWS" NEWFAULT_CIRC="HWS" OLDOUTGOING R-STATE="R3C" />';
WITH recCTE
AS
(
SELECT LTRIM(RTRIM(REPLACE(@BadXml,'" ','"$'))) AS TheString
,1 AS CurrentPos
,CAST('<' AS NVARCHAR(MAX)) AS BuildNew
,-1 AS IsFirstBlank
,-1 AS QuotOpen
UNION ALL
SELECT r.TheString
,r.CurrentPos+1
,r.BuildNew + CASE WHEN chr=' ' AND r.IsFirstBlank=1 AND r.QuotOpen=-1 THEN '_' ELSE chr END
,CASE WHEN r.IsFirstBlank=-1 AND chr=' ' THEN 1 ELSE r.IsFirstBlank END
,CASE WHEN chr='"' THEN r.QuotOpen * (-1) ELSE r.QuotOpen END
FROM recCTE AS r
CROSS APPLY(SELECT SUBSTRING(r.TheString,r.CurrentPos+1,1)) AS A(chr)
WHERE r.CurrentPos<LEN(r.TheString)
)
SELECT TOP 1 IsFirstBlank,QuotOpen, CAST(REPLACE(BuildNew,'"$','" ') AS XML) AS TheXml
FROM recCTE
ORDER BY LEN(BuildNew) DESC
OPTION (MAXRECURSION 1000)
The result
IsFirstBlank QuotOpen TheXml
1 -1 <UNITDETAILUPDATE NEWUNIT_TYPE="DUW 30 01" OLDFAULT_CIRC="HWS" NEWFAULT_CIRC="HWS" OLDOUTGOING_R-STATE="R3C" />
Take away the CAST to xml, the TOP 1 and the ORDER BY to see how it works.
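Since the answer suggests another language may be a better fit, here is a minimal Python sketch of the same character-by-character idea. It is an illustration under the same assumptions as the CTE (values are always double-quoted, the first blank follows the element name); `fix_attribute_names` and `BAD_XML` are hypothetical names:

```python
import xml.etree.ElementTree as ET

BAD_XML = ('<UNITDETAILUPDATE NEWUNIT TYPE="DUW 30 01" OLDFAULT_CIRC="HWS" '
           'NEWFAULT_CIRC="HWS" OLDOUTGOING R-STATE="R3C" />')


def fix_attribute_names(bad_xml: str, filler: str = "_") -> str:
    """Replace blanks inside attribute names, leaving quoted values alone."""
    out = []
    in_quotes = False
    seen_first_blank = False
    prev = ""
    for ch in bad_xml:
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == " " and not in_quotes:
            if not seen_first_blank:
                seen_first_blank = True  # blank after the element name
                out.append(ch)
            elif prev == '"':
                out.append(ch)           # blank separating two attributes
            else:
                out.append(filler)       # blank inside an attribute name
        else:
            out.append(ch)               # includes blanks inside quoted values
        prev = ch
    return "".join(out)


if __name__ == "__main__":
    fixed = fix_attribute_names(BAD_XML)
    print(fixed)
    print(ET.fromstring(fixed).attrib)
```

Passing `filler=""` removes the blanks instead of replacing them, which gives the NEWUNITTYPE style of result.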
I have a table with 3 columns. The first column contains a value (numeric) together with a unit (percentage, etc.), the second is a numeric column, and the third is a unit column. I want to split the numeric part and the unit from the first column and put each piece into its designated column.
Here is my table:
I tried this function: SO link here... It does split alpha and numeric, but I'm new to SQL functions. My problem is that the parameter must be a string, so I changed it to a subquery, but that gives me an error.
Sample code:
SQL FUNCTION:
create function [dbo].[GetNumbersFromText](@String varchar(2000))
returns table as return
(
with C as
(
select cast(substring(S.Value, S1.Pos, S2.L) as int) as Number,
stuff(s.Value, 1, S1.Pos + S2.L, '') as Value
from (select @String+' ') as S(Value)
cross apply (select patindex('%[0-9]%', S.Value)) as S1(Pos)
cross apply (select patindex('%[^0-9]%', stuff(S.Value, 1, S1.Pos, ''))) as S2(L)
union all
select cast(substring(S.Value, S1.Pos, S2.L) as int),
stuff(S.Value, 1, S1.Pos + S2.L, '')
from C as S
cross apply (select patindex('%[0-9]%', S.Value)) as S1(Pos)
cross apply (select patindex('%[^0-9]%', stuff(S.Value, 1, S1.Pos, ''))) as S2(L)
where patindex('%[0-9]%', S.Value) > 0
)
select Number
from C
)
SELECT STATEMENT with SUB Query:
declare @S varchar(max)
select number from GetNumbersFromText(Select SomeColm From Table_Name) option (maxrecursion 0)
BTW, I'm using SQL Server 2005.
Thanks!
If the numeric part is always at the beginning, then you can use this:
PATINDEX('%[0-9][^0-9]%', ConcUnit)
to get the index of the last digit.
Thus, this:
DECLARE @str VARCHAR(MAX) = '4000 ug/ML'
SELECT LEFT(@str, PATINDEX('%[0-9][^0-9]%', @str )) AS Number,
LTRIM(RIGHT(@str, LEN(@str) - PATINDEX('%[0-9][^0-9]%', @str ))) As Unit
gives you:
Number Unit
-------------
4000 ug/ML
EDIT:
If the numeric data includes decimal values as well, then you can use this:
SELECT LEN(@str) - PATINDEX ('%[^0-9][0-9]%', REVERSE(@str))
to get the index of the last digit.
Thus, this:
SELECT LEFT(@str, LEN(@str) - PATINDEX ('%[^0-9][0-9]%', REVERSE(@str)))
gives you the numeric part.
And this:
SELECT LEFT(@str, LEN(@str) - PATINDEX ('%[^0-9][0-9]%', REVERSE(@str))) AS Numeric,
CASE
WHEN CHARINDEX ('%', @str) <> 0 THEN LTRIM(RIGHT(@str, LEN(@str) - CHARINDEX ('%', @str)))
ELSE LTRIM(RIGHT(@str, PATINDEX ('%[^0-9][0-9]%', REVERSE(@str))))
END AS Unit
gives you both the numeric and the unit part.
Here are some tests that I made with the data you have posted:
Input:
DECLARE @str VARCHAR(MAX) = '50 000ug/ML'
Output:
Numeric Unit
------------
50 000 ug/ML
Input:
DECLARE @str VARCHAR(MAX) = '99.5%'
Output:
Numeric Unit
------------
99.5
Input:
DECLARE @str VARCHAR(MAX) = '4000 . 35 % ug/ML'
Output:
Numeric Unit
------------------
4000 . 35 ug/ML
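The "everything up to the last digit is the number" rule behind these queries can be sketched in Python for quick experimentation. This models the idea rather than the exact T-SQL; `split_number_unit` is a hypothetical name, and returning an empty number when there are no digits is my own assumption:

```python
DIGITS = "0123456789"


def split_number_unit(text: str) -> tuple[str, str]:
    """Split text into (numeric part, unit) at the last digit."""
    last_digit = max((i for i, ch in enumerate(text) if ch in DIGITS), default=-1)
    number = text[: last_digit + 1]
    percent = text.find("%")
    # As in the CASE expression above: if a % sign is present, the unit is
    # whatever follows it; otherwise it is whatever follows the last digit.
    rest = text[percent + 1:] if percent != -1 else text[last_digit + 1:]
    return number, rest.lstrip()


if __name__ == "__main__":
    for sample in ["50 000ug/ML", "99.5%", "4000 . 35 % ug/ML"]:
        print(split_number_unit(sample))
```

Note the limitation this makes visible: a unit that itself contains a digit (for example mg/m3) would be cut in the wrong place, both here and in the PATINDEX version.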
Here is my answer. Check output in SQLFiddle for the same.
create TABLE temp
(
string NVARCHAR(50)
)
INSERT INTO temp (string)
VALUES
('4000 ug\ml'),
('2000 ug\ml'),
('%'),
('ug\ml')
SELECT subsrtunit,LEFT(subsrtnumeric, PATINDEX('%[^0-9]%', subsrtnumeric+'t') - 1)
FROM (
SELECT subsrtunit = SUBSTRING(string, posofchar, LEN(string)),
subsrtnumeric = SUBSTRING(string, posofnumber, LEN(string))
FROM (
SELECT string, posofchar = PATINDEX('%[^0-9]%', string),
posofnumber = PATINDEX('%[0-9]%', string)
FROM temp
) d
) t
Updated Version to handle 99.5 ug\ml
create TABLE temp
(
string NVARCHAR(50)
)
INSERT INTO temp (string)
VALUES
('4000 ug\ml'),
('2000 ug\ml'),
('%'),
('ug\ml'),
('99.5 ug\ml')
SELECT subsrtunit,LEFT(subsrtnumeric, PATINDEX('%[^0-9.]%', subsrtnumeric+'t') - 1)
FROM (
SELECT subsrtunit = SUBSTRING(string, posofchar, LEN(string)),
subsrtnumeric = SUBSTRING(string, posofnumber, LEN(string))
FROM (
SELECT string, posofchar = PATINDEX('%[^0-9.]%', string),
posofnumber = PATINDEX('%[0-9.]%', string)
FROM temp
) d
) t
Updated Version: To handle 1 000 ug\ml,20 000ug\ml
create TABLE temp
(
string NVARCHAR(50)
)
INSERT INTO temp (string)
VALUES
('4000 ug\ml'),
('2000 ug\ml'),
('%'),
('ug\ml'),
('99.5 ug\ml'),
('1 000 ug\ml'),
('20 000ug\ml')
SELECT substring(replace(subsrtunit,' ',''),PATINDEX('%[0-9.]%', replace(subsrtunit,' ',''))+1,len(subsrtunit)),
LEFT(replace(subsrtnumeric,' ',''), PATINDEX('%[^0-9.]%', replace(subsrtnumeric,' ','')+'t') - 1)
FROM (
SELECT subsrtunit = SUBSTRING(string, posofchar, LEN(string)),
subsrtnumeric = SUBSTRING(string, posofnumber, LEN(string))
FROM (
SELECT string, posofchar = PATINDEX('%[^0-9.]%', replace(string,' ','')),
posofnumber = PATINDEX('%[0-9.]%', replace(string,' ',''))
FROM temp
) d
) t
Check out SQLFiddle for the same.
Would something like this work? Based on the shown data it looks like it would.
Apply it to your data set as a select and if you like the results then you can make an update from it.
WITH cte as (SELECT 'ug/mL' ConcUnit, 500 as [Numeric], '' as Unit
UNION ALL SELECT '2000 ug/mL', NULL, '')
SELECT
[ConcUnit] as [ConcUnit],
[Numeric] as [Original Numeric],
[Unit] as [Original Unit],
CASE WHEN ConcUnit LIKE '% %' THEN
SUBSTRING(ConcUnit, 1, CHARINDEX(' ', ConcUnit) - 1)
ELSE [Numeric] END as [New Numeric],
CASE WHEN ConcUnit LIKE '% %'
THEN SUBSTRING(ConcUnit, CHARINDEX(' ', ConcUnit) + 1, LEN(ConcUnit))
ELSE ConcUnit END as [New Unit]
FROM cte
Change @concunit and @unitx to your own values:
DECLARE @concunit varchar(10)='45.5%'
DECLARE @unitx varchar(10)='%'
BEGIN
SELECT RTRIM(SUBSTRING( @concunit , 1 , CHARINDEX( @unitx , @concunit
) - 1
)) AS Number,
RTRIM(SUBSTRING( @concunit , CHARINDEX( @unitx , @concunit
) , LEN( @concunit
) - (CHARINDEX( @unitx , @concunit
) - 1)
)) AS Unit
END
I had the same dilemma, but in my case the alphas were in front of the numerics.
So using the logic that @Giorgos Betsos added to his answer, I just reversed it.
I.e., when your input is:
abc123
You can split it like this:
declare @input varchar(30) = 'abc123'
select
replace(@input,reverse(LEFT(reverse(@input), PATINDEX('%[0-9][^0-9]%', reverse(@input) ))),'') Alpha
, reverse(LEFT(reverse(@input), PATINDEX('%[0-9][^0-9]%', reverse(@input) ))) Numeric
Results :
Is there a method to use contain rather than equal in case statement?
For example, I am checking a database table has an entry
lactulose, Lasix (furosemide), oxazepam, propranolol, rabeprazole, sertraline,
Can I use
CASE When dbo.Table.Column = 'lactulose' Then 'BP Medication' ELSE '' END AS 'BP Medication'
This did not work.
CASE WHEN ', ' + dbo.Table.Column +',' LIKE '%, lactulose,%'
THEN 'BP Medication' ELSE '' END AS [BP Medication]
The leading ', ' and trailing ',' are added so that you can handle the match regardless of where it is in the string (first entry, last entry, or anywhere in between).
That said, why are you storing data you want to search on as a comma-separated string? This violates all kinds of forms and best practices. You should consider normalizing your schema.
In addition: don't use 'single quotes' as identifier delimiters; this syntax is deprecated. Use [square brackets] (preferred) or "double quotes" if you must. See "string literals as column aliases" here: http://msdn.microsoft.com/en-us/library/bb510662%28SQL.100%29.aspx
EDIT: If you have multiple values, you can do this (you can't short-hand this with the other CASE syntax variant or by using something like IN()):
CASE
WHEN ', ' + dbo.Table.Column +',' LIKE '%, lactulose,%'
OR ', ' + dbo.Table.Column +',' LIKE '%, amlodipine,%'
THEN 'BP Medication' ELSE '' END AS [BP Medication]
If you have more values, it might be worthwhile to use a split function, e.g.
USE tempdb;
GO
CREATE FUNCTION dbo.SplitStrings(@List NVARCHAR(MAX))
RETURNS TABLE
AS
RETURN ( SELECT DISTINCT Item FROM
( SELECT Item = x.i.value('(./text())[1]', 'nvarchar(max)')
FROM ( SELECT [XML] = CONVERT(XML, '<i>'
+ REPLACE(@List,',', '</i><i>') + '</i>').query('.')
) AS a CROSS APPLY [XML].nodes('i') AS x(i) ) AS y
WHERE Item IS NOT NULL
);
GO
CREATE TABLE dbo.[Table](ID INT, [Column] VARCHAR(255));
GO
INSERT dbo.[Table] VALUES
(1,'lactulose, Lasix (furosemide), oxazepam, propranolol, rabeprazole, sertraline,'),
(2,'lactulite, Lasix (furosemide), lactulose, propranolol, rabeprazole, sertraline,'),
(3,'lactulite, Lasix (furosemide), oxazepam, propranolol, rabeprazole, sertraline,'),
(4,'lactulite, Lasix (furosemide), lactulose, amlodipine, rabeprazole, sertraline,');
SELECT t.ID
FROM dbo.[Table] AS t
INNER JOIN dbo.SplitStrings('lactulose,amlodipine') AS s
ON ', ' + t.[Column] + ',' LIKE '%, ' + s.Item + ',%'
GROUP BY t.ID;
GO
Results:
ID
----
1
2
4
Pseudo code, something like:
CASE
When CHARINDEX('lactulose', dbo.Table.Column) > 0 Then 'BP Medication'
ELSE ''
END AS 'Medication Type'
This does not care where the keyword is found in the list and avoids depending on formatting of spaces and commas.
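The trade-off between the two answers (delimiter-wrapped LIKE versus plain CHARINDEX) shows up on partial matches. A small Python sketch of both checks, with hypothetical function names; note that Python comparisons are case-sensitive, whereas in SQL Server that depends on the collation:

```python
def wrapped_match(column: str, term: str) -> bool:
    """Analogue of ', ' + column + ',' LIKE '%, term,%' (whole entries only)."""
    return f", {term}," in f", {column},"


def substring_match(column: str, term: str) -> bool:
    """Analogue of CHARINDEX(term, column) > 0 (any substring)."""
    return term in column


if __name__ == "__main__":
    row = "lactulose, Lasix (furosemide), oxazepam, propranolol, rabeprazole, sertraline,"
    for term in ["lactulose", "sertraline", "Lasix", "lact"]:
        print(term, wrapped_match(row, term), substring_match(row, term))
```

The substring check is more forgiving about spacing, but it also reports a hit for a fragment such as "lact", while the wrapped check only accepts complete list entries.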