My string comparison doesn't work with hidden ascii characters - sql-server

I'm trying to compare the concatenation of two strings like this:
SELECT TestChemicalName, ResultChemicalName,
CASE
WHEN LAB_TestChemicalName + LAB_ResultChemicalName = TestChemicalName + ResultChemicalName THEN NULL
WHEN LAB_TestChemicalName + LAB_ResultChemicalName <> TestChemicalName + ResultChemicalName THEN LAB_TestChemicalName + ' ' + LAB_ResultChemicalName
ELSE NULL
END AS FinalElementName
FROM dbo.chemicalTraceTesting
If LAB_TestChemicalName + LAB_ResultChemicalName is the same/equals TestChemicalName + ResultChemicalName, then I want to return NULL.
However, if they are not equal, I want to return it as LAB_TestChemicalName + ' ' + LAB_ResultChemicalName
This works about 90% of the time, but if the data contains hidden ASCII control characters, for example when someone copied and pasted from HTML, Word or Excel, the strings sometimes carry strange characters and my query above doesn't catch that.
Is there a better way to compare two strings?
Thanks!

With bad data you're never going to get a reliable solution. The best you can do is some heuristic that is good enough most of the time.
What you need to do is compute a hash of the string that has the property that if two strings have the same hash then you consider them to be equal.
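Maybe something like:
CREATE FUNCTION Slap(@source nvarchar(max)) RETURNS varchar(max)
AS
BEGIN
    DECLARE @hash varchar(max);
    -- Walk the string one character at a time (SUBSTRING positions are 1-based)
    WITH cteC AS (
        SELECT 1 AS I, SUBSTRING(@source, 1, 1) AS C
        UNION ALL
        SELECT I + 1, SUBSTRING(@source, I + 1, 1) FROM cteC WHERE I < LEN(@source)
    )
    -- Keep only printable ASCII (32 to 126), in the original order
    SELECT @hash = STRING_AGG(C, '') WITHIN GROUP (ORDER BY I)
    FROM cteC
    WHERE ASCII(C) >= 32
      AND ASCII(C) <= 126;
    RETURN @hash;
END
This is likely to be very slow.
And it'll fail on long strings, because the recursive CTE hits the default limit of 100 recursion levels.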
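Before settling on a clean-up rule, it can help to see which values actually contain characters outside the printable ASCII range (32 to 126) that the function above keeps. This is only a minimal sketch: the variable names and sample values are made up, and forcing the Latin1_General_BIN collation is my assumption, chosen so that the range `[^ -~]` is evaluated by code point.

```sql
-- Hypothetical sample values: @dirty carries a tab and a trailing CR/LF.
DECLARE @clean varchar(50) = 'Sodium Chloride';
DECLARE @dirty varchar(50) = 'Sodium' + CHAR(9) + 'Chloride' + CHAR(13) + CHAR(10);

SELECT
    -- 1 when the value contains any character outside ' ' (32) .. '~' (126)
    CASE WHEN @clean LIKE '%[^ -~]%' COLLATE Latin1_General_BIN THEN 1 ELSE 0 END AS CleanHasHidden,
    CASE WHEN @dirty LIKE '%[^ -~]%' COLLATE Latin1_General_BIN THEN 1 ELSE 0 END AS DirtyHasHidden,
    -- position of the first such character, 0 if there is none
    PATINDEX('%[^ -~]%', @dirty COLLATE Latin1_General_BIN) AS FirstHiddenPosition;
```

With these values the first two columns should come back as 0 and 1, and the position as 7 (the tab). The same LIKE predicate can go in a WHERE clause to list the offending rows of a table.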

Related

UNDOCUMENTED FEATURE when SELECT in VARCHAR with trailing whitespace SQL Server

I hope this is an interesting puzzle for an SQL expert out there.
When I run the following query, I would expect it to return no results.
-- Create a table variable. Note: the same behaviour occurs in standard tables.
DECLARE @TestResults TABLE (Id int IDENTITY(1,1) NOT NULL, Foo VARCHAR(100) NOT NULL, About VARCHAR(1000) NOT NULL)
-- Add some test data. Note: without space, space prefix and space suffix
INSERT INTO @TestResults(Foo, About) VALUES('Bar', 'No spaces')
INSERT INTO @TestResults(Foo, About) VALUES('Bar ', 'Space Suffix')
INSERT INTO @TestResults(Foo, About) VALUES(' Bar', 'Space prefix')
-- SELECT statement that is filtered by a value without a space and also a value with a space suffix
SELECT
t.Foo
, t.About
FROM @TestResults t
WHERE t.Foo like 'Bar '
AND t.Foo like 'Bar'
AND t.Foo = 'Bar '
AND t.Foo = 'Bar'
The results return a single row:
[Foo] [About]
Bar Space Suffix
I need to know more about this behaviour and how I should work around it.
It is also worth noting that LEN(Foo) is odd too, as follows:
DECLARE @TestResults TABLE (Id int IDENTITY(1,1) NOT NULL, Foo VARCHAR(100) NOT NULL, About VARCHAR(1000) NOT NULL)
INSERT INTO @TestResults(Foo, About) VALUES('Bar', 'No spaces')
INSERT INTO @TestResults(Foo, About) VALUES('Bar ', 'Space Suffix')
INSERT INTO @TestResults(Foo, About) VALUES(' Bar', 'Space prefix')
SELECT
t.Foo
, LEN(Foo) [Length]
, t.About
FROM @TestResults t
Gives the following results:
[Foo] [Length] [About]
Bar 3 No spaces
Bar 3 Space Suffix
Bar 4 Space prefix
Without any lateral thinking, what do I need to change my WHERE clause to in order to return 0 results as expected?
The answer is to add the following clause:
AND DATALENGTH(t.Foo) = DATALENGTH('Bar')
Running the following query...
DECLARE @Chars TABLE (CharNumber INT NOT NULL)
DECLARE @CharNumber INT = 0
WHILE(@CharNumber <= 255)
BEGIN
INSERT INTO @Chars(CharNumber) VALUES(@CharNumber)
SET @CharNumber = @CharNumber + 1
END
SELECT
CharNumber
, IIF('Test' = 'Test' + CHAR(CharNumber),1,0) ['Test' = 'Test' + CHAR(CharNumber)]
, IIF('Test' LIKE 'Test' + CHAR(CharNumber),1,0) ['Test' LIKE 'Test' + CHAR(CharNumber)]
, IIF(LEN('Test') = LEN('Test' + CHAR(CharNumber)),1,0) [LEN('Test') = LEN('Test' + CHAR(CharNumber))]
, IIF(DATALENGTH('Test') = DATALENGTH('Test' + CHAR(CharNumber)),1,0) [DATALENGTH('Test') = DATALENGTH('Test' + CHAR(CharNumber))]
FROM @Chars
WHERE ('Test' = 'Test' + CHAR(CharNumber))
OR ('Test' LIKE 'Test' + CHAR(CharNumber))
OR (LEN('Test') = LEN('Test' + CHAR(CharNumber)))
ORDER BY CharNumber
...produces the following results...
CharNumber 'Test' = 'Test' + CHAR(CharNumber) 'Test' LIKE 'Test' + CHAR(CharNumber) LEN('Test') = LEN('Test' + CHAR(CharNumber)) DATALENGTH('Test') = DATALENGTH('Test' + CHAR(CharNumber))
0 1 1 0 0
32 1 0 1 0
37 0 1 0 0
DATALENGTH can be used to test the equality of two VARCHAR, therefore the original query can be corrected as follows:
-- Create a table variable. Note: the same behaviour occurs in standard tables.
DECLARE @TestResults TABLE (Id int IDENTITY(1,1) NOT NULL, Foo VARCHAR(100) NOT NULL, About VARCHAR(1000) NOT NULL)
-- Add some test data. Note: without space, space prefix and space suffix
INSERT INTO @TestResults(Foo, About) VALUES('Bar', 'No spaces')
INSERT INTO @TestResults(Foo, About) VALUES('Bar ', 'Space Suffix')
INSERT INTO @TestResults(Foo, About) VALUES(' Bar', 'Space prefix')
-- SELECT statement that is filtered by a value without a space and also a value with a space suffix
SELECT
t.Foo
, t.About
FROM @TestResults t
WHERE t.Foo like 'Bar '
AND t.Foo like 'Bar'
AND t.Foo = 'Bar '
AND t.Foo = 'Bar'
AND DATALENGTH(t.Foo) = DATALENGTH('Bar') -- Additional clause
I also made a function to be used instead of =
ALTER FUNCTION dbo.fVEQ( @VarCharA VARCHAR(MAX), @VarCharB VARCHAR(MAX) )
RETURNS BIT
WITH SCHEMABINDING
AS
BEGIN
-- Added by WonderWorker on 18th March 2020
DECLARE @Result BIT = IIF(
(@VarCharA = @VarCharB AND DATALENGTH(@VarCharA) = DATALENGTH(@VarCharB))
, 1, 0)
RETURN @Result
END
Here is a test that uses all 256 characters as trailing and leading characters to show that it works. Any row returned would indicate a failure:
-- Test fVEQ with all 256 characters
DECLARE @Chars TABLE (CharNumber INT NOT NULL)
DECLARE @CharNumber INT = 0
WHILE(@CharNumber <= 255)
BEGIN
INSERT INTO @Chars(CharNumber) VALUES(@CharNumber)
SET @CharNumber = @CharNumber + 1
END
SELECT
CharNumber
, dbo.fVEQ('Bar','Bar' + CHAR(CharNumber)) [fVEQ Trailing Char Test]
, dbo.fVEQ('Bar','Bar') [fVEQ Same test]
, dbo.fVEQ('Bar',CHAR(CharNumber) + 'Bar') [fVEQ Leading Char Test]
FROM @Chars
-- OR, not AND: a row is a failure if any one of the three checks gives the wrong answer
WHERE (dbo.fVEQ('Bar','Bar' + CHAR(CharNumber)) = 1)
OR (dbo.fVEQ('Bar','Bar') = 0)
OR (dbo.fVEQ('Bar',CHAR(CharNumber) + 'Bar') = 1)
The reason why trailing whitespace is disregarded in string comparison, is because of the notion of fixed-length string fields, in which any content shorter than the fixed length is automatically right-padded with spaces. Such fixed-length fields cannot distinguish meaningful trailing spaces from padding.
Fixed-length string fields exist because they improve performance significantly in many cases. When SQL was designed, character-based terminals (which usually treated trailing spaces as equivalent to padding), reports printed in monospaced fonts (which used trailing spaces for padding and alignment), and data storage and exchange formats (which used fixed-length fields in place of costly delimiters and complicated parsing logic) were all commonly oriented around fixed-length fields, so the concept was tightly integrated at every stage of processing.
When comparing two fixed-length fields of the same fixed length, a literal comparison would of course be possible and would produce correct results.
But when comparing a fixed-length field of a given fixed length, to a fixed-length field of a different fixed length, the desired behaviour would never be to include the trailing spaces in the comparison, since two such fields could never match literally simply by virtue of their differing fixed lengths. The shorter field could be cast and padded to the length of the longer (at least conceptually if not physically), but the trailing space would still then be considered as padding rather than as meaningful.
When comparing a fixed-length field to a variable-length field, the desired behaviour is also probably never to include trailing spaces in the comparison. More complicated approaches which attempt to attribute meaning to trailing spaces in the variable-length side of the comparison, would only come at the cost of slower comparison logic and additional conceptual complexity and potential for error.
As for why variable-length to variable-length comparisons ignore trailing spaces, even though spaces can be meaningful there in principle, the rationale is probably consistency with the comparison behaviour that applies when fixed-length fields are involved, and avoidance of the most common kind of error, since trailing spaces are spurious in databases far more often than they are meaningful.
Nowadays, a database system designed from scratch would probably forsake fixed-length fields and perform all comparisons literally, leaving the developer to deal explicitly with spurious trailing spaces. In my experience, that would mean extra development effort and far more frequent errors than the current arrangement in SQL, where logic errors caused by the silent disregard of trailing spaces usually occur only when designing complex string-shredding logic for un-normalised data (a kind of data that SQL is specifically not optimised for handling).
So to be clear, this is not an undocumented feature, but a prominent feature that exists by design.
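The padding rule described above can be seen by comparing the same three letters stored in fixed-length and variable-length variables. This is just an illustrative sketch; the variable names are mine.

```sql
DECLARE @fixed5  char(5)     = 'abc';   -- stored as 'abc' plus 2 pad spaces
DECLARE @fixed10 char(10)    = 'abc';   -- stored as 'abc' plus 7 pad spaces
DECLARE @var     varchar(10) = 'abc';   -- stored without padding

SELECT
    CASE WHEN @fixed5 = @fixed10 THEN 1 ELSE 0 END AS FixedVsFixed,     -- 1
    CASE WHEN @fixed5 = @var     THEN 1 ELSE 0 END AS FixedVsVariable,  -- 1
    DATALENGTH(@fixed5)  AS Fixed5Bytes,    -- 5
    DATALENGTH(@fixed10) AS Fixed10Bytes,   -- 10
    DATALENGTH(@var)     AS VarBytes;       -- 3
```

All three values compare as equal even though they occupy 5, 10 and 3 bytes, which is exactly the "padding is not meaningful" behaviour.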
If you change the query to
SELECT
Foo
, About
, CASE WHEN Foo LIKE 'Bar ' THEN 'T' ELSE 'F' END As Like_Bar_Space
, CASE WHEN Foo LIKE 'Bar' THEN 'T' ELSE 'F' END As Like_Bar
, CASE WHEN Foo = 'Bar ' THEN 'T' ELSE 'F' END As EQ_Bar_Space
, CASE WHEN Foo = 'Bar' THEN 'T' ELSE 'F' END As EQ_Bar
FROM #TestResults
it gives you a better overview, as you see the result of the different conditions separately:
Foo About Like_Bar_Space Like_Bar EQ_Bar_Space EQ_Bar
------ ------------ --------------- --------- ------------- ------
Bar No spaces F T T T
Bar Space Suffix T T T T
Bar Space prefix F F F F
It looks like equals = ignores trailing spaces in both searched string and pattern. LIKE, however, does not ignore the trailing space in the pattern but ignores an extra trailing space in the searched string. Leading spaces are never ignored.
I don't know how wrong entries got in there, but you can fix them with
UPDATE #TestResults SET Foo = TRIM(Foo)
You can make a trailing space sensitive test with:
WHERE t.Foo + ';' = pattern + ';'
You can make a trailing space insensitive test with:
WHERE RTRIM(t.Foo) = RTRIM(pattern)
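To see the sentinel trick next to a plain comparison, here is a small sketch using the same three test values as the question. The table variable, the `@pattern` variable and the column aliases are made up for the demo; the square brackets in the first column only make the spaces visible.

```sql
DECLARE @t TABLE (Foo varchar(100) NOT NULL);
INSERT INTO @t (Foo) VALUES ('Bar'), ('Bar '), (' Bar');
DECLARE @pattern varchar(100) = 'Bar';

SELECT
    '[' + Foo + ']' AS Foo,
    -- plain = ignores trailing spaces
    CASE WHEN Foo = @pattern THEN 1 ELSE 0 END AS PlainEquals,
    -- appending a non-space sentinel makes the trailing space significant
    CASE WHEN Foo + ';' = @pattern + ';' THEN 1 ELSE 0 END AS SentinelEquals,
    -- the DATALENGTH check from the other answer gives the same verdict
    CASE WHEN Foo = @pattern AND DATALENGTH(Foo) = DATALENGTH(@pattern)
         THEN 1 ELSE 0 END AS DataLengthEquals
FROM @t;
```

Expected rows: `[Bar]` gives 1, 1, 1; `[Bar ]` gives 1, 0, 0; `[ Bar]` gives 0, 0, 0.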

Why does LEN( char(32) ) = 0 in T-SQL?

I wanted to write a function to count the number of delimiters or any substring (which could be a space) in a string of text, throwing a hack error if the delimiter was null or empty:
if len(@lookfor)=0 or @lookfor is null return Cast('substring must not be null or empty' as int)
But if the function is called with @lookfor = ' ' that trips the error.
I am aware of DATALENGTH(). Just curious why a single space is treated as "trailing" if there's nothing before it.
I am aware of DATALENGTH(). Just curious why a single space is treated
as "trailing" if there's nothing before it.
It's trailing because it's at the end of the string. It's also leading, since it's at the beginning.
But if the function is called with @lookfor = ' ' that trips the error
Something that messes a lot of people up with SQL is how '' = ' '. Note this query:
DECLARE @blank VARCHAR(10) = '', @space VARCHAR(10) = CHAR(32);
SELECT CASE WHEN @blank = @space THEN 'That the...!?!?' END;
You can change @space to CHAR(32)+CHAR(32)+.... and @space and @blank will still be equal.
Complicating things a little more, note that the DATALENGTH of a blank/empty value is 0 when it's a VARCHAR(N), but N when it's a CHAR(N). In other words,
SELECT DATALENGTH(CAST('' AS CHAR(1))) returns 1 and SELECT DATALENGTH(CAST('' AS CHAR(10))) returns 10.
That means that if your delimiter variable is, say, CHAR(1), that will mess you up. Here's the function for you:
CREATE FUNCTION dbo.CountDelimiters(@string VARCHAR(8000), @delimiter VARCHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT DCount = MAX(DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,'')))
WHERE DATALENGTH(@delimiter) > 0;
Note that @delimiter is VARCHAR(1) and NOT a CHAR datatype.
The formula to count delimiters in @string is:
DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,''))
or
(DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,'')))/DATALENGTH(@delimiter) when dealing with delimiters longer than 1.
WHERE DATALENGTH(@delimiter) > 0 will force the function to ignore a NULL or blank value. This is known as a Startup Predicate.
Putting a MAX around DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,'')) forces the function to return a NULL value in the event you pass it a blank or NULL value.
This will return 10 for the number of spaces in my string:
SELECT f.DCount FROM dbo.CountDelimiters('one space two spaces three ', CHAR(32)) AS f;
Against a table you would use the function like this (note that I'm counting the number of times the letter "A" appears):
-- Sample Strings
DECLARE @table TABLE (SomeText VARCHAR(36));
INSERT @table VALUES('ABCABC'),('XXX'),('AAA'),(''),(NULL);
SELECT t.SomeText, f.DCount
FROM @table AS t
CROSS APPLY dbo.CountDelimiters(t.SomeText, 'A') AS f;
Which returns:
SomeText  DCount
--------  ------
ABCABC    2
XXX       0
AAA       3
          0
NULL      NULL
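The multi-character variant of the formula given earlier in this answer can be tried on its own. A minimal sketch with a made-up string and a two-character delimiter:

```sql
DECLARE @string    varchar(8000) = 'a||b||c||d';
DECLARE @delimiter varchar(10)   = '||';

-- bytes removed by REPLACE, divided by the delimiter's length
SELECT (DATALENGTH(@string) - LEN(REPLACE(@string, @delimiter, '')))
       / DATALENGTH(@delimiter) AS DCount;   -- (10 - 4) / 2 = 3
```

One caveat with this formula: LEN ignores trailing spaces, so a string that ends in spaces (with a non-space delimiter) is over-counted. For varchar data, swapping LEN for DATALENGTH on the REPLACE side avoids that.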
If a string has a character at the end, it is considered trailing, even if there are no other characters before it. The same logic applies to leading characters.
So ' ' can be considered an empty string ('') having a trailing space.
When I started using SQL, I also noticed that the LEN function ignores trailing spaces. I think (but I am not sure) that it has to do with the fact that LEN should also behave "correctly" when used with CHAR/NCHAR values. Unlike VARCHAR/NVARCHAR, CHAR/NCHAR values have a fixed width and are filled with trailing spaces automatically. So when you put the value 'abc' in a field/variable of type CHAR(5), the value becomes 'abc  ' (padded with two spaces), but the LEN function will still "correctly" return 3 in that case.
I consider this just to be a strange quirk of SQL.
Remark:
The DATALENGTH function will not ignore trailing spaces in VARCHAR/NVARCHAR values. But note that DATALENGTH returns the size in bytes of the field's value. So if you use Unicode data (NCHAR/NVARCHAR), the DATALENGTH function will return 6 for the value N'abc', because SQL Server stores such data as UTF-16, which uses 2 bytes for most characters (supplementary characters take 4).
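The difference between LEN and DATALENGTH, and a guard that accepts a single space as a valid substring, can be sketched as follows. The variable names other than `@lookfor` are mine, and the guard is only one possible way to write the check from the question.

```sql
DECLARE @v varchar(10)  = 'abc ';    -- one trailing space
DECLARE @n nvarchar(10) = N'abc ';   -- same text, 2 bytes per character
DECLARE @lookfor varchar(10) = ' ';  -- a single space, as in the question

SELECT
    LEN(@v)            AS LenVarchar,     -- 3 (trailing space ignored)
    DATALENGTH(@v)     AS BytesVarchar,   -- 4
    LEN(@n)            AS LenNvarchar,    -- 3
    DATALENGTH(@n)     AS BytesNvarchar,  -- 8
    DATALENGTH(@n) / 2 AS UnitsNvarchar,  -- 4 UTF-16 code units
    -- DATALENGTH(' ') is 1, so a lone space is not treated as empty
    CASE WHEN @lookfor IS NULL OR DATALENGTH(@lookfor) = 0
         THEN 'rejected' ELSE 'accepted' END AS GuardResult;  -- accepted
```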

Removing leading and trailing commas

I am trying to find a way to remove trailing and leading commas in the SELECT statement. Here is some sample data:
SELECT
GRAIN, MATERIAL, BACKING, GRITS,
REPLACE(LTRIM(RTRIM(REPLACE(PROPERTIES, ',', ' '))), ' ', ',') PROPERTIES,
SPECIAL, APPLICATION, PRODUCTTYPE
FROM PRODUCTFINDER
I tried using trim, rtrim, and ltrim, but none of them changed the strings. I don't know if I was using the wrong syntax or what, but could someone help me please?
I am using SQL Server 2008.
Just another option.
This is a non-destructive approach that will eliminate any number of repeating commas and forces a final cleanup via the double pipes
For the expansion,reduction, and elimination I picked two obscure characters †‡
Example
Declare @S varchar(max) =',,,,Some,,,,,Content,,,'
Select
    replace(
        replace(
            replace(
                replace(
                    replace('||,' + @S + ',||', ',', '†‡'),
                    '‡†', ''
                ),
                '†‡', ','
            ),
            '||,', ''
        ),
        ',||', ''
    )
Returns
Some,Content
EDIT - Removed the LTRIM()/RTRIM()
Try this:
SELECT
GRAIN, MATERIAL, BACKING, GRITS,
TRIM(',' FROM PRODUCTFINDER.PROPERTIES) AS PROPERTIES,
TRIM(',' FROM PRODUCTFINDER.SPECIAL) AS SPECIAL,
TRIM(',' FROM PRODUCTFINDER.APPLICATION) AS APPLICATION,
TRIM(',' FROM PRODUCTFINDER.PRODUCTTYPE) AS PRODUCTTYPE
FROM PRODUCTFINDER
I am not sure which columns you want to trim.
This variant of TRIM (Transact-SQL) is available since SQL-Server 2017.
If you have an earlier version of SQL Server, do this in the front end (VB). This also gives you the possibility to replace multiple commas in the middle of the text with single ones.
Dim s = ",,,Abc,,,Def,Xyz,,,"
Console.WriteLine(Regex.Replace(s, ",{2,}", ",").Trim(","c))
Prints
Abc,Def,Xyz
Regex.Replace(s, ",{2,}", ",") uses the regular expression ,{2,} to find 2 or more consecutive commas and replaces them with a single comma. .Trim(","c) removes leading and trailing commas.
For Regex you need a
Imports System.Text.RegularExpressions
Another variant uses string split with the RemoveEmptyEntries option and then joins the parts again to form the result.
Dim s = ",,,Abc,,,Def,Xyz,,,"
Dim parts As String() = s.Split(New Char() {","c}, StringSplitOptions.RemoveEmptyEntries)
Console.WriteLine(String.Join(",", parts))
Here's one method using PATINDEX with LEFT and RIGHT.
declare @var varchar(64)= ',,,,,,,,asdf,dsf,sdfsd,asdf,,,,,,,,'
select
left(right(@var,len(@var) - patindex('%[^,]%',@var) + 1)
,len(right(@var,len(@var) - patindex('%[^,]%',@var) + 1)) - patindex('%[^,]%',reverse(right(@var,len(@var) - patindex('%[^,]%',@var) + 1))) + 1)
Just change @var to your column name.
This code strips the leading commas by searching for the first value that isn't a comma, via patindex('%[^,]%',@var), and takes everything to the RIGHT of this character. Then, we do the same thing using LEFT to remove the trailing commas.
select
Special = left(right(Special,len(Special) - patindex('%[^,]%',Special) + 1),len(right(Special,len(Special) - patindex('%[^,]%',Special) + 1)) - patindex('%[^,]%',reverse(right(Special,len(Special) - patindex('%[^,]%',Special) + 1))) + 1)
,[Application] = left(right([Application],len([Application]) - patindex('%[^,]%',[Application]) + 1),len(right([Application],len([Application]) - patindex('%[^,]%',[Application]) + 1)) - patindex('%[^,]%',reverse(right([Application],len([Application]) - patindex('%[^,]%',[Application]) + 1))) + 1)
,[ProductType] = left(right([ProductType],len([ProductType]) - patindex('%[^,]%',[ProductType]) + 1),len(right([ProductType],len([ProductType]) - patindex('%[^,]%',[ProductType]) + 1)) - patindex('%[^,]%',reverse(right([ProductType],len([ProductType]) - patindex('%[^,]%',[ProductType]) + 1))) + 1)
FROM PRODUCTFINDER
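The nested PATINDEX expression above is easier to follow when split into its two steps. A sketch with a shorter, made-up value and hypothetical intermediate variables:

```sql
DECLARE @var varchar(64) = ',,,asdf,dsf,,';

-- Step 1: drop everything before the first non-comma character (position 4 here)
DECLARE @noLeading varchar(64) =
    RIGHT(@var, LEN(@var) - PATINDEX('%[^,]%', @var) + 1);

-- Step 2: find the first non-comma in the reversed string and cut the tail off
DECLARE @trimmed varchar(64) =
    LEFT(@noLeading, LEN(@noLeading) - PATINDEX('%[^,]%', REVERSE(@noLeading)) + 1);

SELECT @noLeading AS NoLeading,   -- asdf,dsf,,
       @trimmed   AS Trimmed;     -- asdf,dsf
```

Note that a value consisting only of commas comes back unchanged, because PATINDEX returns 0 when there is no non-comma character; wrap the expression in a CASE if that matters for your data.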
SQL Server is not an ideal place for manipulating strings, so trim logic should live at the programming level.
If trimming a particular character is required in the query, refer to the thread below:
Trimming any Leading or trailing characters

Select with Left Function Condition SQL Server [duplicate]

I've been using this for some time:
SUBSTRING(str_col, PATINDEX('%[^0]%', str_col), LEN(str_col))
However recently, I've found a problem with columns with all "0" characters like '00000000' because it never finds a non-"0" character to match.
An alternative technique I've seen is to use TRIM:
REPLACE(LTRIM(REPLACE(str_col, '0', ' ')), ' ', '0')
This has a problem if there are embedded spaces, because they will be turned into "0"s when the spaces are turned back into "0"s.
I'm trying to avoid a scalar UDF. I've found a lot of performance problems with UDFs in SQL Server 2005.
SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col))
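The appended '.' acts as a sentinel: it guarantees that PATINDEX always finds a non-"0" character, even when the column is all zeros. A quick sketch over a few made-up values (the VALUES row constructor assumes SQL Server 2008 or later; the brackets only make empty results visible):

```sql
SELECT
    '[' + v.str_col + ']' AS Original,
    '[' + SUBSTRING(v.str_col,
                    PATINDEX('%[^0]%', v.str_col + '.'),  -- first non-zero, or LEN + 1
                    LEN(v.str_col)) + ']' AS Trimmed
FROM (VALUES ('000123'), ('00000000'), ('0A0'), ('')) AS v(str_col);
```

This should return `[123]`, `[]`, `[A0]` and `[]`: an all-zero value collapses to an empty string rather than to '0', which is the case several of the other answers handle with an extra CASE.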
Why don't you just cast the value to INTEGER and then back to VARCHAR?
SELECT CAST(CAST('000000000' AS INTEGER) AS VARCHAR)
--------
0
Other answers here do not take into consideration the case where you have all zeros (or even a single zero).
Some always default an empty string to zero, which is wrong when it is supposed to remain blank.
Re-read the original question. This answers what the Questioner wants.
Solution #1:
--This example uses both Leading and Trailing zero's.
--Avoid losing those Trailing zero's and converting embedded spaces into more zeros.
--I added a non-whitespace character ("_") to retain trailing zero's after calling Replace().
--Simply remove the RTrim() function call if you want to preserve trailing spaces.
--If you treat zero's and empty-strings as the same thing for your application,
-- then you may skip the Case-Statement entirely and just use CN.CleanNumber .
DECLARE @WackadooNumber VarChar(50) = ' 0 0123ABC D0 '--'000'--
SELECT WN.WackadooNumber, CN.CleanNumber,
(CASE WHEN WN.WackadooNumber LIKE '%0%' AND CN.CleanNumber = '' THEN '0' ELSE CN.CleanNumber END)[AllowZero]
FROM (SELECT @WackadooNumber[WackadooNumber]) AS WN
OUTER APPLY (SELECT RTRIM(RIGHT(WN.WackadooNumber, LEN(LTRIM(REPLACE(WN.WackadooNumber + '_', '0', ' '))) - 1))[CleanNumber]) AS CN
--Result: "123ABC D0"
Solution #2 (with sample data):
SELECT O.Type, O.Value, Parsed.Value[WrongValue],
(CASE WHEN CHARINDEX('0', T.Value) > 0--If there's at least one zero.
AND LEN(Parsed.Value) = 0--And the trimmed length is zero.
THEN '0' ELSE Parsed.Value END)[FinalValue],
(CASE WHEN CHARINDEX('0', T.Value) > 0--If there's at least one zero.
AND LEN(Parsed.TrimmedValue) = 0--And the trimmed length is zero.
THEN '0' ELSE LTRIM(RTRIM(Parsed.TrimmedValue)) END)[FinalTrimmedValue]
FROM
(
VALUES ('Null', NULL), ('EmptyString', ''),
('Zero', '0'), ('Zero', '0000'), ('Zero', '000.000'),
('Spaces', ' 0 A B C '), ('Number', '000123'),
('AlphaNum', '000ABC123'), ('NoZero', 'NoZerosHere')
) AS O(Type, Value)--O is for Original.
CROSS APPLY
( --This Step is Optional. Use if you also want to remove leading spaces.
SELECT LTRIM(RTRIM(O.Value))[Value]
) AS T--T is for Trimmed.
CROSS APPLY
( --From @CadeRoux's Post.
SELECT SUBSTRING(O.Value, PATINDEX('%[^0]%', O.Value + '.'), LEN(O.Value))[Value],
SUBSTRING(T.Value, PATINDEX('%[^0]%', T.Value + '.'), LEN(T.Value))[TrimmedValue]
) AS Parsed
Summary:
You could use what I have above for a one-off removal of leading-zero's.
If you plan on reusing it a lot, then place it in an Inline-Table-Valued-Function (ITVF).
Your concerns about performance problems with UDFs are understandable.
However, this problem only applies to Scalar Functions and Multi-Statement Table-Valued Functions.
Using ITVFs is perfectly fine.
I have the same problem with our 3rd-party database.
With alphanumeric fields, many values are entered without the leading zeros, dang humans!
This makes joins impossible without cleaning up the missing leading zeros.
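The inline table-valued function suggested in the summary above could look roughly like this. It is only a sketch: the function name, parameter and column alias are hypothetical, and the CASE simply reuses the PATINDEX-with-sentinel expression from the other answers so that an all-zero value becomes a single '0'. GO is the batch separator used by SSMS and sqlcmd.

```sql
CREATE FUNCTION dbo.TrimLeadingZeros (@value varchar(128))  -- hypothetical name
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT Trimmed =
    CASE
        -- non-empty and nothing but zeros: keep a single '0'
        WHEN @value <> '' AND PATINDEX('%[^0]%', @value + '.') > LEN(@value) THEN '0'
        -- otherwise cut at the first non-zero character
        ELSE SUBSTRING(@value, PATINDEX('%[^0]%', @value + '.'), LEN(@value))
    END;
GO

-- Usage: CROSS APPLY keeps the call inline, so no scalar-UDF overhead
SELECT v.Original, f.Trimmed
FROM (VALUES ('000123'), ('0000'), (''), (NULL)) AS v(Original)
CROSS APPLY dbo.TrimLeadingZeros(v.Original) AS f;
```

For the four sample rows this should give '123', '0', an empty string and NULL respectively.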
Conclusion:
Instead of removing the leading-zeros, you may want to consider just padding your trimmed-values with leading-zeros when you do your joins.
Better yet, clean up your data in the table by adding leading zeros, then rebuilding your indexes.
I think this would be WAY faster and less complex.
SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF(' 0A10 ', ''))), 10)--0000000A10
SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF('', ''))), 10)--NULL --When Blank.
Instead of replacing the 0's with spaces directly, first swap any embedded spaces for a 'rare' character that shouldn't normally be in the column's text. A line feed is probably good enough for a column like this. Then replace the 0's with spaces, LTrim normally (LTRIM only strips spaces), turn the remaining spaces back into 0's, and finally turn the placeholder back into spaces.
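A sketch of the placeholder idea, written so that the real spaces are the ones protected by the line feed, since LTRIM (without the optional characters argument added in later versions) strips only spaces. The sample value is made up, and CHAR(10) as the placeholder is an assumption that it never occurs in the data.

```sql
DECLARE @str varchar(50) = '000 12 0340';

SELECT REPLACE(                                       -- 4. restore the original spaces
           REPLACE(                                   -- 3. remaining spaces were zeros
               LTRIM(                                 -- 2. strip the former leading zeros
                   REPLACE(
                       REPLACE(@str, ' ', CHAR(10)),  -- 1a. protect real spaces
                       '0', ' ')),                    -- 1b. zeros become spaces
               ' ', '0'),
           CHAR(10), ' ') AS Trimmed;                 -- ' 12 0340'
```

The embedded spaces survive and the inner zeros of 0340 are kept; only the three leading zeros are removed.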
My version of this is an adaptation of Arvo's work, with a little more added on to ensure two other cases.
1) If we have all 0s, we should return the digit 0.
2) If we have a blank, we should still return a blank character.
CASE
WHEN PATINDEX('%[^0]%', str_col + '.') > LEN(str_col) THEN RIGHT(str_col, 1)
ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col + '.'), LEN(str_col))
END
The following will return '0' if the string consists entirely of zeros:
CASE WHEN SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) = '' THEN '0' ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) END AS str_col
This makes a nice function:
DROP FUNCTION [dbo].[FN_StripLeading]
GO
CREATE FUNCTION [dbo].[FN_StripLeading] (@string VarChar(128), @stripChar VarChar(1))
RETURNS VarChar(128)
AS
BEGIN
-- http://stackoverflow.com/questions/662383/better-techniques-for-trimming-leading-zeros-in-sql-server
DECLARE @retVal VarChar(128),
@pattern varChar(10)
SELECT @pattern = '%[^'+@stripChar+']%'
SELECT @retVal = CASE WHEN SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) = '' THEN @stripChar ELSE SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) END
RETURN (@retVal)
END
GO
GRANT EXECUTE ON [dbo].[FN_StripLeading] TO PUBLIC
cast(value as int) will always work if string is a number
SELECT CAST(CAST('000000000' AS INTEGER) AS VARCHAR)
This has a limit on the length of the string that can be converted to an INT
If you are using Snowflake SQL, you might use this:
ltrim(str_col,'0')
The ltrim function removes all instances of the designated set of characters from the left side.
So ltrim(str_col,'0') on '00000008A' would return '8A'
And rtrim(str_col,'0.') on '$125.00' would return '$125'
This might help
SELECT ABS(column_name) FROM [db].[schema].[table]
replace(ltrim(replace(Fieldname.TableName, '0', ' ')), ' ', '0')
The suggestion from Thomas G worked for our needs.
The field in our case was already string and only the leading zeros needed to be trimmed. Mostly it's all numeric but sometimes there are letters so the previous INT conversion would crash.
For converting number as varchar to int, you could also use simple
(column + 0)
Very easy way, when you just work with numeric values:
SELECT
TRY_CONVERT(INT, '000053830')
Try this:
replace(ltrim(replace(#str, '0', ' ')), ' ', '0')
If you do not want to convert into int, I prefer this below logic because it can handle nulls
IFNULL(field,LTRIM(field,'0'))
SUBSTRING(str_col, IIF(LEN(str_col) > 0, PATINDEX('%[^0]%', LEFT(str_col, LEN(str_col) - 1) + '.'), 0), LEN(str_col))
Works fine even with '0', '00' and so on.
Starting with SQL Server 2022 (16.x) you can do this
TRIM ( [ LEADING | TRAILING | BOTH ] [characters FROM ] string )
In MySQL you can do this...
Trim(Leading '0' from your_column)

Check if a string contains a substring in SQL Server 2005, using a stored procedure

I have a string, @mainString = 'CATCH ME IF YOU CAN'. I want to check whether the word ME is inside @mainString.
How do I check if a string has a specific substring in SQL?
CHARINDEX() searches for a substring within a larger string, and returns the position of the match, or 0 if no match is found:
if CHARINDEX('ME',@mainString) > 0
begin
--do something
end
Edit: or, from Daniel's answer, if you want to find a whole word (and not a fragment of a longer word), your CHARINDEX call would look like:
CHARINDEX(' ME ',' ' + REPLACE(REPLACE(@mainString,',',' '),'.',' ') + ' ')
(Add more nested REPLACE() calls for any other punctuation that may occur)
You can just use wildcards in the predicate (after IF, WHERE or ON):
@mainstring LIKE '%' + @substring + '%'
or in this specific case
' ' + @mainstring + ' ' LIKE '% ME[., ]%'
(Put the spaces in the quoted string if you're looking for the whole word, or leave them out if ME can be part of a bigger word).
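To compare the plain substring check with the whole-word variants, here is a small sketch. `@other` is a made-up second string in which ME only appears inside longer words.

```sql
DECLARE @mainString varchar(100) = 'CATCH ME IF YOU CAN';
DECLARE @other      varchar(100) = 'WELCOME HOME';

SELECT
    CHARINDEX('ME', @mainString)               AS PosInMain,       -- 7
    CHARINDEX('ME', @other)                    AS PosInOther,      -- 6 (inside WELCOME)
    CHARINDEX(' ME ', ' ' + @mainString + ' ') AS WordPosInMain,   -- 7
    CHARINDEX(' ME ', ' ' + @other + ' ')      AS WordPosInOther,  -- 0
    CASE WHEN ' ' + @other + ' ' LIKE '% ME[., ]%'
         THEN 1 ELSE 0 END                     AS WordLikeInOther; -- 0
```

The plain CHARINDEX finds ME in both strings, while the space-padded versions only match it as a separate word. Keep in mind that with LIKE, any %, _ or [ inside the search text is treated as a wildcard, which CHARINDEX does not do.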
