Why does LEN( char(32) ) = 0 in T-SQL? - sql-server

I wanted to write a function to count the number of delimiters or any substring (which could be a space) in a string of text, throwing a hack error if the delimiter was null or empty:
if len(@lookfor)=0 or @lookfor is null return Cast('substring must not be null or empty' as int)
But if the function is called with @lookfor = ' ' that trips the error.
I am aware of DATALENGTH(). Just curious why a single space is treated as "trailing" if there's nothing before it.

I am aware of DATALENGTH(). Just curious why a single space is treated
as "trailing" if there's nothing before it.
It's trailing because it's at the end of the string. It's also leading since it's at the beginning.
But if the function is called with @lookfor = ' ' that trips the error
Something that messes a lot of people up with SQL is how '' = ' '; Note this query:
DECLARE @blank VARCHAR(10) = '', @space VARCHAR(10) = CHAR(32);
SELECT CASE WHEN @blank = @space THEN 'That the...!?!?' END;
You can change @space to CHAR(32)+CHAR(32)+.... and @space and @blank will still be equal.
Complicating things a little more, note that the DATALENGTH for a blank/empty value is 0 when it's a VARCHAR(N), but the DATALENGTH is N for CHAR(N) values. In other words,
SELECT DATALENGTH(CAST('' AS CHAR(1))) returns 1 and SELECT DATALENGTH(CAST('' AS CHAR(10))) returns 10.
That means that if your delimiter variable is say, CHAR(1) - that will mess you up. Here's the function for you:
CREATE FUNCTION dbo.CountDelimiters(@string VARCHAR(8000), @delimiter VARCHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT DCount = MAX(DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,'')))
WHERE DATALENGTH(@delimiter) > 0;
Note that @delimiter is VARCHAR(1) and NOT a CHAR datatype.
The formula to count delimiters in @string is:
DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,''))
or
(DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,'')))/DATALENGTH(@delimiter) when dealing with delimiters longer than 1.
WHERE DATALENGTH(@delimiter) > 0 will force the function to ignore a NULL or blank value. This is known as a Startup Predicate.
Putting a MAX around DATALENGTH(@string)-LEN(REPLACE(@string,@delimiter,'')) forces the function to return a NULL value in the event you pass it a blank or NULL value.
This will return 10 for the number of spaces in my string:
SELECT f.DCount FROM dbo.CountDelimiters('one space two spaces three ', CHAR(32)) AS f;
Against a table you would use the function like this (note that I'm counting the number of times the letter "A" appears):
-- Sample Strings
DECLARE @table TABLE (SomeText VARCHAR(36));
INSERT @table VALUES('ABCABC'),('XXX'),('AAA'),(''),(NULL);
SELECT t.SomeText, f.DCount
FROM @table AS t
CROSS APPLY dbo.CountDelimiters(t.SomeText, 'A') AS f;
Which returns:
SomeText                             DCount
------------------------------------ -----------
ABCABC                               2
XXX                                  0
AAA                                  3
                                     0
NULL                                 NULL
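One caveat with the counting formula above: LEN also drops trailing spaces from the result of REPLACE, so the count can come out too high when the string ends in spaces and the delimiter is some other character. Here is a minimal sketch of a variant that uses DATALENGTH on both sides. This is my own assumption, not part of the function above, and it presumes single-byte VARCHAR data where one character is one byte; the sample values are made up.

```sql
-- Hypothetical comparison of the two formulas; @string and @delimiter are sample values.
DECLARE @string    VARCHAR(8000) = 'A B ',   -- A, space, B, space
        @delimiter VARCHAR(10)   = 'A';

SELECT
    -- LEN trims the trailing space left after REPLACE, so this returns 2
    LenBased = DATALENGTH(@string) - LEN(REPLACE(@string, @delimiter, '')),
    -- DATALENGTH keeps the trailing space, so this returns 1
    DataLengthBased = (DATALENGTH(@string) - DATALENGTH(REPLACE(@string, @delimiter, '')))
                      / DATALENGTH(@delimiter);
```

When the delimiter itself is a space, both formulas agree, because REPLACE removes every space before LEN gets a chance to trim anything.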

If a string has a character at the end, it is considered trailing, even if there are no other characters before it. The same logic applies to leading characters.
So ' ' can be considered an empty string ('') having a trailing space.
When I started using SQL, I also noticed that the LEN function ignores trailing spaces. I think (but I am not sure) that it has to do with the fact that LEN should also behave "correctly" when used with CHAR/NCHAR values. Unlike VARCHAR/NVARCHAR, CHAR/NCHAR values have a fixed width and are filled with trailing spaces automatically. So when you put the value 'abc' in a field/variable of type CHAR(5), the value becomes 'abc' padded with two trailing spaces, but the LEN function will still "correctly" return 3 in that case.
I consider this just to be a strange quirk of SQL.
Remark:
The DATALENGTH function will not ignore trailing spaces in VARCHAR/NVARCHAR values. But note that DATALENGTH returns the size in bytes of the field's value. So if you use Unicode data (NCHAR/NVARCHAR), the DATALENGTH function will return 6 for the value N'abc', because SQL Server stores these types as UTF-16, where each of these characters uses 2 bytes (supplementary characters use 4).
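To make the difference concrete, here is a small sketch comparing LEN and DATALENGTH across the types mentioned above. The variable names are mine, and the inline DECLARE initialisers assume SQL Server 2008 or later.

```sql
DECLARE @v VARCHAR(5)  = 'abc',
        @c CHAR(5)     = 'abc',    -- stored padded to 5 characters
        @n NVARCHAR(5) = N'abc';

SELECT LEN(@v)              AS LenVarchar,        -- 3
       LEN(@c)              AS LenChar,           -- 3, the padding is ignored
       DATALENGTH(@v)       AS BytesVarchar,      -- 3
       DATALENGTH(@c)       AS BytesChar,         -- 5, the padding is counted
       DATALENGTH(@n)       AS BytesNvarchar,     -- 6, two bytes per character here
       LEN(CHAR(32))        AS LenSingleSpace,    -- 0
       DATALENGTH(CHAR(32)) AS BytesSingleSpace;  -- 1
```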

Related

Is there a way to find values that contain only 0's and a symbol of any length?

I want to find strings of any length that contain only 0's and a symbol such as a / a . or a -
Examples include 0__0 and 000/00/00000 and .00000
Considering this sample data:
CREATE TABLE dbo.things(thing varchar(255));
INSERT dbo.things(thing) VALUES
('0__0'),('000/00/00000'),('00000'),('0123456');
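Try the following, which locates the first position of any character that is NOT a 0, a period, a forward slash, or an underscore. Inside the brackets only the leading ^ negates the set; a ^ anywhere else is treated as a literal character, so it appears just once. PATINDEX returns 0 if the pattern is not found.
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0./_]%', thing) = 0;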
Results:
thing
0__0
000/00/00000
00000
The opposite:
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0./_]%', thing) > 0;
Results:
thing
0123456
I can see a way of doing this, but it wouldn't perform well if you used it as a search criterion.
We are going to use the TRANSLATE function in SQL Server to replace the allowed characters (or symbols, as you've said) with zeros, and then eliminate the zeros. If the result is an empty string, there are two cases: either the value only had zeros and allowed characters, or it already was an empty string.
So, by checking for this and for non-empty strings, we can determine whether a value matches your criteria.
-- Test scenario
create table #example (something varchar(200) )
insert into #example(something) values
--Example cases from Stack Overflow
('0__0'),('000/00/00000'),('.00000'),
-- With something not allowed (don't know, just put a number)
('1230__0'),('000/04560/00000'),('.00000789'),
-- Just not allowed characters, zero, blank, and NULL
('1234567489'),('0'), (''),(null)
-- Shows the data, with a column to check if it matches your criteria
select *
from #example e
cross apply (
select case when
-- If it *must* have at least a zero
e.something like '%0%' and
-- Eliminates zeroes
replace(
-- Replaces the allowed characters with zero
translate(
e.something
,'_./'
,'000'
)
,'0'
,''
) = ''
then cast(1 as bit)
else cast(0 as bit)
end as doesItMatch
) as criteria(doesItMatch)
I really discourage you from using this as a search criteria.
-- Queries the table over this criteria.
-- This is going to compute over your entire table, so it can get very CPU intensive
select *
from #example e
where
-- If it *must* have at least a zero
e.something like '%0%' and
-- Eliminates zeroes
replace(
-- Replaces the allowed characters with zero
translate(
e.something
,'_./'
,'000'
)
,'0'
,''
) = ''
If you must use this as a search criterion, and it will be a common filter in your application, I suggest you create a new bit column to flag whether a row matches, and index it. That way the extra computational effort is spread across the inserts/updates/deletes, and the search queries won't overload the database.
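One way to get such a flag without maintaining it by hand is a persisted computed column. The sketch below is only an assumption about how this could look: the permanent table dbo.example and the index name are hypothetical, and the LIKE-based test is a rough equivalent of the TRANSLATE/REPLACE check rather than an exact copy (for instance, it does not treat a value with trailing spaces as a match).

```sql
-- Hypothetical permanent table standing in for #example above.
CREATE TABLE dbo.example (something VARCHAR(200) NULL);
GO
-- Persisted flag: 1 when the value has at least one zero and nothing
-- other than zeros and the allowed symbols; 0 otherwise (including NULL).
ALTER TABLE dbo.example
    ADD doesItMatch AS CAST(
            CASE WHEN something LIKE '%0%'
                  AND something NOT LIKE '%[^0_./]%'
                 THEN 1 ELSE 0 END AS BIT) PERSISTED;
GO
CREATE INDEX IX_example_doesItMatch ON dbo.example (doesItMatch);
GO
-- The search filters on the stored flag instead of recomputing it per row.
SELECT something FROM dbo.example WHERE doesItMatch = 1;
```

The flag is recalculated only when a row is inserted or its value is updated, which is the trade-off described above.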
What I got from the question is that the strings must contain both 0 and any combination of the special characters in the string.
If you have SQL Server 2017 and above, you can use translate() to replace multiple characters with a space and compare this with the empty string. Also you can use LIKE to enforce that both a 0 and any combination of the special character(s) appear at least once:
DECLARE @temp TABLE (val varchar(100))
INSERT INTO @temp VALUES
('0__0'), ('000/00/00000'), ('.00000'), ('w0hee/'), ('./')
SELECT *
FROM @temp
WHERE val LIKE '%0%' --must have at least one zero somewhere
AND val LIKE '%[_/.]%' --must have at least one special character(s) somewhere
AND TRANSLATE(val, '0./_', SPACE(4)) = '' --zeros and special characters translated to spaces (TRANSLATE needs both lists to be the same length); an all-space string equals the empty string
Creates output:
val
0__0
000/00/00000
.00000

does LEN() have bug?

It seems that LEN() ignores whitespaces at the right side of a variable.
declare @a varchar(100)
set @a = 'John   ' -- 'John' followed by three trailing spaces
print len(@a)
The above code prints 4 whereas it should be 7.
Is this a bug?
This is not a bug, this is the intended behavior. To quote the documentation:
LEN excludes trailing spaces. If that is a problem, consider using the DATALENGTH (Transact-SQL) function which does not trim the string
Not a bug; it's right there in the documentation:
Returns the number of characters of the specified string expression, excluding trailing spaces.
Please read the official documentation. It is expected behaviour. https://learn.microsoft.com/en-us/sql/t-sql/functions/len-transact-sql?view=sql-server-ver15#remarks
Thank you folks,
I really didn't know that LEN() ignores trailing spaces!
I think DATALENGTH() does not always return the correct answer.
For NVARCHAR() type it returns twice the number of characters in a string. In fact, it returns the total number of bytes that the string consumes in memory.
In my opinion, in order to get the correct length of a string value, we should use a function like the one below:
CREATE FUNCTION dbo.Length(@x NVARCHAR(MAX)) RETURNS INT
AS
BEGIN
RETURN (CASE WHEN RIGHT(@x, 1) = ' ' THEN LEN(REPLACE(@x, ' ', '$')) ELSE LEN(@x) END)
END
The function is not efficient. I know. But, it returns a correct answer at least.
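Another commonly used way to count characters including trailing spaces is to append a non-space sentinel character before calling LEN and then subtract one. A sketch with a sample value of my own (the variable name and the '.' sentinel are arbitrary choices):

```sql
DECLARE @x NVARCHAR(100) = N'John' + SPACE(3);  -- 'John' followed by three spaces

SELECT LEN(@x)            AS LenTrimmed,       -- 4, trailing spaces ignored
       LEN(@x + N'.') - 1 AS LenWithSentinel,  -- 7, the sentinel protects the spaces
       DATALENGTH(@x) / 2 AS BytesOverTwo;     -- 7 for NVARCHAR, two bytes per UTF-16 code unit
```

The sentinel version avoids scanning the string with REPLACE, and it does not depend on the byte width of the type the way DATALENGTH does.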

UNDOCUMENTED FEATURE when SELECT in VARCHAR with trailing whitespace SQL Server

I hope this is an interesting puzzle for an SQL expert out there.
When I run the following query, I would expect it to return no results.
-- Create a table variable Note: This same behaviour occurs in standard tables.
DECLARE @TestResults TABLE (Id int IDENTITY(1,1) NOT NULL, Foo VARCHAR(100) NOT NULL, About VARCHAR(1000) NOT NULL)
-- Add some test data Note: Without space, space prefix and space suffix
INSERT INTO @TestResults(Foo, About) VALUES('Bar', 'No spaces')
INSERT INTO @TestResults(Foo, About) VALUES('Bar ', 'Space Suffix')
INSERT INTO @TestResults(Foo, About) VALUES(' Bar', 'Space prefix')
-- SELECT statement that is filtered by a value without a space and also a value with a space suffix
SELECT
t.Foo
, t.About
FROM @TestResults t
WHERE t.Foo like 'Bar '
AND t.Foo like 'Bar'
AND t.Foo = 'Bar '
AND t.Foo = 'Bar'
The results return a single row:
[Foo] [About]
Bar Space Suffix
I need to know more about this behaviour and how I should work around it.
It is also worth noting that LEN(Foo) is odd too, as follows:
DECLARE @TestResults TABLE (Id int IDENTITY(1,1) NOT NULL, Foo VARCHAR(100) NOT NULL, About VARCHAR(1000) NOT NULL)
INSERT INTO @TestResults(Foo, About) VALUES('Bar', 'No spaces')
INSERT INTO @TestResults(Foo, About) VALUES('Bar ', 'Space Suffix')
INSERT INTO @TestResults(Foo, About) VALUES(' Bar', 'Space prefix')
SELECT
t.Foo
, LEN(Foo) [Length]
, t.About
FROM @TestResults t
Gives the following results:
[Foo]  [Length]  [About]
Bar    3         No spaces
Bar    3         Space Suffix
 Bar   4         Space prefix
Without any lateral thinking, what do I need to change my WHERE clause to in order to return 0 results as expected?
The answer is to add the following clause:
AND DATALENGTH(t.Foo) = DATALENGTH('Bar')
Running the following query...
DECLARE @Chars TABLE (CharNumber INT NOT NULL)
DECLARE @CharNumber INT = 0
WHILE(@CharNumber <= 255)
BEGIN
INSERT INTO @Chars(CharNumber) VALUES(@CharNumber)
SET @CharNumber = @CharNumber + 1
END
SELECT
CharNumber
, IIF('Test' = 'Test' + CHAR(CharNumber),1,0) ['Test' = 'Test' + CHAR(CharNumber)]
, IIF('Test' LIKE 'Test' + CHAR(CharNumber),1,0) ['Test' LIKE 'Test' + CHAR(CharNumber)]
, IIF(LEN('Test') = LEN('Test' + CHAR(CharNumber)),1,0) [LEN('Test') = LEN('Test' + CHAR(CharNumber))]
, IIF(DATALENGTH('Test') = DATALENGTH('Test' + CHAR(CharNumber)),1,0) [DATALENGTH('Test') = DATALENGTH('Test' + CHAR(CharNumber))]
FROM @Chars
WHERE ('Test' = 'Test' + CHAR(CharNumber))
OR ('Test' LIKE 'Test' + CHAR(CharNumber))
OR (LEN('Test') = LEN('Test' + CHAR(CharNumber)))
ORDER BY CharNumber
...produces the following results...
CharNumber 'Test' = 'Test' + CHAR(CharNumber) 'Test' LIKE 'Test' + CHAR(CharNumber) LEN('Test') = LEN('Test' + CHAR(CharNumber)) DATALENGTH('Test') = DATALENGTH('Test' + CHAR(CharNumber))
0 1 1 0 0
32 1 0 1 0
37 0 1 0 0
DATALENGTH can be used to test the equality of two VARCHAR, therefore the original query can be corrected as follows:
-- Create a table variable Note: This same behaviour occurs in standard tables.
DECLARE @TestResults TABLE (Id int IDENTITY(1,1) NOT NULL, Foo VARCHAR(100) NOT NULL, About VARCHAR(1000) NOT NULL)
-- Add some test data Note: Without space, space prefix and space suffix
INSERT INTO @TestResults(Foo, About) VALUES('Bar', 'No spaces')
INSERT INTO @TestResults(Foo, About) VALUES('Bar ', 'Space Suffix')
INSERT INTO @TestResults(Foo, About) VALUES(' Bar', 'Space prefix')
-- SELECT statement that is filtered by a value without a space and also a value with a space suffix
SELECT
t.Foo
, t.About
FROM @TestResults t
WHERE t.Foo like 'Bar '
AND t.Foo like 'Bar'
AND t.Foo = 'Bar '
AND t.Foo = 'Bar'
AND DATALENGTH(t.Foo) = DATALENGTH('Bar') -- Additional clause
I also made a function to be used instead of =
ALTER FUNCTION dbo.fVEQ( @VarCharA VARCHAR(MAX), @VarCharB VARCHAR(MAX) )
RETURNS BIT
WITH SCHEMABINDING
AS
BEGIN
-- Added by WonderWorker on 18th March 2020
DECLARE @Result BIT = IIF(
(@VarCharA = @VarCharB AND DATALENGTH(@VarCharA) = DATALENGTH(@VarCharB))
, 1, 0)
RETURN @Result
END
..Here is a test for all 256 characters used as trailing characters to prove that it works..
-- Test fVEQ with all 256 characters
DECLARE @Chars TABLE (CharNumber INT NOT NULL)
DECLARE @CharNumber INT = 0
WHILE(@CharNumber <= 255)
BEGIN
INSERT INTO @Chars(CharNumber) VALUES(@CharNumber)
SET @CharNumber = @CharNumber + 1
END
SELECT
CharNumber
, dbo.fVEQ('Bar','Bar' + CHAR(CharNumber)) [fVEQ Trailing Char Test]
, dbo.fVEQ('Bar','Bar') [fVEQ Same test]
, dbo.fVEQ('Bar',CHAR(CharNumber) + 'Bar') [fVEQ Leading Char Test]
FROM @Chars
WHERE (dbo.fVEQ('Bar','Bar' + CHAR(CharNumber)) = 1)
AND (dbo.fVEQ('Bar','Bar') = 0)
AND (dbo.fVEQ('Bar',CHAR(CharNumber) + 'Bar') = 1)
The reason why trailing whitespace is disregarded in string comparison, is because of the notion of fixed-length string fields, in which any content shorter than the fixed length is automatically right-padded with spaces. Such fixed-length fields cannot distinguish meaningful trailing spaces from padding.
The rationale for why fixed-length string fields even exist, is that they improve performance significantly in many cases, and when SQL was designed it was common for character-based terminals (which usually treated trailing spaces equivalent to padding), reports printed with monospaced fonts (which used trailing spaces for padding and alignment), and data storage and exchange formats (which used fixed-length fields in place of extensive and costly delimiters and complicated parsing logic), to all be oriented around fixed-length fields, so there was a tight integration with this concept at all stages of processing.
When comparing two fixed-length fields of the same fixed length, a literal comparison would of course be possible and would produce correct results.
But when comparing a fixed-length field of a given fixed length, to a fixed-length field of a different fixed length, the desired behaviour would never be to include the trailing spaces in the comparison, since two such fields could never match literally simply by virtue of their differing fixed lengths. The shorter field could be cast and padded to the length of the longer (at least conceptually if not physically), but the trailing space would still then be considered as padding rather than as meaningful.
When comparing a fixed-length field to a variable-length field, the desired behaviour is also probably never to include trailing spaces in the comparison. More complicated approaches which attempt to attribute meaning to trailing spaces in the variable-length side of the comparison, would only come at the cost of slower comparison logic and additional conceptual complexity and potential for error.
In terms of why variable-length to variable-length comparisons ignore trailing spaces, since here spaces can be meaningful in principle, the rationale is probably maintaining consistency in comparison behaviour as when fixed-length fields are involved, and the avoidance of the most common kind of error, since trailing spaces are spurious in databases far more often than they are meaningful.
Nowadays, a database system designed in every respect from scratch would probably forsake fixed-length fields, and probably perform all comparisons literally, leaving the developer to deal explicitly with spurious trailing spaces, but in my experience this would result in extra development effort and far more frequent error than the current arrangement in SQL, where errors in program logic involving the silent disregard of trailing spaces usually only occurs when designing complex string-shredding logic to be used against un-normalised data (which is a kind of data that SQL is specifically not optimised for handling).
So to be clear, this is not an undocumented feature, but a prominent feature that exists by design.
If you change the query to
SELECT
Foo
, About
, CASE WHEN Foo LIKE 'Bar ' THEN 'T' ELSE 'F' END As Like_Bar_Space
, CASE WHEN Foo LIKE 'Bar' THEN 'T' ELSE 'F' END As Like_Bar
, CASE WHEN Foo = 'Bar ' THEN 'T' ELSE 'F' END As EQ_Bar_Space
, CASE WHEN Foo = 'Bar' THEN 'T' ELSE 'F' END As EQ_Bar
FROM @TestResults
it gives you a better overview, as you see the result of the different conditions separately:
Foo    About         Like_Bar_Space  Like_Bar  EQ_Bar_Space  EQ_Bar
------ ------------  --------------  --------  ------------  ------
Bar    No spaces     F               T         T             T
Bar    Space Suffix  T               T         T             T
 Bar   Space prefix  F               F         F             F
It looks like equals = ignores trailing spaces in both searched string and pattern. LIKE, however, does not ignore the trailing space in the pattern but ignores an extra trailing space in the searched string. Leading spaces are never ignored.
I don't know how wrong entries got in there, but you can fix them with
UPDATE @TestResults SET Foo = TRIM(Foo)
You can make a trailing space sensitive test with:
WHERE t.Foo + ';' = pattern + ';'
You can make a trailing space insensitive test with:
WHERE RTRIM(t.Foo) = RTRIM(pattern)
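A short sketch of how the plain comparison differs from the trailing-space-sensitive variants discussed in this thread (the variable and column names are only illustrative):

```sql
DECLARE @a VARCHAR(10) = 'Bar',
        @b VARCHAR(10) = 'Bar' + SPACE(1);  -- 'Bar' with one trailing space

SELECT
    -- 1: the trailing space is ignored by =
    CASE WHEN @a = @b THEN 1 ELSE 0 END AS PlainEquals,
    -- 0: the appended ';' turns the trailing space into an inner character
    CASE WHEN @a + ';' = @b + ';' THEN 1 ELSE 0 END AS SentinelEquals,
    -- 0: the byte lengths differ (3 versus 4)
    CASE WHEN @a = @b AND DATALENGTH(@a) = DATALENGTH(@b) THEN 1 ELSE 0 END AS DataLengthEquals;
```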

Passing value in a function without quote T-SQL / SQL Server 2012?

I need assistance with a function in SQL Server 2012 that I created to check the input value. If the function detects a numeric value it should return 0; if it detects a character it should return 1.
But I get 2 different results for the same number depending on whether I pass it with or without quotes.
select dbo.IS_ALIEN('56789')
returns 0
select dbo.IS_ALIEN(56789)
returns 1 (I need to return 0)
This is my function:
ALTER FUNCTION [dbo].[IS_ALIEN]
(@alienNAIC CHAR(1))
RETURNS NUMERIC(10,0)
AS
BEGIN
DECLARE @nNum NUMERIC(1,0);
BEGIN
SET @nNum = ISNUMERIC(@alienNAIC)
END
BEGIN
IF @nNum = 1
RETURN 0;
END
RETURN 1;
END
Same concept for:
select dbo.IS_ALIEN('AA-11990043')
returns 1
or
select dbo.IS_ALIEN(NULL)
returns 1 (I need it to return 0)
I'm using Oracle function reference (below code is just reference from old database):
create or replace FUNCTION "IS_ALIEN"
( alienNAIC IN char )
RETURN NUMBER IS
nNum number;
BEGIN
SELECT MOD(alienNAIC, 2) into nNum FROM dual;
return 0;
EXCEPTION
WHEN INVALID_NUMBER THEN
return 1;
END;
But a T-SQL function doesn't allow exception handling, so I tried to convert it as closely as I could.
I suggest you use something like this (I've trimmed it down somewhat):
ALTER FUNCTION [dbo].[IS_ALIEN](@alienNAIC NVARCHAR(10))
RETURNS INT -- NOTE: You could also return tinyint or bit
AS
BEGIN
IF ISNUMERIC(@alienNAIC) = 1
RETURN 0;
RETURN 1;
END
The trouble with what you were trying is that there's an implicit cast to CHAR(1), the result of which is definitely not numeric as @Joel pointed out:
SELECT CAST(0 As CHAR(1)) -- returns character '0', and ISNUMERIC('0') = 1
SELECT CAST(9 As CHAR(1)) -- returns character '9', and ISNUMERIC('9') = 1
SELECT CAST(12345 As CHAR(1)) -- any number over 9 returns character '*', and ISNUMERIC('*') = 0
It's an odd implicit casting case I hadn't seen before. By making the parameter an NVARCHAR (assumes possible future double-byte input), strings will be correctly checked and integers passed in will be implicitly cast as NVARCHAR, and the ISNUMERIC check will succeed.
EDIT
Re-reading the question and comments, it looks like you want to identify a particular string syntax to determine if something is an "alien" or not. If you're comfortable changing business logic to fix what apparently is a poor legacy implementation, you could consider something like this instead:
ALTER FUNCTION [dbo].[temp](@alienNAIC NVARCHAR(10))
RETURNS INT -- NOTE: You could also return tinyint or bit
AS
BEGIN
IF @alienNAIC like 'AA-%' AND ISNUMERIC(RIGHT(@alienNAIC, LEN(@alienNAIC) - 3)) = 1
RETURN 1; -- notice this is now 1 instead of 0, we're doing a specific check for 'AA-nnnnn...'
RETURN 0;
END
Note that this should be thoroughly tested against legacy data if it's ever to interact with it - you never know what rubbish data a poorly written legacy system has left behind. Fixing this could well break other things. If you do make this change, document it well.
If you need to check just the first character, you can do it like this:
CREATE FUNCTION [dbo].[IS_ALIEN]
(@alienNAIC VARCHAR(200))
RETURNS TINYINT
AS
BEGIN
IF LEFT(@alienNAIC,1) BETWEEN '0' AND '9' RETURN 0; -- starts with a digit, so not alien (0, as the question asks)
RETURN 1
END
GO
It seems like you are trying to check whether a string starts with a non-numeric character. Such pattern matches can be performed using LIKE, eg
declare @var nvarchar(10)='A56789'
select
case when @var LIKE '[0-9]%'
then 0 else 1
end AS IsAlien
Returns
1
Both declare @var nvarchar(10)=56789 and declare @var int=56789 return 0 because the number is implicitly converted to a string.
The expression is so short that you probably don't need to convert it to a function. If you do, it could be as simple as :
create FUNCTION [dbo].[IS_ALIEN] (@alienNAIC varchar(200))
RETURNS INT
begin
return case when @alienNAIC LIKE '[0-9]%'
then 0 else 1
end
end
If you want to perform the check in a WHERE clause, just use LIKE, not any function, eg:
WHERE alienNAIC NOT LIKE '[0-9]%'
This particular pattern is just a range search that covers all values between 0 and 9. The server can use an index that covers the text column to quickly identify the matches. It can't do so when it has to check the result of a function. It will have to calculate the value for every single row before filtering.
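Since the question targets SQL Server 2012, TRY_CONVERT (introduced in that version) may be the closest analogue to the Oracle exception handler: it returns NULL instead of raising an error when a conversion fails. The sketch below assumes that anything convertible to FLOAT counts as numeric and that NULL should return 0, as the question asks. The function name IS_ALIEN_TRY and the NVARCHAR(50) parameter size are made up for illustration.

```sql
CREATE FUNCTION dbo.IS_ALIEN_TRY (@alienNAIC NVARCHAR(50))
RETURNS INT
AS
BEGIN
    -- NULL input and anything convertible to FLOAT count as "not alien" (0).
    RETURN CASE
               WHEN @alienNAIC IS NULL THEN 0
               WHEN TRY_CONVERT(FLOAT, @alienNAIC) IS NOT NULL THEN 0
               ELSE 1
           END;
END
GO
SELECT dbo.IS_ALIEN_TRY('56789')       AS Quoted,     -- 0
       dbo.IS_ALIEN_TRY(56789)         AS Unquoted,   -- 0, the int is implicitly converted to a string
       dbo.IS_ALIEN_TRY('AA-11990043') AS Alien,      -- 1
       dbo.IS_ALIEN_TRY(NULL)          AS NullInput;  -- 0
```

Whether FLOAT is the right target type depends on what the legacy data looks like, so treat that choice as a placeholder.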

Find index of last occurrence of a sub-string using T-SQL

Is there a straightforward way of finding the index of the last occurrence of a string using SQL? I am using SQL Server 2000 right now. I basically need the functionality that the .NET System.String.LastIndexOf method provides. A little googling revealed this - Function To Retrieve Last Index - but that does not work if you pass in a "text" column expression. Other solutions found elsewhere work only so long as the text you are searching for is 1 character long.
I will probably have to cook a function up. If I do so, I will post it here so you folks can look at it and maybe make use of.
Straightforward way? No, but I've used the reverse. Literally.
In prior routines, to find the last occurrence of a given string, I used the REVERSE() function, followed by CHARINDEX, followed again by REVERSE to restore the original order. For instance:
SELECT
mf.name
,mf.physical_name
,reverse(left(reverse(physical_name), charindex('\', reverse(physical_name)) -1))
from sys.master_files mf
shows how to extract the actual database file names from their "physical names", no matter how deeply nested in subfolders. This does search for only one character (the backslash), but you can build on this for longer search strings.
The only downside is, I don't know how well this will work on TEXT data types. I've been on SQL 2005 for a few years now, and am no longer conversant with working with TEXT -- but I seem to recall you could use LEFT and RIGHT on it?
Philip
The simplest way is....
REVERSE(SUBSTRING(REVERSE([field]),0,CHARINDEX('[expr]',REVERSE([field]))))
If you are using SQL Server 2005 or above, calling the REVERSE function many times is detrimental to performance; the code below is more efficient.
DECLARE @FilePath VARCHAR(50) = 'My\Super\Long\String\With\Long\Words'
DECLARE @FindChar VARCHAR(1) = '\'
-- text before last slash
SELECT LEFT(@FilePath, LEN(@FilePath) - CHARINDEX(@FindChar,REVERSE(@FilePath))) AS Before
-- text after last slash
SELECT RIGHT(@FilePath, CHARINDEX(@FindChar,REVERSE(@FilePath))-1) AS After
-- the position of the last slash
SELECT LEN(@FilePath) - CHARINDEX(@FindChar,REVERSE(@FilePath)) + 1 AS LastOccuredAt
You are limited to small list of functions for text data type.
All I can suggest is start with PATINDEX, but work backwards from DATALENGTH-1, DATALENGTH-2, DATALENGTH-3 etc until you get a result or end up at zero (DATALENGTH-DATALENGTH)
This really is something that SQL Server 2000 simply can't handle.
Edit for other answers : REVERSE is not on the list of functions that can be used with text data in SQL Server 2000
DECLARE @FilePath VARCHAR(50) = 'My\Super\Long\String\With\Long\Words'
DECLARE @FindChar VARCHAR(1) = '\'
SELECT LEN(@FilePath) - CHARINDEX(@FindChar,REVERSE(@FilePath)) AS LastOccuredAt
Old but still valid question, so here's what I created based on the info provided by others here.
create function fnLastIndexOf(@text varChar(max),@char varchar(1))
returns int
as
begin
return len(@text) - charindex(@char, reverse(@text)) + 1 -- 1-based position of the last occurrence
end
REVERSE(SUBSTRING(REVERSE(ap_description),CHARINDEX('.',REVERSE(ap_description)),len(ap_description)))
worked better for me
This worked very well for me.
REVERSE(SUBSTRING(REVERSE([field]), CHARINDEX(REVERSE('[expr]'), REVERSE([field])) + DATALENGTH('[expr]'), DATALENGTH([field])))
Hmm, I know this is an old thread, but a tally table could do this in SQL2000 (or any other database):
DECLARE @str CHAR(21),
@delim CHAR(1)
SELECT @str = 'Your-delimited-string',
@delim = '-'
SELECT
MAX(n) As 'position'
FROM
dbo._Tally
WHERE
substring(@str, _Tally.n, 1) = @delim
A tally table is just a table of incrementing numbers.
The substring(@str, _Tally.n, 1) = @delim gets the position of each delimiter, then you just get the maximum position in that set.
Tally tables are awesome. If you haven't used them before, there is a good article on SQL Server Central.
*EDIT: Removed n <= LEN(TEXT_FIELD), as you can't use LEN() on the TEXT type. As long as the substring(...) = @delim remains though the result is still correct.
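For anyone who doesn't have a tally table yet, here is one way to build the dbo._Tally table used above. It relies on ROW_NUMBER, which is not available on SQL Server 2000, so this sketch assumes SQL Server 2005 or later; the 8000-row size and the index name are arbitrary choices of mine.

```sql
-- Builds dbo._Tally with the numbers 1..8000 in a column named n.
-- The cross join is only there to produce enough rows to number.
SELECT TOP (8000)
       n = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT)
INTO   dbo._Tally
FROM   sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- A unique clustered index keeps lookups by n cheap.
CREATE UNIQUE CLUSTERED INDEX IX_Tally_n ON dbo._Tally (n);
```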
This answer uses MS SQL Server 2008 (I don't have access to MS SQL Server 2000), but the way I see it, according to the OP there are 3 situations to take into consideration. From what I've tried, no answer here covers all 3 of them:
1. Return the last index of a search character in a given string.
2. Return the last index of a search sub-string (more than just a single character) in a given string.
3. If the search character or sub-string is not in the given string, return 0.
The function I came up with takes 2 parameters:
@String NVARCHAR(MAX) : The string to be searched
@FindString NVARCHAR(MAX) : Either a single character or a sub-string to get the last index of in @String
It returns an INT that is either the positive index of @FindString in @String or 0 meaning that @FindString is not in @String
Here's an explanation of what the function does:
1. Initializes @ReturnVal to 0 indicating that @FindString is not in @String
2. Checks the index of the @FindString in @String by using CHARINDEX()
3. If the index of @FindString in @String is 0, @ReturnVal is left as 0
4. If the index of @FindString in @String is > 0, @FindString is in @String so it calculates the last index of @FindString in @String by using REVERSE()
5. Returns @ReturnVal which is either a positive number that is the last index of @FindString in @String or 0 indicating that @FindString is not in @String
Here's the create function script (copy and paste ready):
CREATE FUNCTION [dbo].[fn_LastIndexOf]
(@String NVARCHAR(MAX)
, @FindString NVARCHAR(MAX))
RETURNS INT
AS
BEGIN
DECLARE @ReturnVal INT = 0
IF CHARINDEX(@FindString,@String) > 0
SET @ReturnVal = (SELECT LEN(@String) -
(CHARINDEX(REVERSE(@FindString),REVERSE(@String)) +
LEN(@FindString)) + 2)
RETURN @ReturnVal
END
Here's a little bit that conveniently tests the function:
DECLARE @TestString NVARCHAR(MAX) = 'My_sub2_Super_sub_Long_sub1_String_sub_With_sub_Long_sub_Words_sub2_'
, @TestFindString NVARCHAR(MAX) = 'sub'
SELECT dbo.fn_LastIndexOf(@TestString,@TestFindString)
I have only run this on MS SQL Server 2008 because I don't have access to any other version but from what I've looked into this should be good for 2008+ at least.
Enjoy.
Reverse both your string and your substring, then search for the first occurrence.
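A worked sketch of that idea, using sample values of my own. It uses DATALENGTH rather than LEN so trailing spaces are not dropped, which assumes single-byte VARCHAR data.

```sql
DECLARE @haystack VARCHAR(100) = 'a_sub_b_sub_c',
        @needle   VARCHAR(10)  = '_sub_';

-- Position of the reversed needle in the reversed haystack (0 when absent).
DECLARE @rev INT = CHARINDEX(REVERSE(@needle), REVERSE(@haystack));

-- Convert the reversed position back to a 1-based start position.
SELECT CASE WHEN @rev = 0 THEN 0
            ELSE DATALENGTH(@haystack) - @rev - DATALENGTH(@needle) + 2
       END AS LastIndex;  -- 8
```

Here the reversed needle is found at position 2 of the reversed 13-character string, and 13 - 2 - 5 + 2 = 8 is where the last '_sub_' starts in the original.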
If you want to work from the last space in a string of words, you can use this expression
RIGHT(name, CHARINDEX(' ', REVERSE(name), 0)) to return the last word in the string. This is helpful if you want to parse out the last name of a full name that includes initials for the first and/or middle name.
Some of the other answers return an actual string whereas I had more need to know the actual index int. And the answers that do that seem to over-complicate things. Using some of the other answers as inspiration, I did the following...
First, I created a function:
CREATE FUNCTION [dbo].[LastIndexOf] (@stringToFind varchar(max), @stringToSearch varchar(max))
RETURNS INT
AS
BEGIN
RETURN (LEN(@stringToSearch) - CHARINDEX(@stringToFind,REVERSE(@stringToSearch))) + 1
END
GO
Then, in your query you can simply do this:
declare @stringToSearch varchar(max) = 'SomeText: SomeMoreText: SomeLastText'
select dbo.LastIndexOf(':', @stringToSearch)
The above should return 23 (the last index of ':')
Hope this made it a little easier for someone!
I realize this is a several years old question, but...
On Access 2010, you can use InStrRev() to do this. Hope this helps.
I know that it will be inefficient but have you considered casting the text field to varchar so that you can use the solution provided by the website you found? I know that this solution would create issues as you could potentially truncate the record if the length in the text field overflowed the length of your varchar (not to mention it would not be very performant).
Since your data is inside a text field (and you are using SQL Server 2000) your options are limited.
@indexOf = <whatever characters you are searching for in your string>
@LastIndexOf = LEN([MyField]) - CHARINDEX(@indexOf, REVERSE([MyField]))
Haven't tested, it might be off by one because of zero index, but works in SUBSTRING function when chopping off from @indexOf characters to end of your string
SUBSTRING([MyField], 0, @LastIndexOf)
This code works even if the substring contains more than 1 character.
DECLARE @FilePath VARCHAR(100) = 'My_sub_Super_sub_Long_sub_String_sub_With_sub_Long_sub_Words'
DECLARE @FindSubstring VARCHAR(5) = '_sub_'
-- Shows text before last substring
SELECT LEFT(@FilePath, LEN(@FilePath) - CHARINDEX(REVERSE(@FindSubstring), REVERSE(@FilePath)) - LEN(@FindSubstring) + 1) AS Before
-- Shows text after last substring
SELECT RIGHT(@FilePath, CHARINDEX(REVERSE(@FindSubstring), REVERSE(@FilePath)) -1) AS After
-- Shows the position of the last substring
SELECT LEN(@FilePath) - CHARINDEX(REVERSE(@FindSubstring), REVERSE(@FilePath)) AS LastOccuredAt
I needed to find the nth last position of a backslash in a folder path.
Here is my solution.
/*
http://stackoverflow.com/questions/1024978/find-index-of-last-occurrence-of-a-sub-string-using-t-sql/30904809#30904809
DROP FUNCTION dbo.GetLastIndexOf
*/
CREATE FUNCTION dbo.GetLastIndexOf
(
@expressionToFind VARCHAR(MAX)
,@expressionToSearch VARCHAR(8000)
,@Occurrence INT = 1 -- Find the nth last
)
RETURNS INT
AS
BEGIN
SELECT @expressionToSearch = REVERSE(@expressionToSearch)
DECLARE @LastIndexOf INT = 0
,@IndexOfPartial INT = -1
,@OriginalLength INT = LEN(@expressionToSearch)
,@Iteration INT = 0
WHILE (1 = 1) -- Poor man's do-while
BEGIN
SELECT @IndexOfPartial = CHARINDEX(@expressionToFind, @expressionToSearch)
IF (@IndexOfPartial = 0)
BEGIN
IF (@Iteration = 0) -- Need to compensate for dropping out early
BEGIN
SELECT @LastIndexOf = @OriginalLength + 1
END
BREAK;
END
IF (@Occurrence > 0)
BEGIN
SELECT @expressionToSearch = SUBSTRING(@expressionToSearch, @IndexOfPartial + 1, LEN(@expressionToSearch) - @IndexOfPartial - 1)
END
SELECT @LastIndexOf = @LastIndexOf + @IndexOfPartial
,@Occurrence = @Occurrence - 1
,@Iteration = @Iteration + 1
IF (@Occurrence = 0) BREAK;
END
SELECT @LastIndexOf = @OriginalLength - @LastIndexOf + 1 -- Invert due to reverse
RETURN @LastIndexOf
END
GO
GRANT EXECUTE ON GetLastIndexOf TO public
GO
Here are my test cases which pass
SELECT dbo.GetLastIndexOf('f','123456789\123456789\', 1) as indexOf -- expect 0 (no instances)
SELECT dbo.GetLastIndexOf('\','123456789\123456789\', 1) as indexOf -- expect 20
SELECT dbo.GetLastIndexOf('\','123456789\123456789\', 2) as indexOf -- expect 10
SELECT dbo.GetLastIndexOf('\','1234\6789\123456789\', 3) as indexOf -- expect 5
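As a usage sketch, the function can be combined with LEFT to strip trailing path segments. The sample path and the column alias below are made up for illustration; the call assumes dbo.GetLastIndexOf has been created as shown above.

```sql
-- Hypothetical sample path; the backslashes are at positions 3, 8, 16 and 21
DECLARE @Path VARCHAR(8000) = 'C:\Data\Reports\2015\summary.csv';

-- Keep everything up to and including the 2nd-last backslash (position 16)
SELECT LEFT(@Path, dbo.GetLastIndexOf('\', @Path, 2)) AS TwoSegmentsStripped;
-- returns C:\Data\Reports\
```

If the delimiter does not occur at all, the function returns 0 and LEFT returns an empty string, so callers may want to check for 0 first.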
To get the part before the last occurrence of the delimiter (works only for NVARCHAR due to DATALENGTH usage):
DECLARE @Fullstring NVARCHAR(30) = N'12.345.67890.ABC';
DECLARE @Delimiter CHAR(1) = '.';
SELECT SUBSTRING(@Fullstring, 1, DATALENGTH(@Fullstring)/2 - CHARINDEX(@Delimiter, REVERSE(@Fullstring)));
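The answer does not say why it uses DATALENGTH instead of LEN. One likely reason is that LEN ignores trailing spaces, while DATALENGTH counts every byte (two per character for ordinary NVARCHAR data). A minimal sketch of the difference, using a made-up variable name:

```sql
-- Hypothetical value with one trailing space
DECLARE @Padded NVARCHAR(30) = N'12.345 ';

SELECT LEN(@Padded)            AS LenResult,   -- 6: the trailing space is ignored
       DATALENGTH(@Padded) / 2 AS CharCount;   -- 7: 14 bytes / 2 bytes per character
```

For VARCHAR data in a single-byte collation the same idea applies without the division by 2.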
This answer meets the requirements of the OP. Specifically, it allows the needle to be more than a single character, and it does not generate an error when the needle is not found in the haystack. It seemed to me that most (all?) of the other answers did not handle those edge cases. Beyond that, I added the "starting position" argument provided by the native MS SQL Server CharIndex function. I tried to mirror the specification for CharIndex exactly, except that it processes right to left instead of left to right; e.g. I return null if either needle or haystack is null, and I return zero if the needle is not found in the haystack.
One thing that I could not get around is that the third parameter is optional in the built-in function. With SQL Server user-defined functions, all parameters must be provided in the call unless the function is called using "EXEC". While the third parameter must be included in the parameter list, you can provide the keyword "default" as a placeholder for it without having to give it a value (see examples below). Since it is easier to remove the third parameter from this function if it is not wanted than to add it if it is needed, I have included it here as a starting point.
create function dbo.lastCharIndex(
    @needle as varchar(max),
    @haystack as varchar(max),
    @offset as bigint=1
) returns bigint as begin
    declare @position as bigint
    if @needle is null or @haystack is null return null
    set @position=charindex(reverse(@needle),reverse(@haystack),@offset)
    if @position=0 return 0
    return (len(@haystack)-(@position+len(@needle)-1))+1
end
go
select dbo.lastCharIndex('xyz','SQL SERVER 2000 USES ANSI SQL',default) -- returns 0
select dbo.lastCharIndex('SQL','SQL SERVER 2000 USES ANSI SQL',default) -- returns 27
select dbo.lastCharIndex('SQL','SQL SERVER 2000 USES ANSI SQL',1) -- returns 27
select dbo.lastCharIndex('SQL','SQL SERVER 2000 USES ANSI SQL',11) -- returns 1
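The EXEC form mentioned above can be sketched as follows. It assumes dbo.lastCharIndex has been created as shown; the variable and alias names are made up for illustration. With EXEC and named parameters, @offset can simply be left out and its default of 1 applies.

```sql
DECLARE @result BIGINT;

-- @offset is omitted, so the function's default value (1) is used
EXEC @result = dbo.lastCharIndex
    @needle = 'SQL',
    @haystack = 'SQL SERVER 2000 USES ANSI SQL';

SELECT @result AS lastPosition; -- 27, same as passing the keyword default
```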
I came across this thread while searching for a solution to a similar problem: the exact same requirement, but for a different kind of database that also lacks the REVERSE function.
In my case this was an OpenEdge (Progress) database, which has a slightly different syntax. It does provide the INSTR function that most Oracle-style databases offer.
So I came up with the following code:
SELECT
INSTR(foo.filepath, '/',1, LENGTH(foo.filepath) - LENGTH( REPLACE( foo.filepath, '/', ''))) AS IndexOfLastSlash
FROM foo
However, in my specific situation (the OpenEdge (Progress) database) this did not produce the desired behaviour, because replacing the character with an empty string gave the same length as the original string. This doesn't make much sense to me, but I was able to work around the problem with the code below, which replaces each '/' with two characters and counts how much longer the string becomes:
SELECT
INSTR(foo.filepath, '/',1, LENGTH( REPLACE( foo.filepath, '/', 'XX')) - LENGTH(foo.filepath)) AS IndexOfLastSlash
FROM foo
Now I understand that this code won't solve the problem for T-SQL, because T-SQL has no built-in alternative to the INSTR function that offers the occurrence parameter.
Just to be thorough, I'll add the code needed to create such a scalar function in SQL Server, so it can be used in the same way as in the examples above.
-- Drop the function if it already exists
IF OBJECT_ID('INSTR', 'FN') IS NOT NULL
DROP FUNCTION INSTR
GO
-- User-defined function to implement Oracle INSTR in SQL Server
CREATE FUNCTION INSTR (@str VARCHAR(8000), @substr VARCHAR(255), @start INT, @occurrence INT)
RETURNS INT
AS
BEGIN
    DECLARE @found INT = @occurrence,
            @pos INT = @start;
    WHILE 1=1
    BEGIN
        -- Find the next occurrence
        SET @pos = CHARINDEX(@substr, @str, @pos);
        -- Nothing found
        IF @pos IS NULL OR @pos = 0
            RETURN @pos;
        -- The required occurrence found
        IF @found = 1
            BREAK;
        -- Prepare to find another occurrence
        SET @found = @found - 1;
        SET @pos = @pos + 1;
    END
    RETURN @pos;
END
GO
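With that function in place, the occurrence-counting trick from the OpenEdge example can be sketched in T-SQL. The sample path and variable name are made up for illustration, and the call assumes the function was created in the dbo schema (scalar functions must be called with a schema-qualified name).

```sql
-- Hypothetical sample path; the slashes are at positions 1, 5, 9 and 13
DECLARE @filepath VARCHAR(8000) = '/var/log/app/server.log';

-- Replacing each '/' with 'XX' lengthens the string by one per slash,
-- so the length difference is the number of slashes (4 here)
SELECT dbo.INSTR(@filepath, '/', 1,
                 LEN(REPLACE(@filepath, '/', 'XX')) - LEN(@filepath)) AS IndexOfLastSlash;
-- returns 13
```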
To state the obvious: when the REVERSE function is available you do not need to create this scalar function, and you can get the required result like this:
SELECT
LEN(foo.filepath) - CHARINDEX('/', REVERSE(foo.filepath))+1 AS LastIndexOfSlash
FROM foo
Handles looking for something > 1 char long.
Feel free to increase the parameter sizes if you like.
Couldn't resist posting.
drop function if exists lastIndexOf
go
create function lastIndexOf(@searchFor varchar(100), @searchIn varchar(500))
returns int
as
begin
    -- DATALENGTH rather than LEN, because LEN ignores trailing spaces (LEN(' ') = 0)
    declare @forLen int = DATALENGTH(@searchFor), @inLen int = DATALENGTH(@searchIn)
    if @forLen = 0 or @forLen > @inLen return 0
    declare @pos int = CHARINDEX(REVERSE(@searchFor), REVERSE(@searchIn))
    if @pos = 0 return 0 -- not found
    return @inLen - @pos - @forLen + 2 -- 1-based start of the last occurrence
end
go
and tests
select dbo.lastIndexof('greg','greg greg asdflk; greg sadf' ) -- 19
select dbo.lastIndexof('greg','greg greg asdflk; grewg sadf' ) -- 6
select dbo.lastIndexof(' ','greg greg asdflk; grewg sadf' ) -- 24
This thread has been going for a while. I'll offer a solution covering different bases, with an example:
declare @aStringData varchar(100) = 'The quick brown/fox jumps/over the/lazy dog.pdf'
/*
The quick brown/fox jumps/over the/lazy dog.pdf
fdp.god yzal/eht revo/spmuj xof/nworb kciuq ehT
*/
select
    Len(@aStringData) - CharIndex('/', Reverse(@aStringData)) + 1 [Char Index],
    -- Get left side of character, without the character '/'
    Left(@aStringData, Len(@aStringData) - CharIndex('/', Reverse(@aStringData))) [Left excluding char],
    -- Get left side of character, including the character '/'
    Left(@aStringData, Len(@aStringData) - CharIndex('/', Reverse(@aStringData)) + 1) [Left including char],
    -- Get right side of character, without the character '/'
    Right(@aStringData, CharIndex('/', Reverse(@aStringData)) - 1) [Right excluding char]
To get the character position we need to reverse the string, because CharIndex finds the first occurrence. Remember that, because we are reversing, the CharIndex cursor lands on the other side of the character we're looking for. So expect to compensate by -1 or +1, depending on whether you want the left or the right portion of the string.
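One case the expressions above do not cover is a string that does not contain the character at all. CharIndex then returns 0, so the position expression yields Len + 1 and the Right(...) expression is passed a negative length. A minimal sketch of one possible guard, using NULLIF and ISNULL (the variable and alias names are made up for illustration):

```sql
-- Hypothetical sample with no '/' in it (12 characters)
DECLARE @noSlash VARCHAR(100) = 'lazy dog.pdf';

SELECT
    -- Unguarded: CharIndex returns 0, so this yields Len + 1
    LEN(@noSlash) - CHARINDEX('/', REVERSE(@noSlash)) + 1 AS [Unguarded],   -- 13
    -- Guarded: NULLIF turns the 0 into NULL, and ISNULL maps the NULL result to 0
    ISNULL(LEN(@noSlash) - NULLIF(CHARINDEX('/', REVERSE(@noSlash)), 0) + 1, 0) AS [Guarded];   -- 0
```

Returning 0 for "not found" mirrors what CHARINDEX itself does; returning NULL instead would only need the ISNULL wrapper removed.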

Resources