sql like statement picking up unexpected results - sql-server

I have a simple table like the following
id, target
-----------
1, test_1
2, test_2
3, test_3
4, testable
I have a simple query like so:
select * from my_table where target like 'test_%'
What I'm expecting are the first 3 records but I'm getting all 4 records
See SQLFiddle example here

Underscore is a pattern matching character. Try this:
select * from my_table where target like 'test[_]%'

_ is also a wildcard. You can escape it like:
... like 'test\_%' escape '\'

The underscore character _ as you've used it is a wildcard for a single character, hence it returns 4 rows. Try using [_] instead of _.
To illustrate..
CREATE TABLE #tmp (val varchar(10))
INSERT INTO #tmp (val)
VALUES ('test_1'), ('test_2'), ('test_3'), ('testing')
-- This returns all four
SELECT * FROM #tmp WHERE val LIKE 'test_%'
-- This returns the three test_ rows
SELECT * FROM #tmp WHERE val LIKE 'test[_]%'

The underscore is a wildcard character that says "match any single character", just like the % is a wildcard that says "match any 0 or more characters". If you're familiar with Regular Expressions, the underscore character is equivalent to the dot there. You'll need to properly escape the underscore to match that character literally.
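As a minimal sketch, the unescaped pattern can be compared with the two escaping styles mentioned in the answers above. The table variable @demo and its rows are made up for illustration and simply mirror the question's data.

```sql
-- Hypothetical sample data mirroring the question.
DECLARE @demo TABLE (target varchar(20));
INSERT INTO @demo (target)
VALUES ('test_1'), ('test_2'), ('test_3'), ('testable');

-- Unescaped: _ matches any one character, so 'testable' qualifies too (4 rows).
SELECT target FROM @demo WHERE target LIKE 'test_%';

-- Bracketed: [_] matches a literal underscore (3 rows).
SELECT target FROM @demo WHERE target LIKE 'test[_]%';

-- ESCAPE clause: the character after \ is taken literally (3 rows).
SELECT target FROM @demo WHERE target LIKE 'test\_%' ESCAPE '\';
```

The bracket form needs no extra clause, while ESCAPE lets you pick any escape character that does not otherwise appear in the pattern.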

Related

Is there a way to find values that contain only 0's and a symbol of any length?

I want to find strings of any length that contain only 0's and a symbol such as a / a . or a -
Examples include 0__0 and 000/00/00000 and .00000
Considering this sample data:
CREATE TABLE dbo.things(thing varchar(255));
INSERT dbo.things(thing) VALUES
('0__0'),('000/00/00000'),('00000'),('0123456');
Try the following, which locates the first position of any character that is NOT a 0, a period, a forward slash, or an underscore. PATINDEX returns 0 if the pattern is not found.
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0./_]%', thing) = 0;
Results:
thing
0__0
000/00/00000
00000
The opposite:
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0./_]%', thing) > 0;
Results:
thing
0123456
Example db<>fiddle
I can see a way of doing this, but it wouldn't perform well as a search criterion.
We are going to use the TRANSLATE function in SQL Server to replace the allowed characters (or symbols, as you've called them) with a zero, and then eliminate the zeroes. If the result is an empty string, there are two possible cases: either the value contained only zeroes and allowed characters, or it was already an empty string.
So, by checking for this and for non-empty strings, we can determine whether a value matches your criteria.
-- Test scenario
create table #example (something varchar(200))
insert into #example (something) values
    -- Example cases from Stack Overflow
    ('0__0'), ('000/00/00000'), ('.00000'),
    -- With something not allowed (don't know, just put a number)
    ('1230__0'), ('000/04560/00000'), ('.00000789'),
    -- Just not allowed characters, zero, blank, and NULL
    ('1234567489'), ('0'), (''), (null)
-- Shows the data, with a column to check if it matches your criteria
select *
from #example e
cross apply (
    select case when
            -- If it *must* have at least a zero
            e.something like '%0%' and
            -- Eliminates zeroes
            replace(
                -- Replaces the allowed characters with zero
                translate(
                    e.something
                    ,'_./'
                    ,'000'
                )
                ,'0'
                ,''
            ) = ''
        then cast(1 as bit)
        else cast(0 as bit)
    end as doesItMatch
) as criteria(doesItMatch)
I really discourage you from using this as a search criterion.
-- Queries the table over this criteria.
-- This is going to compute over your entire table, so it can get very CPU intensive
select *
from #example e
where
    -- If it *must* have at least a zero
    e.something like '%0%' and
    -- Eliminates zeroes
    replace(
        -- Replaces the allowed characters with zero
        translate(
            e.something
            ,'_./'
            ,'000'
        )
        ,'0'
        ,''
    ) = ''
If you must use this as a search criterion, and it will be a common filter in your application, I suggest you create a new bit column to flag whether a row matches, and index it. That way, the extra computational effort is spread across the inserts/updates/deletes, and the search queries won't overload the database.
The code can be seen executing here, on DB Fiddle.
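One way to sketch the flag-column idea is a persisted computed column, which is a variation on the plain bit column suggested above rather than the same thing. The table name dbo.example_things, the column name matches_criteria and the index name are all hypothetical, and TRANSLATE assumes SQL Server 2017 or later.

```sql
-- Hypothetical permanent table; all names are illustrative only.
CREATE TABLE dbo.example_things (
    something varchar(200) NULL,
    -- 1 when the value has at least one zero and nothing but zeroes and _ . /
    matches_criteria AS CAST(
        CASE
            WHEN something LIKE '%0%'
             AND REPLACE(TRANSLATE(something, '_./', '000'), '0', '') = ''
            THEN 1 ELSE 0
        END AS bit) PERSISTED
);

-- Index the stored flag so it can be searched directly.
CREATE INDEX IX_example_things_matches
    ON dbo.example_things (matches_criteria);

-- The filter can now use the stored flag instead of recomputing the expression.
SELECT something
FROM dbo.example_things
WHERE matches_criteria = 1;
```

The expression is evaluated when a row is written, so reads only compare a stored bit. Whether the index actually helps depends on how selective the flag is in your data.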
What I got from the question is that the strings must contain both 0 and any combination of the special characters in the string.
If you have SQL Server 2017 and above, you can use translate() to replace multiple characters with a space and compare this with the empty string. Also you can use LIKE to enforce that both a 0 and any combination of the special character(s) appear at least once:
DECLARE @temp TABLE (val varchar(100))
INSERT INTO @temp VALUES
('0__0'), ('000/00/00000'), ('.00000'), ('w0hee/'), ('./')
SELECT *
FROM @temp
WHERE val LIKE '%0%' --must have at least one zero somewhere
AND val LIKE '%[_/.]%' --must have at least one special character somewhere
AND TRANSLATE(val, '0./_', '    ') = '' --zeros and special characters become spaces; an all-space string compares equal to ''
Creates output:
val
0__0
000/00/00000
.00000

SQL - Replace string function is not working as intended

I have a simple string; for example,'01023201580001'.
I would like to replace the last two characters of this string; '01', with '00'.
I could extract the last two characters from this string as RIGHT(columname,2) and then use
REPLACE([columname], RIGHT([columname], 2), '00') as newColumnString
But in the result, it replaces the first two characters as well?
Expected result: 01023201580000
Result I get: 00023201580000
What am I doing wrong?
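The second argument to the replace() function defines a pattern to match. The function will look for all instances of that pattern in the target string (first argument) and replace them with the replacement text (third argument). In your case RIGHT([columname], 2) evaluates to '01', and '01' also occurs at the start of the string, so both occurrences are replaced.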
If you know you only need to change the last two characters, you can take the value excluding those characters and then append the characters you want:
select left(columname, len(columname) - 2) + '00';
If you are doing this for an entire column and some of the rows might not end with '01', you can filter those out:
update MyTable
set columname = left(columname, len(columname) - 2) + '00'
where columname like '%01';
You could also use stuff() in a similar way.
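A minimal sketch of the STUFF variant, using the sample value from the question. The variable name @s is only for illustration.

```sql
DECLARE @s varchar(20) = '01023201580001';

-- STUFF(string, start, length, replacement): overwrite 2 characters
-- starting at the second-to-last position.
SELECT STUFF(@s, LEN(@s) - 1, 2, '00') AS newColumnString;
-- Returns 01023201580000
```

Note that STUFF returns NULL when the start position is zero or negative, so this sketch assumes the value is at least two characters long.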
In SQL Server, you can use SUBSTRING like so:
DECLARE @s NVARCHAR(20) = N'01023201580001';
DECLARE @ReplaceWith NVARCHAR(20) = N'00';
-- Starting at position 0 with length LEN(@s) - 1 keeps everything except the last two characters
SELECT SUBSTRING(@s, 0, LEN(@s) - 1) + @ReplaceWith;
Output: 01023201580000

SQL Server - remove left part of string before a specific character

I have a VARCHAR value that looks like this:
5.95 $ Additional fees
How can I remove everything left from character '$' (including that character) ? So that I get the following result:
Additional fees
The '$' is always present.
STUFF and CHARINDEX would be the simplest way, in my opinion:
SELECT STUFF(YourColumn,1, CHARINDEX('$',YourColumn),'')
FROM (VALUES('5.95 $ Additional fees'))V(YourColumn);
Note that as $ has a whitespace afterwards, the value returned will have a leading whitespace (' Additional fees'). You could use TRIM (or LTRIM and RTRIM on older versions of SQL Server) to remove this, if it isn't wanted.
I haven't assumed that the portion of the string to be removed has length CHARINDEX('$',YourColumn)+1, as we have only one sample. For all we know, you could also have values such as '10.99$Base Cost'. If the +1 were used, it would return 'ase Cost' for such a value.
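The two points above can be sketched together: remove everything up to and including the $, then trim any leading whitespace. The second sample value is the one mentioned in the text, and the column alias Remainder is made up for illustration.

```sql
SELECT YourColumn,
       -- Remove through the $, then drop a leading space if one is left over.
       LTRIM(STUFF(YourColumn, 1, CHARINDEX('$', YourColumn), '')) AS Remainder
FROM (VALUES ('5.95 $ Additional fees'),
             ('10.99$Base Cost')) V(YourColumn);
-- Remainder: 'Additional fees' and 'Base Cost'
```

Trimming after the fact handles values with and without a space after the $, which a fixed +1 offset would not.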
You can do it with the syntax below:
declare @temp nvarchar(max)='5.95 $ Additional fees'
select SUBSTRING(@temp,charindex('$',@temp)+1,len(@temp)-1)
You can use SUBSTRING to get the particular string and the CHARINDEX function to get the index of the special character, in your case $.
DECLARE @Var VARCHAR(100)
SET @Var = '5.95 $ Additional fees'
SELECT SUBSTRING(@Var, CHARINDEX('$', @Var) + 1, LEN(@Var) - LEN(LEFT(@Var, CHARINDEX('$', @Var))))

Sql Server's regex LIKE - behaviour clarification?

Someone asked here how to get only values which are a number :
So , if the table is :
DECLARE @Table TABLE(
Col nVARCHAR(50)
)
INSERT INTO @Table SELECT 'ABC'
INSERT INTO @Table SELECT '234.62'
INSERT INTO @Table SELECT '10:10:10:10'
INSERT INTO @Table SELECT 'France'
INSERT INTO @Table SELECT '2'
then - the desired results are :
234.62
2
But when I tested this query :
SELECT * FROM @Table WHERE Col LIKE '%[0-9.]%' --expected to see only 234.62
it showed :
234.62
10:10:10:10
2
Question #1
How come 10:10:10:10 , 2 satisfies the condition ?
Question #2
I saw this answer here which does work
SELECT * FROM @Table WHERE Col NOT LIKE '%[^0-9.]%'
But I don't understand why this works. AFAIU - it selects all values which are not like (not(has number) and not( has dot)) which is ===>(de morgan)===> not like ( has number or has dot)
Can someone please shed light ?
nb I already know that isnumeric can be used also , but it's unsafe (+). also valid wildcards are %,_,[],[^]
Any particular use of [set] within a LIKE expression is a check against one character in the target string.
So, LIKE '%[0-9.]%' says - % - match 0-to-many arbitrary characters, then [0-9.] match one character in the set 0-9., and then % match 0-to-many arbitrary characters. Paraphrased, it says "match any string that contains at least one character in the set 0-9.". So, 10:10:10:10 can be matched as 0 arbitrary characters, then 1 matches [0-9.], and then 0:10:10:10 matches the final %.
LIKE '%[^0-9.]%' says - % - match 0-to-many arbitrary characters, then [^0-9.] match one character not in the set 0-9., and then % match 0-to-many arbitrary characters. Paraphrased, it says "match any string that contains at least one character outside of the set 0-9.". So when we apply the NOT to the front of that, we are saying "match any string that doesn't contain at least one character outside of the set 0-9." or "match strings that only contain characters in the set 0-9.".
Essentially, the double-negative is a way to make an assertion about all characters in the string.
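A minimal sketch running both patterns against the question's sample rows, to show the difference between "contains at least one" and "contains nothing but".

```sql
DECLARE @Table TABLE (Col nvarchar(50));
INSERT INTO @Table (Col)
VALUES (N'ABC'), (N'234.62'), (N'10:10:10:10'), (N'France'), (N'2');

-- At least one character from the set: 234.62, 10:10:10:10 and 2.
SELECT Col FROM @Table WHERE Col LIKE '%[0-9.]%';

-- No character outside the set: 234.62 and 2.
SELECT Col FROM @Table WHERE Col NOT LIKE '%[^0-9.]%';
```

The second filter would also accept an empty string or a value such as '1.2.3', so it checks the character set, not whether the value is a valid number.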

Find all special characters in a column in SQL Server 2008

I need to find the occurrence of all special characters in a column in SQL Server 2008. So, I don't care about A, B, C ... 8, 9, 0, but I do care about !, #, &, etc.
The easiest way to do so, in my mind, would exclude A, B, C, ... 8, 9, 0, but if I wrote a statement to exclude those, I would miss entries that had ! and A. So, it seems to me that I would have to get a list of every non-alphabet / non-number character, then run a SELECT with a LIKE and Wildcard qualifiers.
Here is what I would run:
SELECT Col1
FROM TABLE
WHERE Col1 LIKE ('!', '@', '#', '$', '%'....)
However, I don't think you can run multiple qualifiers, can you? Is there a way I could accomplish this?
Negatives are your friend here:
SELECT Col1
FROM TABLE
WHERE Col1 like '%[^a-Z0-9]%'
Which says that you want any rows where Col1 consists of any number of characters, then one character not in the set a-Z0-9, and then any number of characters.
If you have a case sensitive collation, it's important that you use a range that includes both upper and lower case A, a, Z and z, which is what I've given (originally I had it the wrong way around. a comes before A. Z comes after z)
Or, to put it another way, you could have written your original WHERE as:
Col1 LIKE '%[!@#$%]%'
But, as you observed, you'd need to know all of the characters to include in the [].
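Because range behaviour depends on the collation, one alternative is to spell out the ranges and force a binary collation for the comparison. This is a different technique from the a-Z range above, shown as a sketch; the table variable @t and its rows are hypothetical.

```sql
-- Hypothetical sample data.
DECLARE @t TABLE (Col1 varchar(50));
INSERT INTO @t (Col1)
VALUES ('Alpha1'), ('A!pha'), ('x@y.z'), ('Plain'), ('two words');

-- A binary collation makes the ranges follow code-point order, so
-- a-z, A-Z and 0-9 cover exactly the unaccented ASCII letters and digits.
SELECT Col1
FROM @t
WHERE Col1 LIKE '%[^a-zA-Z0-9]%' COLLATE Latin1_General_BIN;
-- Returns A!pha, x@y.z and 'two words' (the space counts as special).
```

With this approach accented letters also count as special characters, so add them to the set if they should be allowed.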
The following Transact-SQL script works for all languages (international). The solution is not to check for alphanumeric characters but to check that the string does not contain special characters.
DECLARE @teststring nvarchar(max)
SET @teststring = 'Test''Me'
SELECT 'IS ALPHANUMERIC: ' + @teststring
WHERE @teststring NOT LIKE '%[-!#%&+,./:;<=>@`{|}~"()*\\\_\^\?\[\]\'']%' ESCAPE '\'
Select * from TableName Where ColumnName LIKE '%[^A-Za-z0-9, ]%'
This will give you all the rows that contain any special character.
select count(*) from dbo.tablename where address_line_1 LIKE '%[\'']%' ESCAPE '\'
