SQL Server string comparison: nvarchar vs. varchar [duplicate]

I have an nvarchar(50) column in a SQL Server table, with data like this:
123abc
234abc
456abc
My query:
select *
from table
where col like '%abc'
Expected result : all rows should be returned
Actual result: No rows are returned
Works fine if the column is varchar but returns no rows if the type is nvarchar.
Any ideas?

You probably have spaces at the end of your data. Take a look at this example.
Declare @Temp Table(col nvarchar(50))
Insert Into @Temp(col) Values(N'123abc')
Insert Into @Temp(col) Values(N'456abc ')
Select * From @Temp Where Col Like '%abc'
When you run the code above, you will only get the 123 row because the 456 row has a space on the end of it.
When you run the code shown below, you will get the data you expect.
Declare @Temp Table(col nvarchar(50))
Insert Into @Temp(col) Values(N'123abc')
Insert Into @Temp(col) Values(N'456abc ')
Select * From @Temp Where rtrim(Col) Like '%abc'
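To check whether trailing spaces are really present before reaching for rtrim, one option is to compare LEN, which excludes trailing blanks, with the stored length reported by DATALENGTH. This is only a minimal sketch using the same two sample rows as the example above; the column aliases are made up for illustration, and dividing by 2 assumes an nvarchar column that holds no supplementary characters.

```sql
Declare @Temp Table(col nvarchar(50))
Insert Into @Temp(col) Values(N'123abc')
Insert Into @Temp(col) Values(N'456abc ')

-- LEN ignores trailing blanks; DATALENGTH returns bytes (2 per character for nvarchar)
Select col,
       Len(col)            As CharsWithoutTrailingBlanks,
       DataLength(col) / 2 As CharsStored
From   @Temp
Where  DataLength(col) / 2 <> Len(col)  -- keeps only rows that end in one or more spaces
```

With this sample data, only the 456 row should come back, because it is the only value whose stored length differs from its LEN.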
According to the documentation on LIKE in Books Online (emphasis mine):
http://msdn.microsoft.com/en-us/library/ms179859.aspx
Pattern Matching by Using LIKE
LIKE supports ASCII pattern matching and Unicode pattern matching. When all arguments (match_expression, pattern, and escape_character, if present) are ASCII character data types, ASCII pattern matching is performed. If any one of the arguments are of Unicode data type, all arguments are converted to Unicode and Unicode pattern matching is performed. When you use Unicode data (nchar or nvarchar data types) with LIKE, trailing blanks are significant; however, for non-Unicode data, trailing blanks are not significant. Unicode LIKE is compatible with the ISO standard. ASCII LIKE is compatible with earlier versions of SQL Server.
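The difference described in that passage can be seen side by side with two variables that hold the same text, one varchar and one nvarchar. This is a small sketch rather than anything from the documentation itself; the variable names and column aliases are assumptions chosen for illustration.

```sql
Declare @v varchar(10)  = 'abc '    -- non-Unicode value with one trailing space
Declare @n nvarchar(10) = N'abc '   -- Unicode value with one trailing space

-- Same pattern against both values; only the data type of the match expression differs
Select Case When @v Like '%abc' Then 'match' Else 'no match' End As VarcharLike,
       Case When @n Like '%abc' Then 'match' Else 'no match' End As NvarcharLike
```

Going by the passage quoted above, the varchar comparison should report a match and the nvarchar comparison should not, which is the behaviour described in the question.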

For an nvarchar column you can use a Unicode literal and a trailing wildcard in the pattern, like this:
select * from Table where ColumnName like N'%abc%'

Are you sure there are no spaces at the end of the value? You can do this to remove the white space:
select *
from yourTable
where rtrim(yourcolumn) like '%abc'
If you don't want to use the RTRIM and the LIKE together you can also use:
Select *
From yourTable
Where charindex('abc', col) > 0
From Microsoft about using LIKE:
SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2,
<Comparison Predicate>, General rules #3) on how to compare strings
with spaces. The ANSI standard requires padding for the character
strings used in comparisons so that their lengths match before
comparing them. The padding directly affects the semantics of WHERE
and HAVING clause predicates and other Transact-SQL string
comparisons. For example, Transact-SQL considers the strings 'abc' and
'abc ' to be equivalent for most comparison operations.
The only exception to this rule is the LIKE predicate. When the right
side of a LIKE predicate expression features a value with a trailing
space, SQL Server does not pad the two values to the same length
before the comparison occurs. Because the purpose of the LIKE
predicate, by definition, is to facilitate pattern searches rather
than simple string equality tests, this does not violate the section
of the ANSI SQL-92 specification mentioned earlier.
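The padding rule and its LIKE exception can be tried out with plain varchar literals. The following is an illustrative sketch only; the column aliases are invented for this example.

```sql
-- = pads the shorter string, so a trailing space makes no difference
-- LIKE does not pad: a trailing space in the pattern (right side) must be matched
Select Case When 'abc'  =    'abc ' Then 'equal' Else 'not equal' End As EqualsResult,
       Case When 'abc'  Like 'abc ' Then 'match' Else 'no match'  End As PatternHasSpace,
       Case When 'abc ' Like 'abc'  Then 'match' Else 'no match'  End As ValueHasSpace
```

For non-Unicode strings this should report equal, no match, match. With nvarchar values the last comparison behaves differently, because trailing blanks in the data are then significant as well.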
If you know the starting position of the 'abc' string then you can use SUBSTRING:
Select *
From yourTable
Where substring(Col, 4, 3) = 'abc'
But then you can use charindex and substring together and you do not have to worry about white space:
select *
from yourTable
where substring(col, charindex('abc', col), 3) = 'abc'

Your query should work just fine, but you can also try the following, in case there are characters you cannot see after the 'abc' part:
SELECT * FROM TABLE WHERE (COL LIKE '%abc%')
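If invisible characters are suspected, it can help to look at the code point of the last character in each value that contains 'abc' but does not end with it. This is a hedged sketch: the @Sample table variable and its rows are made up for illustration and stand in for the real table.

```sql
-- Hypothetical sample data: a clean value, one ending in a space, one ending in a tab
Declare @Sample Table(col nvarchar(50))
Insert Into @Sample(col) Values (N'123abc'), (N'234abc '), (N'456abc' + NChar(9))

-- Common codes: 32 = space, 9 = tab, 10 = line feed, 13 = carriage return, 160 = non-breaking space
Select col,
       Unicode(Right(col, 1)) As LastCharCode
From   @Sample
Where  col Like '%abc%'
  And  col Not Like '%abc'
```

Knowing which character is actually there matters because rtrim only removes spaces, so a trailing tab or line break would survive it.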

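This should also work, although it converts the column to a non-Unicode type. Give the VARCHAR an explicit length, because CAST without one defaults to 30 characters and would truncate longer values:
SELECT *
FROM "table"
WHERE CAST("col" AS VARCHAR(50)) LIKE '%abc'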

Just to document some of the ASCII vs Unicode weirdness:
-- ascii like
if 'Non rational ' like 'Non[ --]rational%' print 'like' else print 'not like'
-- unicode like
if N'Non rational ' like N'Non[ --]rational%' print 'like' else print 'not like'
-- unicode like, trailing space removed
if N'Non rational' like N'Non[ --]rational%' print 'like' else print 'not like'
-- unicode like, different wildcard
if N'Non rational ' like N'Non_rational%' print 'like' else print 'not like'
Produces the following:
like
not like
not like
like

Related

SQL Server returns wrong result with trailing spaces in Where clause [duplicate]

In SQL Server 2008 I have a table called Zone with a column ZoneReference varchar(50) not null as the primary key.
If I run the following query:
select '"' + ZoneReference + '"' as QuotedZoneReference
from Zone
where ZoneReference = 'WF11XU'
I get the following result:
"WF11XU "
Note the trailing space.
How is this possible? If the trailing space really is there on that row, then I'd expect to return zero results, so I'm assuming it's something else that SQL Server Management Studio is displaying weirdly.
In C# code calling zoneReference.Trim() removes it, suggesting it is some sort of whitespace character.
Can anyone help?
That's the expected result: in SQL Server the = operator ignores trailing spaces when making the comparison.
SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2, <Comparison Predicate>, General rules #3) on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them. The padding directly affects the semantics of WHERE and HAVING clause predicates and other Transact-SQL string comparisons. For example, Transact-SQL considers the strings 'abc' and 'abc ' to be equivalent for most comparison operations.
The only exception to this rule is the LIKE predicate. When the right side of a LIKE predicate expression features a value with a trailing space, SQL Server does not pad the two values to the same length before the comparison occurs. Because the purpose of the LIKE predicate, by definition, is to facilitate pattern searches rather than simple string equality tests, this does not violate the section of the ANSI SQL-92 specification mentioned earlier.
Source
Trailing spaces are not always ignored.
I experienced this issue today. My table had NCHAR columns and was being joined to VARCHAR data.
Because the data in the table was not as wide as its field, trailing spaces were automatically added by SQL Server.
I had an ITVF (inline table-valued function) that took varchar parameters.
The parameters were used in a JOIN to the table with the NCHAR fields.
The joins failed because the data passed to the function did not have trailing spaces but the data in the table did. Why was that?
I was getting tripped up on DATA TYPE PRECEDENCE. (See http://technet.microsoft.com/en-us/library/ms190309.aspx)
When comparing strings of different types, the lower precedence type is converted to the higher precedence type before the comparison. So my VARCHAR parameters were converted to NCHARs. The NCHARs were compared, and apparently the spaces were significant.
How did I fix this? I changed the function definition to use NVARCHAR parameters, which are of a higher precedence than NCHAR. Now the NCHARs were changed automatically by SQL Server into NVARCHARs and the trailing spaces were ignored.
Why didn't I just perform an RTRIM? Testing revealed that RTRIM killed the performance, preventing the JOIN optimizations that SQL Server would have otherwise used.
Why not change the data type of the table? The tables are already installed on customer sites, and they do not want to run maintenance scripts (time + money to pay DBAs) or give us access to their machines (understandable).
Yeah, Mark is correct. Run the following SQL:
create table #temp (name varchar(15))
insert into #temp values ('james ')
select '"' + name + '"' from #temp where name ='james'
select '"' + name + '"' from #temp where name like 'james'
drop table #temp
But, the assertion about the 'like' statement appears not to work in the above example. Output:
(1 row(s) affected)
-----------------
"james "
(1 row(s) affected)
-----------------
"james "
(1 row(s) affected)
EDIT:
To get it to work, you could put at the end:
and name <> rtrim(ltrim(name))
Ugly though.
EDIT2:
Given the comments above, the following would work:
select '"' + name + '"' from #temp where 'james' like name
Try:
select Replace('"' + ZoneReference + '"', ' ', '') as QuotedZoneReference from Zone where ZoneReference = 'WF11XU'

Is there a SQL Server collation option that will allow matching different apostrophes?

I'm currently using SQL Server 2016 with SQL_Latin1_General_CP1_CI_AI collation. As expected, queries with the letter e will match values with the letters e, è, é, ê, ë, etc because of the accent insensitive option of the collation. However, queries with a ' (U+0027) do not match values containing a ’ (U+2019). I would like to know if such a collation exists where this case would match, since it's easier to type ' than it is to know that ’ is keystroke Alt-0146.
I'm confident in saying no. The main thing, here, is that the two characters are different (although similar). With accents, e and ê are still both an e (just one has an accent). This enables you (for example) to do searches for things like SELECT * FROM Games WHERE [Name] LIKE 'Pokémon%'; and still have rows containing Pokemon return (because people haven't used the accent :P).
The best thing I could suggest would be to use REPLACE (at least in your WHERE clause) so that both rows are returned. That is, however, likely going to get expensive.
If you know which columns are going to be a problem, you could add a PERSISTED computed column to that table. Then you could use that column in your WHERE clause, but display the original one. Something like:
USE Sandbox;
--Create Sample table and data
CREATE TABLE Sample (String varchar(500));
INSERT INTO Sample
VALUES ('This is a string that does not contain either apostrophe'),
('Where as this string, isn''t without at least one'),
('’I have one of them as well’'),
('’Well, I''m going to use both’');
GO
--First attempt (without the column)
SELECT String
FROM Sample
WHERE String LIKE '%''%'; --Only returns 2 of the rows
GO
--Create a PERSISTED Column
ALTER TABLE Sample ADD StringRplc AS REPLACE(String,'’','''') PERSISTED;
GO
--Second attempt
SELECT String
FROM Sample
WHERE StringRplc LIKE '%''%'; --Returns 3 rows
GO
--Clean up
DROP TABLE Sample;
GO
The other answer is correct. There is no such collation. You can easily verify this with the below.
DECLARE @dynSql NVARCHAR(MAX) =
'SELECT * FROM (' +
(
SELECT SUBSTRING(
(
SELECT ' UNION ALL SELECT ''' + name + ''' AS name, IIF( NCHAR(0x0027) = NCHAR(0x2019) COLLATE ' + name + ', 1,0) AS Equal'
FROM sys.fn_helpcollations()
FOR XML PATH('')
), 12, 0+ 0x7fffffff)
)
+ ') t
ORDER BY Equal, name';
PRINT @dynSql;
EXEC (@dynSql);

How can I make LIKE match a number or empty string inside square brackets in T-SQL?

Is it possible to have a LIKE clause with one character number or an empty string?
I have a field in which I will write a LIKE clause (as a string). I will apply it later with an expression in the WHERE clause: ... LIKE tableX.FormatField .... It must contain a number (a single character or an empty string).
Something like [0-9 ]. Where the space bar inside square brackets means an empty string.
I have a table in which I have a configuration for parameters - TblParam with field DataFormat. I have to validate a value from another table, TblValue, with field ValueToCheck. The validation is made by a query. The part for the validation looks like:
... WHERE TblValue.ValueToCheck LIKE TblParam.DataFormat ...
For the configuration value, I need an expression for one numeric character or an empty string. Something like [0-9'']. Because of the automatic nature of the check, I need a single expression (without AND or OR operators) which can fit the query (see the example above). The same check is valid for other types of checks, so it has to fit my check engine.
I am almost sure that I cannot use [0-9''], but is there another suitable solution?
Actually, I am having difficulty validating a version string: 1.0.1.2 or 1.0.2. It can contain 2-3 dots (.) and numbers.
I am pretty sure it is not possible, as '' is not even a character.
select ascii(''); returns null.
'' = ' '; is true
'' is null; is false
If you want exactly 0-9 or '' (and not ' '), then you can do something like this (in a more efficient way than LIKE):
where col in ('1','2','3','4','5','6','7','8','9','0') or (col = '' and DATALENGTH(col) = 0)
That's a tricky one... As far as I can tell, there isn't a way to do it with only one like clause. You need to do like '[0-9]' OR like ''.
You could accomplish this by having a second column in your TableX. That indicates either a second pattern, or whether or not to include blanks.
If I correctly understand your question, you need something that catches an empty string. Try to use the nullif() function:
create table t1 (a nvarchar(1))
insert t1(a) values('')
insert t1(a) values('1')
insert t1(a) values('2')
insert t1(a) values('a')
-- must select first three
select a from t1 where a like '[0-9]' or nullif(a,'') is null
It returns exactly three records: '', '1' and '2'.
A more convenient method with only one range clause is:
select a from t1 where isnull(nullif(a,''),0) like '[0-9]'

WHERE clause on VARCHAR column seems to operate as a LIKE

I've stumbled across a situation I've never seen before. I hope that someone can explain the following.
I ran the following query, hoping to get only the rows whose column value is exactly equal to 1101:
select '--' + MyColumn + '--' SeeSpaces, Len(MyColumn) as LengthOfColumn
from MyTable
where MyColumn = '1101'
However, I also see values where 1101 is followed by (what I believe are) spaces.
So SeeSpaces returns
--1101 --
And LengthOfColumn returns 4
MyColumn is a VARCHAR(8), NOT NULL column. Its values (including the spaces) are inserted through a separate workflow.
Why does this select not return only the exact results?
Thanks in advance
The reason has to do with the way SQL Server compares strings with trailing spaces: it follows the ANSI standard, so the strings '1101' and '1101 ' are equivalent.
See the following for more details:
INF: How SQL Server Compares Strings with Trailing Spaces
I think you have to use the LTRIM() and RTRIM() functions in the comparison, like this:
LTRIM(RTRIM(MYCOLUMN))='1101'
Also, the LEN function does not count trailing spaces; it only counts the characters before them. Please refer to: http://msdn.microsoft.com/en-us/library/ms190329%28SQL.90%29.aspx

How can I make SQL Server return FALSE for comparing varchars with and without trailing spaces?

If I deliberately store trailing spaces in a VARCHAR column, how can I force SQL Server to see the data as mismatch?
SELECT 'foo' WHERE 'bar' = 'bar '
I have tried:
SELECT 'foo' WHERE LEN('bar') = LEN('bar ')
One method I've seen floated is to append a specific character to the end of every string then strip it back out for my presentation... but this seems pretty silly.
Is there a method I've overlooked?
I've noticed that it does not apply to leading spaces so perhaps I run a function which inverts the character order before the compare.... problem is that this makes the query unSARGable....
From the docs on LEN (Transact-SQL):
Returns the number of characters of the specified string expression, excluding trailing blanks. To return the number of bytes used to represent an expression, use the DATALENGTH function
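A quick way to see the difference between the two functions, and why the LEN comparison in the question cannot tell 'bar' from 'bar ', is to run both on the same value. This is a small illustrative sketch; the variable names and aliases are assumptions.

```sql
Declare @v varchar(10)  = 'bar '    -- three letters plus one trailing space
Declare @n nvarchar(10) = N'bar '   -- same text stored as Unicode

-- LEN drops the trailing space; DATALENGTH counts every stored byte
Select Len(@v)        As LenVarchar,     -- 3
       DataLength(@v) As BytesVarchar,   -- 4
       Len(@n)        As LenNvarchar,    -- 3
       DataLength(@n) As BytesNvarchar   -- 8 (2 bytes per character)
```

Note that DATALENGTH counts bytes, so values are only comparable this way when both sides have the same data type.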
Also, from the support page on How SQL Server Compares Strings with Trailing Spaces:
SQL Server follows the ANSI/ISO SQL-92 specification on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them.
Update: I deleted my code using LIKE (which does not pad spaces during comparison) and DATALENGTH() since they are not foolproof for comparing strings
This has also been asked in a lot of other places as well for other solutions:
SQL Server 2008 Empty String vs. Space
Is it good practice to trim whitespace (leading and trailing)
Why would SqlServer select statement select rows which match and rows which match and have trailing spaces
You could try something like this:
declare @a varchar(10), @b varchar(10)
set @a='foo'
set @b='foo '
select @a, @b, DATALENGTH(@a), DATALENGTH(@b)
Sometimes the dumbest solution is the best:
SELECT 'foo' WHERE 'bar' + 'x' = 'bar ' + 'x'
So basically append any character to both strings before making the comparison.
After some searching, the simplest solution I found was on Anthony Bloesch's weblog.
Just append some text (a single character is enough) to the end of both values:
SELECT 'foo' WHERE 'bar' + 'BOGUS_TXT' = 'bar ' + 'BOGUS_TXT'
Also works for 'WHERE IN'
SELECT <columnA>
FROM <tableA>
WHERE <columnA> + 'BOGUS_TXT' in ( SELECT <columnB> + 'BOGUS_TXT' FROM <tableB> )
The approach I’m planning to use is to use a normal comparison which should be index-keyable (“sargable”) supplemented by a DATALENGTH (because LEN ignores the whitespace). It would look like this:
DECLARE @testValue VARCHAR(MAX) = 'x';
SELECT t.Id, t.Value
FROM dbo.MyTable t
WHERE t.Value = @testValue AND DATALENGTH(t.Value) = DATALENGTH(@testValue)
It is up to the query optimizer to decide the order of the filters, but it should choose to use an index for the data lookup if that makes sense for the table being tested, and then filter the remaining rows by length with the more expensive scalar operations. However, as another answer stated, it would be better to avoid these scalar operations altogether by using an indexed computed column. The method presented here might make sense if you have no control over the schema, if you want to avoid creating computed columns, or if creating and maintaining them is considered more costly than the worse query performance.
I've only really got two suggestions. One would be to revisit the design that requires you to store trailing spaces - they're always a pain to deal with in SQL.
The second (given your SARG-able comments) would be to add a computed column to the table that stores the length, and add this column to appropriate indexes. That way, at least, the length comparison should be SARG-able.
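The computed-column suggestion could look roughly like the following. This is a sketch under assumptions, not a tested design: the table dbo.TrailingSpaceDemo, the column ValueBytes and the index name are all hypothetical, and the column is assumed to be varchar(50) so that it can serve as an index key.

```sql
-- Hypothetical table; ValueBytes stores the byte length, which includes trailing spaces
Create Table dbo.TrailingSpaceDemo
(
    Id         int identity(1,1) primary key,
    Value      varchar(50) not null,
    ValueBytes As DataLength(Value) Persisted
);

-- Index both the value and its stored length so both predicates can use it
Create Index IX_TrailingSpaceDemo_Value On dbo.TrailingSpaceDemo (Value, ValueBytes);

Insert Into dbo.TrailingSpaceDemo (Value) Values ('bar'), ('bar ');

Declare @testValue varchar(50) = 'bar ';

-- = alone matches both rows; the length predicate keeps only the exact one
Select Id, Value
From   dbo.TrailingSpaceDemo
Where  Value = @testValue
  And  ValueBytes = DataLength(@testValue);

Drop Table dbo.TrailingSpaceDemo;
```

With the two sample rows, the query should return only the row stored with the trailing space. Whether the optimizer actually seeks on the index depends on the real table and its statistics.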
