SQL Server 2014 Case Sensitivity issue - sql-server

I am migrating a database and ETL from MySQL to SQL Server and have hit a case sensitivity issue.
In MySQL the DB is set up as case sensitive because one of the applications we are loading from has codes like 'Divh' and 'divh' in the same set (it's not my doing).
All is well there, and the SELECT statements all over the place in ETL, queries, reports etc. use whatever case the author wanted: some are all UPPER, some all lower, most mixed.
So, in other words, MySQL has case-insensitive DDL and SQL but allows case-sensitive data.
It doesn't look like SQL Server can accommodate this. If I choose a CI collation, all the tables and columns are insensitive along with the data (presumably).
And the converse: if it's CS, everything is case-sensitive.
Am I reading it right ?
If so then I either have to change the collation of every text column in the DB
OR edit each and every query.
Ironically the 1st test was to an Azure SQL Database which was set up with the same collation (SQL_Latin1_General_CP1_CS_AS)
and it doesn't care about the case of the table name in a select.
Any ideas ?
Thanks
JC

Firstly, are you aware that collation settings exist at several levels in SQL Server: instance, database, column, and even individual expressions?
It sounds like you just want to enforce a case-sensitive collation on the affected columns, leaving the database and DDL as a whole case-insensitive.
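As a rough sketch of that column-level approach (the table dbo.SourceCodes and its Code column are hypothetical placeholders, and the database default collation is assumed to be case-insensitive), something along these lines could be used:

```sql
-- Hypothetical names: dbo.SourceCodes and its Code column are placeholders.
-- Assumes the database default collation is case-insensitive.

-- List character columns and their current collations.
SELECT t.name AS table_name, c.name AS column_name, c.collation_name
FROM sys.columns AS c
JOIN sys.tables AS t ON t.object_id = c.object_id
WHERE c.collation_name IS NOT NULL;

-- Switch only the affected column to a case-sensitive collation.
ALTER TABLE dbo.SourceCodes
    ALTER COLUMN Code VARCHAR(10) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL;

-- Identifiers stay case-insensitive; the comparison on Code is case-sensitive.
SELECT code FROM DBO.SOURCECODES WHERE CODE = 'divh';
```

Indexes or constraints that reference the column usually have to be dropped and recreated around the ALTER COLUMN, so it is worth trying this on a copy first.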
Another trick I've used in the past is to cast values to a VARBINARY data type when you want to compare data that differs only in case, without needing to change the collation of anything.
For example:
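DECLARE @Var1 VARCHAR(5)
DECLARE @Var2 VARCHAR(5)
SET @Var1 = 'Divh'
SET @Var2 = 'divh'
-- Comparison 1: uses the collation in effect, so a case-insensitive collation treats the values as equal
IF @Var1 = @Var2
    PRINT 'Same'
ELSE
    PRINT 'Not the same'
-- Comparison 2: compares the underlying bytes, so the case difference is detected
IF CAST(@Var1 AS VARBINARY) = CAST(@Var2 AS VARBINARY)
    PRINT 'Same'
ELSE
    PRINT 'Not the same'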

Related

Does Snowflake support case insensitive where clause filter similar to SQL Server

I am migrating SQL code to Snowflake, and during the migration I found that by default Snowflake compares varchar values case-sensitively (e.g. select 1 where 'Hello' = 'hello' returns no row). To work around this I set the collation 'en-ci' at account level. However, now I am not able to use crucial functions such as REPLACE.
Is it possible in Snowflake to do case-insensitive varchar comparison (without specifying a collation explicitly or using the UPPER function every time) and still use the REPLACE function?
I will appreciate any help.
Thanks,
You can compare the text with a case-insensitive match via ILIKE:
select 1 where 'Hello' ilike 'hello'
regexp_replace is your friend: it accepts an 'i' parameter that stands for "ignore case":
https://docs.snowflake.com/en/sql-reference/functions/regexp_replace.html
For example you can do something like that:
select regexp_replace('cats are grey, cAts are Cats','cats','dogs',1,0,'i');
I've used the default values for position and occurrence, but those can also be adjusted.
And you can still do comparison (also based on regexp, aka "RLIKE"):
https://docs.snowflake.com/en/sql-reference/functions/rlike.html
Snowflake supports COLLATE:
SELECT 1 WHERE 'Hello' = 'hello' COLLATE 'en-ci';
-- 1
SELECT 'Hello' = 'hello'
,'Hello' = 'hello' COLLATE 'en-ci';
Output: the first column is FALSE (the default comparison is case-sensitive) and the second is TRUE.
The collation can be set at the account/database/schema/table level with the DEFAULT_DDL_COLLATION parameter:
Sets the default collation used for the following DDL operations:
CREATE TABLE
ALTER TABLE … ADD COLUMN
Setting this parameter forces all subsequently-created columns in the affected objects (table, schema, database, or account) to have the specified collation as the default, unless the collation for the column is explicitly defined in the DDL.
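A minimal sketch of using that parameter at schema level follows. The schema and table names are hypothetical, and this is Snowflake syntax rather than T-SQL:

```sql
-- Hypothetical schema and table names; Snowflake syntax.
ALTER SCHEMA my_schema SET DEFAULT_DDL_COLLATION = 'en-ci';

-- Columns created from now on inherit 'en-ci' unless they declare their own collation.
CREATE TABLE my_schema.greetings
(
    word     VARCHAR,                  -- inherits the schema default
    raw_word VARCHAR COLLATE 'en-cs'   -- explicit collation overrides the default
);

INSERT INTO my_schema.greetings (word, raw_word) VALUES ('Hello', 'Hello');

-- The first filter compares case-insensitively, the second case-sensitively.
SELECT * FROM my_schema.greetings WHERE word = 'hello';
SELECT * FROM my_schema.greetings WHERE raw_word = 'hello';
```

Columns that existed before the parameter was set keep their original collation, since the setting only affects subsequently created columns.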

SQL Server Collation problem (Msg 468, Level 16, State 9) in IF statement

My problem is occurring with a "simple-as-it-gets" IF statement, making the suggested fixes to many similar questions (e.g. Cannot resolve the collation conflict in my query) seemingly useless.
The error message is :
Msg 468, Level 16, State 9, Procedure #XYZ, Line 11
Cannot resolve the collation conflict between "Latin1_General_CI_AS" and "SQL_Latin1_General_CP1_CI_AS" in the equal to operation.
It's known that the server collation is set to SQL_Latin1_General_CP1_CI_AS.
This query demonstrates the problem :
-- this procedure (which gets put into tempdb) is called WITHOUT specifying @Choice
CREATE PROCEDURE #XYZ
(
    -- all other parameters removed (none of them have default values)
    @Choice AS NVARCHAR(1) = 'Y'
)
AS
BEGIN
    IF (@Choice = 'Y') -- error raised here
    BEGIN
        DECLARE @NULL_STATEMENT AS int -- only here because there's no "do nothing" statement
    END
    RETURN
END
How can I fix this, given that altering the default collation of the server (and/or all of the tables) is NOT going to happen AND it is impractical to insert "COLLATE DATABASE_DEFAULT" in all queries, tables, etc. (for this solution see https://www.mssqltips.com/sqlservertip/4395/understanding-the-collate-databasedefault-clause-in-sql-server/ and Cannot resolve the collation conflict between temp table and sys.objects).
Closely related links:
Documentation of the COLLATE clause: https://learn.microsoft.com/en-us/sql/t-sql/statements/collations?view=sql-server-2017
A solution that I probably cannot use: https://www.mssqltips.com/sqlservertip/2901/how-to-change-server-level-collation-for-a-sql-server-instance/
(From the currently accepted answer):
The fix was to change the default parameter from ASCII (varchar) to nvarchar (UTF-8) form
No, that did not fix it. That particular change had no effect since the value was converted to NVARCHAR when the T-SQL was parsed (due to the parameter/variable being NVARCHAR) and this error happens at compile time.
This issue has absolutely nothing to do with NVARCHAR, UTF-8, or even parameter default values.
UTF-8 is not being used here. SQL Server is only seeing UTF-16 as that is what Unicode strings are transferred in by the driver (i.e. the TDS / tabular data stream). And even when using a UTF-8 collation (new in SQL Server 2019), that encoding is only used with VARCHAR types as NVARCHAR is only ever UTF-16 (Little Endian).
The issue you ran into is one of several "odd" behaviors found in temporary stored procedures (both local and global). For temporary stored procs, parameters and variables will always have a collation matching the tempdb collation, while string literals will use the collation of the database where the CREATE PROCEDURE statement was executed. These two collations do not change (for the main T-SQL context of the module) even if you use another database that has a different default collation other than [tempdb] and the database where the temporary proc was created (though dynamic SQL executed in a temporary stored procedure will use the current DB's collation! Fun, eh?).
Thus, as Martin Smith said in a comment on the question, you must have changed the "current" / "active" database when executing the CREATE PROCEDURE statement after prefixing the parameter value with N.
The following example is a simplified version of the code shown in the question, but clearly shows that prefixing the string literals with N does not prevent the error.
Execute the following T-SQL in a database that has a different collation than [tempdb]:
-- The two returned collations need to be different, else no error with CREATE PROC:
SELECT DATABASEPROPERTYEX(N'tempdb', 'collation') AS [tempdb collation],
DATABASEPROPERTYEX(DB_NAME(), 'collation') AS [current DB collation];
SET NOEXEC ON;
GO
CREATE PROCEDURE #XYZ
(
    @Choice NVARCHAR(5) = N'Y'
)
AS
BEGIN
    IF (@Choice = N'Y') PRINT 'yup';
END;
GO
SET NOEXEC OFF;
/*
Msg 468, Level 16, State 9, Procedure #XYZ, Line XXXXX [Batch Start Line YYYYY]
Cannot resolve the collation conflict between "{current_DB_collation}" and
"{tempdb_collation}" in the equal to operation.
*/
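Where only a handful of comparisons are involved, one possible workaround is an explicit COLLATE clause on one operand, since an explicit collation takes precedence over the implicit ones on either side. The sketch below assumes SQL_Latin1_General_CP1_CI_AS is the desired collation (taken from the question's server collation), and it is not meant as a replacement for fixing the underlying mismatch:

```sql
-- Sketch only: the collation name is an assumption based on the question's server collation.
CREATE PROCEDURE #XYZ
(
    @Choice NVARCHAR(5) = N'Y'
)
AS
BEGIN
    -- The explicit COLLATE decides which collation the comparison uses,
    -- so the implicit tempdb / current-database collations no longer conflict.
    IF (@Choice = N'Y' COLLATE SQL_Latin1_General_CP1_CI_AS) PRINT 'yup';
END;
GO
EXEC #XYZ;
```

This has to be repeated for every string comparison in the procedure, which is exactly the overhead the question hoped to avoid, so it only suits small procedures.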
(From the currently accepted answer):
The weird thing is that after running that version I was able to remove the leading N and the query worked without problems.
Correct. That's due to the N prefix not actually having anything to do with the error or fixing it. You were simply in a DB that had a collation of SQL_Latin1_General_CP1_CI_AS which matched the [tempdb] collation.
I'm working on a blog post detailing several odd behaviors with temporary stored procedures, including this collation stuff. If / when I ever finish it, I will try to remember to update this answer with a link to it.
I've summarized the answer provided by @Sean Lange (a variation of @Lamak's comment) as it was deleted before I could accept it.
The problem (for details see comments on the original question) was that I hit a gotcha while moving the working code to a new server. The fix was to change the default parameter from ASCII (varchar) to nvarchar (UTF-8) form:
CREATE PROCEDURE #XYZ
(
    -- all other parameters removed (none of them have default values)
    @Choice AS NVARCHAR(1) = N'Y' -- Note the leading N
)
AS
....
The weird thing is that after running that version I was able to remove the leading N and the query worked without problems.

How to define SQL Server column name case insensitive but values case sensitive

We just migrated some databases to a new SQL Server 2012 and got some problems with sensitivity.
We would like table & column names to be case insensitive but values should be case sensitive, so
select ... where 'a'='A'
should not return any row, but
select Column from Table
select column from table
should both work.
We tried changing the database (server default) from
Modern_Spanish_CI_AS -> 'a'='A' is true, which we don't want to be, to
Modern_Spanish_CS_AS -> the column/table names must match the defined case
Is there any way to get the desired behavior?
If you choose a case-sensitive collation, you must ensure that your queries match the defined case, because the collation applies to metadata as well as user data.
You can get round the problem by making the database's collation case-insensitive and using the COLLATE clause when creating tables, or alternatively use a contained database.
Read more about Contained Databases and Contained Database Collations
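A minimal sketch of the first option, assuming the database collation is left at Modern_Spanish_CI_AS and using a hypothetical table dbo.Ejemplo:

```sql
-- Hypothetical table; assumes the database collation is Modern_Spanish_CI_AS.
CREATE TABLE dbo.Ejemplo
(
    Id     INT NOT NULL,
    Nombre NVARCHAR(50) COLLATE Modern_Spanish_CS_AS NOT NULL
);

INSERT INTO dbo.Ejemplo (Id, Nombre) VALUES (1, N'a');

-- Object and column names resolve regardless of case...
SELECT nombre FROM DBO.EJEMPLO;

-- ...while the value comparison uses the column's case-sensitive collation.
SELECT Nombre FROM dbo.Ejemplo WHERE Nombre = N'A';
```

Note that a comparison between two literals, such as 'a'='A', still follows the database collation. Only comparisons involving the case-sensitive column (or an explicit COLLATE) become case-sensitive.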

Allow special characters SQL Server 2008

I am using SQL Server 2008 Express edition and its collation settings are set to default. I wish to store special characters like á, â, ã, å, ā, ă, ą, ǻ in my database, but it converts them into normal characters like 'a'. How can I stop SQL Server from doing so?
Make sure that your columns are using the type nvarchar(...), rather than varchar(...). The former is Unicode; the latter is limited to the code page of the column's collation.
Also, make sure that your database default collation is set to Accent Sensitive, and that your columns are stored that way. You may also want to check your instance default collation, as that affects the default collation for your system databases, particularly tempdb.
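The difference can be checked with a short script such as the one below. It assumes a database whose default collation uses code page 1252 (for example SQL_Latin1_General_CP1_CI_AS); the variable names are illustrative only:

```sql
-- Assumes a database whose default collation uses code page 1252.
DECLARE @narrow VARCHAR(30)  = 'ā ă ą';    -- no N prefix: literal is converted to the code page
DECLARE @wide   NVARCHAR(30) = N'ā ă ą';   -- N prefix: literal stays Unicode

SELECT @narrow AS varchar_value, @wide AS nvarchar_value;

-- An NVARCHAR variable does not help if the literal itself lacks the N prefix.
DECLARE @lost NVARCHAR(30) = 'ā ă ą';
SELECT @lost AS nvarchar_without_prefix;
```

Characters that do not exist in the code page are typically replaced by a best-fit character or a question mark, which matches the symptom described in the question.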
Rahul, here is a very simple query that runs perfectly on SQL 2005 and 2008:
Query
DECLARE @t1 TABLE (
    Col1 nvarchar(30)
)
INSERT INTO @t1 VALUES (N'á ,â ,ã ,å ,ā ,ă ,ą ,ǻ')
SELECT * FROM @t1
Result
Col1
------------------------------
á ,â ,ã ,å ,ā ,ă ,ą ,ǻ
There is nothing special here. No collation change from default, just a simple NVARCHAR column.
You said you are "just running direct queries in the database". Can you try this query and see if you get the same results?

How do I change SQL Server 2005 to be case sensitive?

I hate case sensitivity in databases, but I'm developing for a client who uses it. How can I turn on this option on my SQL Server, so I can be sure I've gotten the case right in all my queries?
You don't actually need to change the collation on the entire database, if you declare it on the table or columns that need to be case-sensitive. In fact, you can actually append it to individual operations as needed.
-- your_table is a placeholder for the table being queried
SELECT name FROM your_table WHERE 'greg' = name COLLATE Latin1_General_CS_AS
I know, you said that you want this to apply throughout the database. But I mention this because in certain hosted environments, you can't control this property, which is set when the database is created.
How about:
ALTER DATABASE database_name COLLATE collation_name
See BOL for a list of collation options and pick the case-sensitive one that best fits your needs (i.e. the one your client is using).
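To see what is currently in effect and which case-sensitive collations the instance offers, a query along these lines may help (a sketch; narrow the LIKE filter to suit):

```sql
-- Current collations at server and database level.
SELECT SERVERPROPERTY('Collation')                 AS server_collation,
       DATABASEPROPERTYEX(DB_NAME(), 'Collation')  AS database_collation;

-- Case-sensitive, accent-sensitive collations available on this instance.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE '%[_]CS[_]AS%';
```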
Obviously, it's probably a good idea to make a full backup of your database before you try this. I've never personally tried to use a database with a different collation than the server's default collation, so I don't know of any "gotchas". But if you have good backups and test it in your environment before deploying it to your client, I can't imagine that there's much risk involved.
If you have a DB that has a different collation to the instance default, you can run into problems when you try to join your tables with temporary ones. Temporary tables take the collation of tempdb (which normally matches the instance collation), so you need to use the COLLATE database_default clause in your joins.
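-- [table] stands in for your own table; brackets are needed because TABLE is a reserved word
select temp.A, [table].B
from #TEMPORARY_TABLE temp inner join [table]
on temp.X COLLATE database_default = [table].Y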
This forces the collation of temp.X (in this example) to the collation of the current DB.
You'll have to change the database collation. You'll also need to alter the column-level collation of existing tables. I believe you can find a script out there if you google it.
