Ignoring accents in SQL Server using LINQ to SQL

How can I ignore accents (like ´, `, ~) in queries made to a SQL Server database using LINQ to SQL?
UPDATE:
Still haven't figured out how to do it in LINQ (or even if it's possible) but I managed to change the database to solve this issue.
Just had to change the collation on the fields I wanted to search on. The collation I had was:
SQL_Latin1_General_CP1_CI_AS
The CI stands for "Case Insensitive" and AS for "Accent Sensitive". I just had to change the AS to AI to make it "Accent Insensitive".
The SQL statement is this:
ALTER TABLE table_name ALTER COLUMN column_name column_type COLLATE collation_type
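Before changing a collation, it can help to check what each column currently uses. A minimal sketch, assuming a hypothetical table dbo.table_name (a placeholder, as in the statement above):

```sql
-- dbo.table_name is a placeholder; substitute the real schema and table name.
SELECT c.name AS column_name,
       c.collation_name          -- NULL for non-character columns
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.table_name');
```

Note that ALTER COLUMN can fail if the column is referenced by an index or constraint, so those may need to be dropped and recreated around the change.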

In SQL queries (SQL Server 2000+, as I recall), you do this with something like select MyString, MyId from MyTable where MyString collate Latin1_General_CI_AI = 'aaaa'.
I'm not sure if this is possible in LINQ, but someone more comfortable with LINQ can probably translate.
If you are OK with sorting and select/where queries ALWAYS ignoring accents, you can alter the table to specify the same collation on the field(s) you are concerned with.
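The difference between the column's collation and a per-query COLLATE override can be sketched as follows. The table variable @People and its rows are made up purely for illustration:

```sql
-- Hypothetical sample data; the column is declared accent-sensitive (AS).
DECLARE @People TABLE (Name nvarchar(100) COLLATE SQL_Latin1_General_CP1_CI_AS);
INSERT INTO @People (Name) VALUES (N'José'), (N'Jose'), (N'JOSE');

-- Uses the column's collation: case is ignored, accents are not,
-- so 'Jose' and 'JOSE' match but 'José' does not.
SELECT Name FROM @People WHERE Name = N'jose';

-- Expression-level override to an accent-insensitive collation:
-- all three rows match.
SELECT Name FROM @People WHERE Name COLLATE Latin1_General_CI_AI = N'jose';
```

Applying COLLATE to the column in the WHERE clause generally prevents an index on that column from being used for a seek, which is one reason to change the column's collation instead when the behaviour should be permanent.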

See the following answer:
LINQ Where Ignore Accentuation and Case
Basically you need to alter the field type in SQL Server, e.g.
ALTER TABLE People ALTER COLUMN Name [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AI
There does not seem to be a way to do this using LINQ, apart from calling a custom method to remove diacritics (which would not be performant).

LINQ to SQL doesn't have any specific functionality for setting the collation used for a query, so a comparison always uses the collation of the column involved (which is the database default unless the column was given its own collation).

It seems that there is a way to ignore the collation differences in Linq to SQL by using t-sql functions:
CREATE FUNCTION [dbo].[func_ConcatWithoutCollation]
(
    @param1 varchar(2000),
    @param2 varchar(2000)
)
RETURNS varchar(4000)
AS
BEGIN
    IF (@param1 IS NULL) SET @param1 = ''
    IF (@param2 IS NULL) SET @param2 = ''
    RETURN @param1 COLLATE Latin1_General_CS_AS + @param2 COLLATE Latin1_General_CS_AS
END
To make this function available in LINQ to SQL, use the SqlMetal switch /functions.
Example:
"%ProgramFiles%\Microsoft SDKs\Windows\v7.0A\Bin\SqlMetal.exe" /server:. /database:NameOfDatabase /pluralize /code:ContextGenerated.cs /sprocs /views /functions
Use this function in Linq to sql like this:
from s in context.Services
where context.Func_ConcatWithoutCollation(s.Description, s.Email) == "whatever"
select s
It helped me, maybe somebody finds this useful too.

A solution could be create an SQL Function to remove the diacritics, by applying to the input string the collation SQL_Latin1_General_CP1253_CI_AI, like so:
CREATE FUNCTION [dbo].[RemoveDiacritics] (
    @input varchar(max)
) RETURNS varchar(max)
AS BEGIN
    DECLARE @result VARCHAR(max);
    select @result = @input collate SQL_Latin1_General_CP1253_CI_AI
    return @result
END
Then you could add it in the DB context (in this case ApplicationDbContext) by mapping it with the attribute DbFunction:
public class ApplicationDbContext : IdentityDbContext<CustomIdentityUser>
{
[DbFunction("RemoveDiacritics", "dbo")]
public static string RemoveDiacritics(string input)
{
throw new NotImplementedException("This method can only be used with LINQ.");
}
public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
: base(options)
{
}
}
And finally use it in LINQ query, for example (linq-to-entities):
var query = await db.Users.Where(a => ApplicationDbContext.RemoveDiacritics(a.Name).Contains(ApplicationDbContext.RemoveDiacritics(filter))).ToListAsync();

Related

Can you set collation of T-SQL Variable?

I've searched high and low but can't find an answer, can you set the collation of a variable? According to the MS documentation, it seems that it's only possible on SQL Azure:
-- Syntax for Azure SQL Data Warehouse and Parallel Data Warehouse
DECLARE
{{ @local_variable [AS] data_type } [ =value [ COLLATE ] ] } [,...n]
Currently I have to do this:
DECLARE @Test nvarchar(10) = N'Crud';
IF ( @Test = N'Crud' COLLATE Latin1_General_CS_AI )
    Print N'Crud';
IF ( @Test = N'cRud' COLLATE Latin1_General_CS_AI )
    Print N'cRud';
IF ( @Test = N'crUd' COLLATE Latin1_General_CS_AI )
    Print N'crUd';
IF ( @Test = N'cruD' COLLATE Latin1_General_CS_AI )
    Print N'cruD';
When what I'd like to do is this:
DECLARE @Test nvarchar(10) = N'Crud' COLLATE Latin1_General_CS_AI;
IF ( @Test = N'Crud' )
    Print N'Crud';
IF ( @Test = N'cRud' )
    Print N'cRud';
IF ( @Test = N'crUd' )
    Print N'crUd';
IF ( @Test = N'cruD' )
    Print N'cruD';
I'm guessing the answer is no but I wanted to confirm and at the very least, someone else ever needing this info will get a definitive answer.
Much appreciated.

Well, you're guessing correctly.
In most SQL Server systems (that is, excluding Azure SQL Data Warehouse and Parallel Data Warehouse), a collation can be set on four levels:
The default collation of the SQL Server instance:
The server collation acts as the default collation for all system databases that are installed with the instance of SQL Server, and also any newly created user databases.
The default collation of a specific database:
You can use the COLLATE clause of the CREATE DATABASE or ALTER DATABASE statement to specify the default collation of the database. You can also specify a collation when you create a database using SQL Server Management Studio. If you do not specify a collation, the database is assigned the default collation of the instance of SQL Server.
You can set a collation for a table's column:
You can specify collations for each character string column using the COLLATE clause of the CREATE TABLE or ALTER TABLE statement. You can also specify a collation when you create a table using SQL Server Management Studio. If you do not specify a collation, the column is assigned the default collation of the database.
You can set a collation for a specific expression using the Collate clause:
You can use the COLLATE clause to apply a character expression to a certain collation. Character literals and variables are assigned the default collation of the current database. Column references are assigned the definition collation of the column.
So yes, with the exception of Azure SQL Data Warehouse and Parallel Data Warehouse, you can't set a collation on a local scalar variable.
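One possible workaround, not part of the documentation quoted above, is to hold the value in a table variable, because a table variable's column can carry its own collation. A minimal sketch, with the name @T chosen only for illustration:

```sql
-- The column, unlike a scalar variable, can be given a collation.
DECLARE @T TABLE (Test nvarchar(10) COLLATE Latin1_General_CS_AI);
INSERT INTO @T (Test) VALUES (N'Crud');

-- The column's collation takes precedence over the literal's default
-- collation, so no COLLATE clause is needed on each comparison.
IF EXISTS (SELECT 1 FROM @T WHERE Test = N'Crud') PRINT N'Crud';  -- prints
IF EXISTS (SELECT 1 FROM @T WHERE Test = N'cRud') PRINT N'cRud';  -- case differs, no output
```

Whether this is tidier than repeating COLLATE on each comparison is a matter of taste; it mainly pays off when the same value is compared many times.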

collation as function argument

I have a database that is supposed to store data in any language. There is going to be a column that tells me which locale a row is in, so I can't rely on the database collation and will have to specify the collation at runtime in queries.
I also have the problem that I want to use EF for data access, and as we know, with EF one cannot specify a collation at runtime. I am thinking about creating a SQL function that takes the collation as an argument and applying that function in all of the LINQ queries.
but this fails
CREATE FUNCTION fn_Compare
(
    @TextValue nvarchar(max),
    @Culture varchar(10)
)
RETURNS nvarchar(max)
AS
BEGIN
    RETURN @TextValue COLLATE @Culture
END
GO
does anyone know if this can be done ?

You cannot do this. The collation returned by the function needs to be consistent across all the return values. For instance, the following generates an error:
create function testfn (@test varchar(100), @i int)
returns varchar(100)
as
begin
    return(case when @i = 0 then @test collate SQL_Latin1_General_CP1_CS_AS
                else @test collate SQL_Latin1_General_CP1_CI_AS
           end)
end;
The error is due to a collation conflict.
What you can do is use:
alter database <database_name> collate <whatever>
Or, alternatively, create a new working database with the collation you want.
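A collation name cannot be passed as a variable, but a function can choose between a fixed set of collations if it returns the result of the comparison rather than a collated string. The following is only a sketch of that idea, not something proposed in the answer above. The function name fn_EqualsForCulture, its parameters, and the culture codes 'fr' and 'es' are all assumptions made for illustration:

```sql
-- Hypothetical helper: each WHEN branch performs its own comparison under a
-- hard-coded collation, and the function returns a bit, so the return value
-- itself has no collation and no conflict arises.
CREATE FUNCTION dbo.fn_EqualsForCulture
(
    @Left    nvarchar(4000),
    @Right   nvarchar(4000),
    @Culture varchar(10)
)
RETURNS bit
AS
BEGIN
    RETURN CASE
        WHEN @Culture = 'fr' AND @Left = @Right COLLATE French_CI_AI THEN 1
        WHEN @Culture = 'es' AND @Left = @Right COLLATE Modern_Spanish_CI_AI THEN 1
        WHEN @Culture NOT IN ('fr', 'es')
             AND @Left = @Right COLLATE Latin1_General_CI_AI THEN 1
        ELSE 0
    END;
END;
GO

-- Accent-insensitive French comparison: returns 1.
SELECT dbo.fn_EqualsForCulture(N'élève', N'eleve', 'fr') AS IsEqual;
```

The set of supported cultures has to be known in advance, and a scalar function in a WHERE clause usually rules out index seeks, so this is better suited to small tables or secondary filters.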

Store such characters in SQL Server 2008 R2

I'm storing encrypted passwords in the database. It has worked perfectly so far on MachineA. Now that I have moved to MachineB, the results seem to get corrupted in the table.
For example: ù9qÆæ\2 Ý-³Å¼]ó will change to ?9q??\2 ?-³?¼]? in the table.
That's the query I use:
ALTER PROC [Employees].[pRegister](@UserName NVARCHAR(50),@Password VARCHAR(150))
AS
BEGIN
    DECLARE @Id UNIQUEIDENTIFIER
    SET @Id = NEWID()
    SET @password = HashBytes('MD5', @password + CONVERT(VARCHAR(50),@Id))
    SELECT @Password
    INSERT INTO Employees.Registry (Id,[Name],[Password]) VALUES (@Id, @UserName,@Password)
END
Collation: SQL_Latin1_General_CP1_CI_AS
ProductVersion: 10.50.1600.1
Thanks

You are mixing 2 datatypes:
password need to be nvarchar to support non-Western European characters
literals need N prefix
Demo:
DECLARE @pwdgood nvarchar(150), @pwdbad varchar(150)
SET @pwdgood = N'ù9qÆæ\2 Ý-³Å¼]ó'
SET @pwdbad = N'?9q??\2 ?-³?¼]?'
SELECT @pwdgood, @pwdbad
HashBytes gives varbinary(8000) so you need this in the table
Note: I'd also consider salting the stored password with something other than ID column for that row
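Keeping the hash as binary avoids any code page conversion altogether. A minimal sketch, with the variable names and the sample password invented for illustration; MD5 and the Id-based salt are kept only to mirror the question:

```sql
DECLARE @Password nvarchar(150) = N'secret';   -- sample input, illustration only
DECLARE @Id uniqueidentifier = NEWID();

-- HASHBYTES returns varbinary; an MD5 digest is always 16 bytes.
DECLARE @Hash varbinary(16) =
    HASHBYTES('MD5', @Password + CONVERT(nvarchar(36), @Id));

-- No character data is involved, so the value is identical on every machine.
SELECT @Hash AS PasswordHash, DATALENGTH(@Hash) AS HashLength;  -- HashLength = 16
```

With this approach the [Password] column would be declared as a binary type (for example varbinary(16) for MD5) rather than varchar or nvarchar.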

If you want to store such characters, you need to:
use NVARCHAR as the datatype for your columns and parameters (@Password isn't NVARCHAR and the CAST you're using to assign the password in the database table isn't using NVARCHAR either, in your sample ...)
use the N'....' syntax for indicating Unicode string literals
With those two in place, you should absolutely be able to store and retrieve any valid Unicode character

Exception executing a stored procedure with CASE switching from C# (T-SQL)

I have a NVARCHAR(max) column in a table and a stored procedure that would update this column as well as any other column in the table using CASE switching:
CREATE PROCEDURE updateTable
    @columnName sysname,
    @value nvarchar(max)
AS
UPDATE [dbo].[TestTable]
SET
    BigNvarcharValue = CASE @columnName WHEN 'BigNvarcharValue' THEN @value ELSE BigNvarcharValue END,
    TableName = CASE @columnName WHEN 'TableName' THEN @value ELSE TableName END
All is good if I execute this procedure from SQL Management Studio with
EXEC [dbo].[updateTable]
    @columnName = 'BigNvarcharValue',
    @value = N'SOME BIG 80Kb value'
I can also update TableName from C# code using the same stored procedure, but when it comes to updating this BigNvarcharValue from C#, it fails with SQLException that "String or binary data would be truncated". Now, I figured it has something to do with CASE in this stored procedure, because when I break it to a simpler stored procedure, everything works fine:
CREATE PROCEDURE updateTable
    @columnName sysname,
    @value nvarchar(max)
AS
UPDATE [dbo].[TestTable]
SET BigNvarcharValue=@value
I read a bunch of forum posts describing how inserting a value that is too big for an NVARCHAR column causes this exception, but that doesn't seem to apply here.
I'm fairly new to T-SQL, so are there any limitations of CASE that I don't know of?
P.S. BigNvarcharValue is NVARCHAR(MAX) and TableName is NVARCHAR(50)

What are the data types of the columns you're dealing with? Because I've reproduced the error by attempting to insert a value that is allowed by NVARCHAR(max) into a column that is VARCHAR(50).
To reiterate: an NVARCHAR(max) parameter lets you pass a value that is longer than the target column's declared size, which is why you get the error about truncation.

The error speaks for itself: "String or binary data would be truncated". This means that you seem to be inserting a larger value than the target can hold.
SSMS 2008 has some debugging features that let you set breakpoints, etc.
You might also want to keep an eye on the maximum capacity of System.String. This is only a matter of length, somewhere.

With your exact same stored procedure and the table you described I ran the following code
using System.Data.SqlClient;

class Program
{
    static void Main(string[] args)
    {
        using (SqlConnection cnn = new SqlConnection(@"Server=.;Database=test;Trusted_Connection=True;"))
        {
            cnn.Open();
            SqlCommand cmd = new SqlCommand("updateTable", cnn);
            cmd.CommandType = System.Data.CommandType.StoredProcedure;
            // sysname is equivalent to nvarchar(128)
            cmd.Parameters.Add(new SqlParameter("@columnName",
                System.Data.SqlDbType.NVarChar, 128));
            cmd.Parameters["@columnName"].Value = "BigNvarcharValue";
            // A size of -1 means nvarchar(max)
            cmd.Parameters.Add(new SqlParameter("@value",
                System.Data.SqlDbType.NVarChar, -1));
            cmd.Parameters["@value"].Value = new string('T', 80000);
            cmd.ExecuteNonQuery();
        }
    }
}
It worked fine. I would inspect the command text and the parameter collection (names, sizes and values) and verify everything is as you think it is.

Thanks everyone for the responses. I ended up moving the update of the big column into a separate procedure, which solved the problem. I'm sure the culprit was the CASE expression.

Is there a way in SQL Server to uniquely identify a database?

Is there any way to uniquely identify a database?
If we were to copy a database to another machine, this instance is assumed to be different. I checked on master tables, but could not identify any information that can identify this.

service_broker_guid in sys.databases comes pretty close to what you ask. It is a uniqueidentifier generated when the database is created, and it is preserved as the database is moved around (detach and attach, backup and restore, server rename etc). It can be explicitly changed with ALTER DATABASE ... SET NEW_BROKER;.
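Reading the value for the current database can be sketched like this:

```sql
-- Returns one row for the database the session is currently connected to.
SELECT d.name,
       d.service_broker_guid
FROM sys.databases AS d
WHERE d.database_id = DB_ID();
```

Because the GUID survives backup and restore, a restored copy carries the same value as its source unless a new broker GUID is assigned.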

You could make a table in it with a unique name, and simply do a query on that. It's a bit of a hack, sure, but it'd work...

You could put the information in an extended property associated with the database itself:
USE AdventureWorks2008R2;
GO
EXEC sys.sp_addextendedproperty
    @name = N'MS_DescriptionExample',
    @value = N'AdventureWorks2008R2 Sample OLTP Database';
GO
http://msdn.microsoft.com/en-us/library/ms190243.aspx
In your case, I would use something like this:
EXEC sys.sp_addextendedproperty
    @name = N'UniqueID',
    @value = N'10156435463';
select objname, [name], [value]
from fn_listextendedproperty (null, null, null, null, null, null, null)

Create a scalar function that returns an ID/Version number:
create function fnGetThisDBID() returns varchar(32) as begin
return ('v1.1,origin=server1')
end
select 'version is: ' + dbo.fnGetThisDBID()
