What collation should I use for the Amharic language? - sql-server

I am using SQL Server 2014 and want to store Amharic text in my database. The default collation for the database was Latin1_General_CP1_CI_AS; I changed it to Latin1_General_CI_AS. Neither displays Amharic characters: they appear while typing, but once committed they turn into question marks.
What collation should I use or what am I missing?

I suspect your problem is not the collation.
Use Unicode data types such as NCHAR and NVARCHAR (and prefix string literals with N), and your saved characters will be preserved.
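A minimal sketch of the difference, assuming a database whose default collation uses a Latin1 code page. The table and column names are hypothetical and exist only for illustration:

```sql
-- Hypothetical demo table: one non-Unicode column, one Unicode column.
CREATE TABLE dbo.AmharicDemo
(
    Id         INT IDENTITY(1,1) PRIMARY KEY,
    AsVarchar  VARCHAR(50)  NULL,  -- stored in the collation's code page
    AsNvarchar NVARCHAR(50) NULL   -- stored as Unicode (UTF-16)
);

-- The N prefix marks the literal as Unicode. Without it, the literal is
-- converted to the database code page before it reaches the column.
INSERT INTO dbo.AmharicDemo (AsVarchar, AsNvarchar)
VALUES (N'ሰላም', N'ሰላም');   -- VARCHAR column loses the text, NVARCHAR keeps it

INSERT INTO dbo.AmharicDemo (AsVarchar, AsNvarchar)
VALUES ('ሰላም', 'ሰላም');     -- no N prefix: both columns lose the text

SELECT Id, AsVarchar, AsNvarchar
FROM dbo.AmharicDemo;

DROP TABLE dbo.AmharicDemo;
```

Under a Latin1 code page, only the NVARCHAR column of the first row should come back readable; the other values are expected to be question marks.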
Collation mainly governs sorting and comparing (for non-Unicode types it also determines the code page). Look through the list of collations and choose the most appropriate one.
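One way to browse the collations installed on an instance is the built-in sys.fn_helpcollations() function. The name filter below is only an assumption about which family you might want to inspect:

```sql
-- List the collations the instance supports, with a short description of each.
-- Note: "_" is a single-character wildcard in LIKE, which also matches a
-- literal underscore, so this filter is slightly broader than it looks.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE N'Latin1_General_100%'
ORDER BY name;
```

Dropping the WHERE clause returns the full list.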
You did not originally mention that you are using full-text search. That requires the LANGUAGE keyword with the name of your language. However, there are only 53 supported languages (see sys.fulltext_languages), and Amharic isn't among them.
Your only option is to recreate your full-text catalog with the neutral word breaker and then repopulate it. Then at least it will recognize words by spaces and punctuation marks.
See more details: https://msdn.microsoft.com/en-us/library/ms142509.aspx?f=255&MSPPError=-2147217396
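A rough sketch of the neutral word breaker approach, assuming the Full-Text Search feature is installed. The table, key, and catalog names are hypothetical. LANGUAGE 0 selects the neutral language resource:

```sql
-- Check which languages full-text search supports on this instance.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Hypothetical table with a named primary key (a full-text index needs a
-- unique, single-column, non-nullable key index).
CREATE TABLE dbo.Articles
(
    ArticleId INT NOT NULL CONSTRAINT PK_Articles PRIMARY KEY,
    Body      NVARCHAR(MAX) NOT NULL
);

CREATE FULLTEXT CATALOG ArticlesCatalog;

-- LANGUAGE 0 = neutral word breaker: splits on whitespace and punctuation.
CREATE FULLTEXT INDEX ON dbo.Articles (Body LANGUAGE 0)
    KEY INDEX PK_Articles
    ON ArticlesCatalog;

-- Query with the same language so the search term is broken the same way.
SELECT ArticleId
FROM dbo.Articles
WHERE CONTAINS(Body, N'"ሰላም"', LANGUAGE 0);
```

The index populates asynchronously, so a query issued immediately after creation may return no rows until population finishes.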

You can follow the steps on MSDN:
Default Collations in SQL Server Setup
In Control Panel, find the Windows system locale name on the Advanced tab in Regional and Language Options. In Windows Vista, use the "Formats" tab. The following table shows the corresponding collation designator to match collation settings with an existing Windows locale:
So you have to change the setting in Control Panel to see the results.

A simple Google search brought me to this list of collations, most of them available starting with SQL Server 2008 R2.
Your collation should be Latin1_General_100_
Some more hints:
The SQL Server instance has a default collation, which is the standard collation for all new databases and, very importantly, for tempdb.
The best option, if it works for you, is to install SQL Server with the appropriate collation.
SQL Server also lets you define a default collation at the database level. But this can lead to serious trouble if you work with CREATE TABLE #table and use WHERE, ORDER BY, GROUP BY, or JOIN (any comparison) on character columns, because temp tables take the collation of tempdb rather than that of your database.
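The trouble can be reproduced with a sketch like the following, assuming it runs inside a (non-contained) user database whose default collation differs from that of tempdb. All object names are hypothetical:

```sql
-- Permanent table: character columns get the database default collation.
CREATE TABLE dbo.Person (Name VARCHAR(50) NOT NULL);

-- Temp table: character columns get the tempdb default collation.
CREATE TABLE #Names (Name VARCHAR(50) NOT NULL);

INSERT INTO dbo.Person (Name) VALUES ('Abebe');
INSERT INTO #Names     (Name) VALUES ('abebe');

-- When the two collations differ, this comparison typically fails with
-- "Cannot resolve the collation conflict ... in the equal to operation."
SELECT p.Name
FROM dbo.Person AS p
JOIN #Names     AS n ON n.Name = p.Name;

DROP TABLE #Names;
DROP TABLE dbo.Person;
```

If the database and tempdb happen to share a collation, the join simply succeeds, which is why the problem often only surfaces after a move to another server.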
Last but not least, you can specify a collation at the statement level too. This means a lot of typing and rather hard-to-read code, but offers the best control.
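As an illustration of statement-level control, the COLLATE clause can be attached to individual expressions. The sample values and the chosen collations below are assumptions made for the sketch:

```sql
-- Case-insensitive match, regardless of the database default.
SELECT Name
FROM (VALUES (N'apple'), (N'Banana'), (N'cherry')) AS t(Name)
WHERE Name = N'BANANA' COLLATE Latin1_General_100_CI_AS;

-- Binary (code point) ordering for this query only.
SELECT Name
FROM (VALUES (N'apple'), (N'Banana'), (N'cherry')) AS t(Name)
ORDER BY Name COLLATE Latin1_General_100_BIN2;
```

The explicit COLLATE takes precedence over the column or literal defaults, so each query behaves the same on any server.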
How the output of a query is displayed is a different issue. It depends on the output window (or whatever program you use to display query results). In this case it might be necessary to use the appropriate encoding and character set, depending on the tools you are working with.

Related

Why does SSDT Schema Compare show collation as a difference?

I have a Visual Studio Database project (SQL Server) with tables, stored procedures, etc. The tables have collations defined, for example:
CREATE TABLE [dbo].[TestTable]
(
[TestColumn] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
);
The database default collation is also SQL_Latin1_General_CP1_CI_AS.
I use sqlpackage to publish, with ScriptDatabaseCollation set to True.
When I modify the table from either direction (for example, by adding a new column) and use the SSDT compare tool, it shows the collation as a difference, even though "Ignore collation" is set to True.
Another interesting point is that when I click Generate Script, the script doesn't contain any collation modifications, just the new column.
It's even worse when I compare in the other direction (update the DB directly and compare from the DB to the local project), because it updates my file and removes the collation.
System information:
SSDT Version 17.0.62204.01010
MSSQL Server Express 15.0.4153.1
Visual Studio Professional 2022 17.2.2
Does anybody know how I can solve this problem?
I can only presume that SSDT is trying to remove "excessive" DDL that it thinks is unnecessary. Since your column's collation is the same as that of the project, repeating it doesn't really make much sense (at least from SSDT's point of view).
You will probably appreciate this behaviour if and when you have multiple instances of your database with different default collations. Speaking of which, I hope you know what you are doing, choosing a very old, problematic SQL collation as a default for your system.
Having said that, SSDT doesn't always remove COLLATE clauses from your DDL. If you specify a column collation that differs from the project's default, it won't disappear after schema comparison (assuming both source and target have the same one). In one of my recent projects, for example, I needed a couple of columns to be case-sensitive, so I set them to Latin1_General_100_CS_AS in SSDT. These clauses didn't go anywhere after several months of development work.
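A column-level override of the kind described might look like the sketch below. The table and column names are hypothetical, and only the collation name comes from the answer:

```sql
CREATE TABLE dbo.ApiToken
(
    TokenId   INT         NOT NULL PRIMARY KEY,
    -- Explicit collation that differs from the project/database default.
    TokenCode VARCHAR(20) COLLATE Latin1_General_100_CS_AS NOT NULL,
    -- No COLLATE clause: inherits the database default collation.
    Label     VARCHAR(50) NOT NULL
);

INSERT INTO dbo.ApiToken (TokenId, TokenCode, Label)
VALUES (1, 'AbC', 'first'),
       (2, 'abc', 'second');

-- The column's case-sensitive collation applies to the comparison,
-- so this is expected to match only TokenId 2.
SELECT TokenId
FROM dbo.ApiToken
WHERE TokenCode = 'abc';

DROP TABLE dbo.ApiToken;
```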
If, for some unknown reason, it is absolutely paramount for you to keep these collate clauses in your code, you may set the project's default collation to something else. This will prevent SSDT from cleaning up the noise. However, you need to be careful with schema comparison and DACPAC deployment. In the former, you have the following options:
"Compare using target collation" (cleared by default),
"Ignore column collation" (cleared by default),
"Verify collation compatibility" (set by default).
(Not sure about the latter, as I never used it.)
However, going to the Schema Compare settings dialog every time you need to compare schemas will soon become too tedious to bear. I would recommend just agreeing with SSDT and removing the stuff you don't really need.

SQL Server collation -- closest to utf-8?

I can't seem to find a way to set the default collation of a database to utf(ish).
For example, in MySQL the default UTF collation is called utf8_general_ci. Is there something similar for SQL Server? Also, why does it use Latin1 as the default?
According to https://learn.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-ver15#utf8, you add "_UTF8" to the collation name to enable use of UTF8. (SQL Server 2019 is required.) The example given is to change LATIN1_GENERAL_100_CI_AS_SC to LATIN1_GENERAL_100_CI_AS_SC_UTF8.
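A short sketch of what that looks like in practice, assuming SQL Server 2019 or later and a client such as SSMS or sqlcmd that understands the GO batch separator. The database and table names are hypothetical:

```sql
-- Hypothetical database created with a UTF-8 enabled default collation.
CREATE DATABASE Utf8Demo
COLLATE Latin1_General_100_CI_AS_SC_UTF8;
GO

USE Utf8Demo;
GO

-- With a UTF-8 collation, VARCHAR columns store their data as UTF-8.
CREATE TABLE dbo.Greeting
(
    Id  INT          NOT NULL PRIMARY KEY,
    Txt VARCHAR(100) NOT NULL
);

INSERT INTO dbo.Greeting (Id, Txt)
VALUES (1, N'ሰላም');

-- DATALENGTH reports bytes, so multi-byte characters count more than once.
SELECT Txt, DATALENGTH(Txt) AS Bytes
FROM dbo.Greeting;
```

The length in a VARCHAR(n) declaration is a byte count, so columns holding non-Latin text may need a larger n than the character count suggests.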
If you will be migrating an existing database from an older version, I believe extra care is required to ensure collation conversion is handled properly. There can be side effects from the change in sorting. Also, existing table definitions will keep their original collation. This might be an issue when creating new tables, which will use the new collation by default.

SonarQube: Is the Collation for the Database or the Instance?

According to the SonarQube documentation "Installing the Server" (https://docs.sonarqube.org/display/sonar/installing+the+server), for a Microsoft SQL Server host, "collation MUST be case-sensitive (CS) and accent-sensitive (AS)."
The documentation is not clear whether the collation must be set for the SQL Server instance or for the database.
If the collation for the SQL Server (and specifically for tempdb) is "accent insensitive" and the database collation is "accent sensitive", does SonarQube accommodate this configuration?
Since the documentation is ambiguous (they might not use SQL Server enough to know the different levels where Collation can be set), the only two ways to get the answer here are:
Contact their community: https://www.sonarqube.org/community/feedback/. This is the best choice.
Install it on an Instance that has an accent insensitive default Collation and test it out. No reason not to try this.
Whether or not SonarQube handles this properly depends on how it was coded. They could be JOINing on string columns in temporary tables and any difference in Collation between the Database and Instance could potentially cause an error, but only if they are not specifically declaring the Collation when creating the temp tables.
Also, it is possible that their app needs the accent sensitivity because they have some variable names and/or cursor names and/or (less likely) GOTO label names that would equate under accent insensitivity but should otherwise be seen as different. Instance-level collation controls these areas and would hence affect the name resolution of those items. Of course, this would be easy to test for, since declaring two variables whose names differ under accent sensitivity will cause a parse error if they are close enough to be considered the same under accent insensitivity. Still, contact their community.
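The test described above could be sketched as follows. The variable names are made up for illustration, and the expected behaviour is taken from the answer rather than from SonarQube itself:

```sql
-- On an accent-sensitive instance these are two distinct variables and the
-- batch should run. On an accent-insensitive instance the second DECLARE is
-- expected to fail, because both names resolve to the same variable.
DECLARE @resume INT = 1;
DECLARE @résumé INT = 2;

SELECT @resume AS Plain, @résumé AS Accented;
```

Running this in any database on the instance in question shows how the instance-level collation treats such names.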

Does sqlserver collation mean column names must be correct case? And how to deal with that

In SQL Server (2000 or 2005) is it possible to set the database or server collation so that identifier names (tables, columns, etc) need to be in the correct case? If so, is that true of all case-sensitive collations or is it a separate setting? (I've always thought of case-sensitivity applying to data, not to names of objects).
Presumably this would break an application if its stored procs and queries weren't written with consistent case? Is there a way to deal with this without having to ensure all queries use the correct case, such as setting the collation of a database connection?
I'm looking at this from the point of view of having an existing application which probably has inconsistently cased sql code in it, and I'm wanting to be able to run it against databases with different collations. What settings would I need or what set of database/server collations could I not use the application with?
The collation is what determines whether your queries are case-insensitive. So the only way to ensure that your schema will work across multiple environments is to write your queries with consistent case, as if everything were case-sensitive. If your queries are not consistent, then your collation MUST be case-insensitive; otherwise they will not work.
http://msdn.microsoft.com/en-us/library/aa174903(SQL.80).aspx
One thing to note is that once you've set up your SQL Server environment with a certain collation, you CANNOT change it without creating a NEW SQL Server instance. So Case-Insensitive is usually the way to go. And then strive to have consistency in your queries.
Once a collation is set it applies to both data and metadata, I believe.
Collation is set once in earlier versions of SQL Server, but in 2005 and beyond you can set it per object as each one is created.
The database default collation determines whether objects within the database are treated in a case-sensitive way in queries. This applies to all object names: tables, columns, etc.
If your application code comes from a case-insensitive collation database, it may not run on a case-sensitive collation database if an object is misreferenced (you would get a message when you attempted to run the statement or create the stored procedure, or in a stored-proc architecture, you would catch all these pretty quickly unless you had a significant amount of dynamic SQL).
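A small sketch of a misreferenced object under a case-sensitive database default. The database and table names are hypothetical, and the GO separators assume SSMS or sqlcmd:

```sql
-- Hypothetical database with a case-sensitive default collation.
CREATE DATABASE CaseDemo COLLATE Latin1_General_100_CS_AS;
GO

USE CaseDemo;
GO

CREATE TABLE dbo.Customer
(
    CustomerId INT          NOT NULL,
    Name       NVARCHAR(50) NOT NULL
);
GO

-- Matches the declared case exactly: works.
SELECT CustomerId FROM dbo.Customer;
GO

-- Different case: expected to fail with an "Invalid object name" error,
-- because identifiers are compared using the database collation.
SELECT customerid FROM dbo.customer;
GO
```

The same two queries both succeed in a database whose default collation is case-insensitive.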
Remember that even if your code runs, individual columns can be set with collations that differ from the database's, so it's always possible that with a differing collation your code will behave unexpectedly (for instance, GROUP BY behaves differently).
You can set collation for each object, and set a default for the database and server as well.
How to deal with it? You need to enforce standards here. You can easily get yourself tangled up when different people write with different case.
The collation also applies to data, so under a case-sensitive collation "bob" != "Bob".
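The effect on data can be seen without creating any tables. This sketch uses explicit COLLATE clauses so it behaves the same on any server; the collation names are chosen only for illustration:

```sql
-- Same comparison under a case-insensitive and a case-sensitive collation.
SELECT
    CASE WHEN N'bob' = N'Bob' COLLATE Latin1_General_100_CI_AS
         THEN 'equal' ELSE 'different' END AS CaseInsensitive,
    CASE WHEN N'bob' = N'Bob' COLLATE Latin1_General_100_CS_AS
         THEN 'equal' ELSE 'different' END AS CaseSensitive;

-- GROUP BY follows the collation too: the case-sensitive collation keeps
-- the three spellings in separate groups.
SELECT Name COLLATE Latin1_General_100_CS_AS AS NameKey,
       COUNT(*) AS Cnt
FROM (VALUES (N'bob'), (N'Bob'), (N'BOB')) AS t(Name)
GROUP BY Name COLLATE Latin1_General_100_CS_AS;
```

Swapping the CS collation for the CI one in the second query collapses the three values into a single group.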

SQL Server Collation Conflict

We are transferring data from one SQL Server to another, but when the schema is compared and synchronised, the following error is received. We are using Redgate SQL Compare to do this.
Cannot resolve collation conflict for equal to operation
The base SQL Server is SQL_Latin1_General_CP1_CI_AS and the destination server is Latin1_General_CI_AS.
SQL Compare has an option to ignore collations. Look under the tab "options" in your compare project configuration.
Is your problem with the SQL Compare utility, or a worry that different server collations will lead to problems?
You could change the collation of the destination server to match the base server.
If that is not possible, then make the collation of the databases on each server match. Your only real problem is then likely to be any temporary tables you create, since they take the default collation of the server / tempdb. So long as you create each temporary table explicitly (i.e. don't create it using SELECT * INTO #TEMP FROM MyTable) and explicitly assign a collation to any varchar/text columns, you should be OK.
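One way to follow that advice is COLLATE DATABASE_DEFAULT in the temp table definition, which picks up the collation of the current database rather than tempdb. In this sketch MyTable stands in for the table named in the text, and its columns are assumptions:

```sql
-- Hypothetical source table standing in for MyTable.
CREATE TABLE dbo.MyTable
(
    Id   INT          NOT NULL,
    Name VARCHAR(100) NOT NULL
);
INSERT INTO dbo.MyTable (Id, Name) VALUES (1, 'Alpha'), (2, 'Beta');

-- Explicitly created temp table with an explicit collation on the
-- character column, so it matches the current database, not tempdb.
CREATE TABLE #Temp
(
    Id   INT          NOT NULL,
    Name VARCHAR(100) COLLATE DATABASE_DEFAULT NOT NULL
);

INSERT INTO #Temp (Id, Name)
SELECT Id, Name
FROM dbo.MyTable;

-- Both sides of the comparison now share one collation.
SELECT m.Id
FROM dbo.MyTable AS m
JOIN #Temp       AS t ON t.Name = m.Name;

DROP TABLE #Temp;
DROP TABLE dbo.MyTable;
```

A named collation works in the same position if the permanent columns use something other than the database default.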
The way I overcome this is to generate the scripts via SQL Compare and then strip out (or replace) the Collation specific code. This is relatively fast and easy to do, and finally I manually apply the scripts to the destination server/ database.
Sounds like the collation settings for the server are different.
How are you transferring the data, do you perform a database restore on your new platform?
Either way, you need to ensure that the same collation is used on your new environment as is currently in place in your source environment.
Hope this makes sense, let me know if you need further assistance.
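To compare the two environments, the collation at each level can be read with built-in functions and catalog views. This sketch is meant to be run in the database in question on both servers:

```sql
-- Instance, current database, and tempdb collations side by side.
SELECT
    SERVERPROPERTY('Collation')                AS ServerCollation,
    DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation,
    DATABASEPROPERTYEX('tempdb', 'Collation')  AS TempdbCollation;

-- Columns whose collation differs from the database default.
SELECT
    OBJECT_SCHEMA_NAME(c.object_id) AS SchemaName,
    OBJECT_NAME(c.object_id)        AS TableName,
    c.name                          AS ColumnName,
    c.collation_name                AS ColumnCollation
FROM sys.columns AS c
JOIN sys.tables  AS t ON t.object_id = c.object_id
WHERE c.collation_name IS NOT NULL
  AND c.collation_name <> CAST(DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS sysname)
ORDER BY SchemaName, TableName, ColumnName;
```

Any rows from the second query are candidates for a collation conflict when joined against columns that use the default.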
"Ignore collations" is definitely not going to work, for the reason stated above. The problem happens when migrating objects like views and stored procedures that use JOIN clauses on text fields that have differing collations.
If someone changes the default collation on the server and the column on the other side of the JOIN uses a specific collation, you've caused this issue. And it would happen in SQL Compare as well as if you just manually scripted the object in SSMS and moved it yourself.
There are two roads to fixing it - you could specify a COLLATE clause on the join and explicitly state the collation you want to use, or you could change the destination database default collation to match the source.
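The first road can be sketched with two temp tables that use the collations from the question. The table names, column names, and sample values are hypothetical:

```sql
CREATE TABLE #Src (Code VARCHAR(10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL);
CREATE TABLE #Dst (Code VARCHAR(10) COLLATE Latin1_General_CI_AS NOT NULL);

INSERT INTO #Src (Code) VALUES ('A1'), ('B2');
INSERT INTO #Dst (Code) VALUES ('a1'), ('C3');

-- Without a COLLATE clause, "d.Code = s.Code" raises the collation conflict
-- error. Stating the collation on one side resolves it.
SELECT s.Code AS SourceCode, d.Code AS DestCode
FROM #Src AS s
JOIN #Dst AS d
  ON d.Code = s.Code COLLATE Latin1_General_CI_AS;

DROP TABLE #Src;
DROP TABLE #Dst;
```

For the second road, the statement is ALTER DATABASE with a COLLATE clause. It changes the default used for new columns, while existing columns keep their collation until they are altered individually.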
I'm afraid there is no SQL Compare "magic bullet" to solve this.
