I want to use a non-European language collation on every column of a specific table. So:
ALTER TABLE dbo.MyTable
ALTER COLUMN MyColumn varchar(10) COLLATE Latin1_General_CI_AS NOT NULL;
But the Ç and Ü characters still remain in the cells.
I want to change them to C and U.
How can I do this?
Latin-1 contains Ç (199) and Ü (220). Even if it didn't, the column would still contain bytes with values 199 and 220 - you'd just see different glyphs instead.
Frankly, if your data is now, or could ever be, non-ASCII, you'd do well to consider nvarchar(...) instead of varchar(...).
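Note that a collation only changes how the stored bytes are compared and sorted; it never rewrites them, which is why Ç and Ü survive the ALTER. To actually fold accented characters to their base letters you need a transformation step. A minimal sketch of that folding, done outside the database in Python via Unicode decomposition (the function name is illustrative, not part of any library):

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Decompose accented letters (NFD), then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_accents("ÇÜ çü"))  # -> CU cu
```

On the T-SQL side, for a handful of known characters, chained REPLACE() calls in an UPDATE statement achieve the same thing.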
Related
Using Microsoft SQL Server Management Studio, I am having an issue with inserting some characters into a database.
I read that you have to add a collation to the column to allow for some characters.
I need some characters from the Czech language, so I added the Czech collation (Czech_100_CI_AS) to the column, but then some French characters were removed and cannot be entered.
Can I not have multiple collations on a column? That seems like it would be a weird limitation.
I tried this with a comma, but it gives an error on the comma:
ALTER TABLE dbo.TestingNames
ALTER COLUMN NameTesting VARCHAR(50) COLLATE Czech_100_CI_AS, French_CS_AS
Edit:
Ah, I misunderstood what COLLATE was meant to do. I didn't realize it selected a code page; I thought it was just an include.
Thanks, changing it to nvarchar seems to have worked :) I actually thought I was already using nvarchar /facepalm Thank you for pointing that out to me.
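To spell out the code-page point: a varchar column stores single-byte text in exactly one code page, chosen by its collation, while nvarchar stores UTF-16 and covers everything at once. A rough illustration using Python's codecs (Czech collations correspond to Windows code page 1250; 'œ' is a French character with no slot in it):

```python
czech = "Příliš žluťoučký"   # Czech text - fits code page 1250
french = "œuvre"             # 'œ' does not exist in code page 1250

# The Czech string round-trips through cp1250 unchanged:
assert czech.encode("cp1250").decode("cp1250") == czech

# The French string loses data - 'œ' becomes '?':
print(french.encode("cp1250", errors="replace"))  # b'?uvre'

# UTF-16 (what nvarchar stores) handles both at once:
both = czech + french
assert both.encode("utf-16-le").decode("utf-16-le") == both
```

So there is no way to attach two collations (two code pages) to one varchar column; the nvarchar switch sidesteps the limitation entirely.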
We have inserted more than 100,000 records through the Import Flat File functionality in SQL Server Management Studio. The import succeeded.
But some of the column values contained characters like é and ö.
These characters (ö, é) got converted to a replacement character while being stored in the SQL column.
Moreover, the SQL statement below does not return any results.
select * from Temp where column1 like '%%'
The data with these characters is displayed in the tables with a symbol (a question mark in a diamond).
Please help: how can I insert the data keeping the phonetic symbols intact?
Your data contains characters like é and ö, but when you look in the database, '?' is stored instead, right?
I think your database does not support all characters. I would recommend changing it to something like this:
character set: utf8
collation: utf8_general_ci
(Note: those are MySQL settings; in SQL Server the equivalent fix is to use an nvarchar column.)
Hope this helps, my friend :))
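As a hedged illustration of both failure modes described above (plain question marks, and the '�' replacement symbol from mis-decoded bytes), here is how the same damage looks with Python's codecs; the encodings are stand-ins for whatever the import actually used:

```python
text = "Gödel était"

# A target code page that lacks a character replaces it with '?':
print(text.encode("ascii", errors="replace"))  # b'G?del ?tait'

# Mojibake: UTF-8 bytes decoded as if they were Latin-1:
print(text.encode("utf-8").decode("latin-1"))  # GÃ¶del Ã©tait
```

The lasting fix on the SQL Server side is to import into nvarchar columns and tell the import wizard the file's real encoding.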
SELECT
*
FROM TEMP
WHERE SOUNDEX(COLUMN1) LIKE SOUNDEX('A')
I was a little surprised that I was able to store a Ukrainian string in a varchar column.
My table is:
create table delete_collation
(
text1 varchar(100) collate SQL_Ukrainian_CP1251_CI_AS
)
and using this query I am able to insert:
insert into delete_collation
values(N'використовується для вирішення квитки')
but when I remove the 'N' it shows ?????? in the select statement.
Is this okay, or am I missing something in my understanding of Unicode and non-Unicode with COLLATE?
From MSDN:
Prefix Unicode character string constants with the letter N. Without
the N prefix, the string is converted to the default code page of the
database. This default code page may not recognize certain characters.
UPDATE:
Please see these similar questions:
What is the meaning of the prefix N in T-SQL statements?
Cyrillic symbols in SQL code are not correctly after insert
sql server 2012 express do not understand Russian letters
To expand on MegaTron's answer:
Using collate SQL_Ukrainian_CP1251_CI_AS, SQL server is able to store ukrainian characters in a varchar column by using CodePage 1251.
However, when you specify a string without the N prefix, that string will be converted to the default non-unicode codepage before it is sent to the database, and that is why you see ??????.
So it is completely fine to use varchar and collate as you do, but you must always include the N prefix when sending strings to the database, to avoid the intermediate conversion to default (non-ukrainian) codepage.
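The two paths can be sketched with Python's codecs (cp1251 standing in for the SQL_Ukrainian_CP1251_CI_AS code page, cp1252 for a Western default database code page):

```python
ukr = "квитки"

# With the right code page - what the N'...' literal preserves all the
# way to the column - the text survives:
assert ukr.encode("cp1251").decode("cp1251") == ukr

# Without the N prefix the literal is first squeezed through the default
# code page; Cyrillic has no slots in cp1252, so every letter becomes '?':
print(ukr.encode("cp1252", errors="replace"))  # b'??????'
```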
We're running SQL 6.5 through ADO and we have the oddest problem.
This statement will start generating deadlocks:
insert clinical_notes ( NOTE_ID, CLIENT, MBR_ID, EPISODE, NOTE_DATE_TIME,
NOTE_TEXT, DEI, CARE_MGR, RELATED_EVT_ID, SERIES, EAP_CASE, TRIAGE, CATEGORY,
APPOINTMENT, PROVIDER_ID, PROVIDER_NAME )
VALUES ( 'NTPR3178042', 'HUMANA/PR', '999999999_001', 'EPPR915347',
'03-28-2011 11:25', 'We use á, é, í, ó, ú and ü (this is the least one we
use, but there''s a few words with it, like the city: Mayagüez).', 'APK', 'APK',
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL )
The trigger is the characters ú and ü appearing in the NOTE_TEXT column.
NOTE_TEXT is a text column.
There are indexes on
UNC_not_id
NT_CT_MBR_NDX
NT_REL_EVT_NDX
NT_SERIES_NDX
idx_clinical_notes_date_time
nt_ep_idx
NOTE_ID is the primary key.
What happens is after we issue this statement, if we issue an identical one, but with a new NOTE_ID value, we receive the deadlock.
As mentioned, this only happens when ú or ü is in NOTE_TEXT.
This is a test server and there is generally only one session accessing this table when the error occurs.
I'm sure it has something to do with character sets and such, but for the life of me I can't work it out.
Is the column (var)char-based or n(var)char-based? Are the values Unicode code points above 255, or are they 255 or below (ú is 250 and ü is 252)?
Try changing the column to a binary collation, just to see if that helps (it may shed light on the issue). I do NOT know if this works in SQL 2000 (though I can check on Monday), but you can try this to find what collations are available on your server:
SELECT * FROM ::fn_helpcollations()
Latin1_General_BIN should be in there somewhere.
Assuming you find a collation to try, you change the collation like so:
ALTER TABLE TableName ALTER COLUMN ColumnName varchar(8000) COLLATE Collation_Name_Here NOT NULL
Script out your table to learn the collation it's using now so you can set it back if that doesn't work or causes problems. Or use a backup. :)
One additional note is that if you're using unicode you do need an N before literal strings, for example:
SELECT N'String'
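To answer the 250/252 question above quickly, a sanity check (Python here, since the values are just Unicode/Latin-1 code points):

```python
# ú and ü are code points 250 and 252 - inside the single-byte range,
# so they fit in a (var)char column under a Latin-1-family code page.
for ch in "úü":
    print(ch, ord(ch), ch.encode("cp1252"))
# ú 250 b'\xfa'
# ü 252 b'\xfc'
```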
I have inserted some Chinese characters into the database. (The column name is NAME; the data type is VARCHAR2.)
My project name is 中文版测试 and I need to select the project by this name.
But.
In the Oracle database, 中文版测试 was stored with the name ÖÐÎÄ°æ²âÊÔ (if I understand right, my database character set is WE8ISO8859P1).
I want to convert these characters from the database (ÖÐÎÄ°æ²âÊÔ) back to Chinese characters (中文版测试), or to comparable values.
I try this:
select DIRNAME from MILLENNIUM.PROJECTINFO where UPPER(convert(NAME, 'AL32UTF8', 'we8iso8859p1')) = UPPER(convert('中文版测试', 'WE8MSWIN1252', 'AL32UTF8'));
I need to compare values from oracle with the name of the project.
Oracle settings:
NLS_CHARACTERSET WE8ISO8859P1 0
NLS_NCHAR_CHARACTERSET AL16UTF16 0
As Michael O'Neill already pointed out, it is not possible to store Chinese characters in character set WE8ISO8859P1. All unsupported characters are automatically replaced by ¿ (or some other placeholder).
BTW, WE8ISO8859P1 is different from WE8MSWIN1252 (see What is the exact difference between Windows-1252(1/3/4) and ISO-8859-1?), so your conversion does not work anyway.
The solution is to change the data type of column NAME to NVARCHAR2, or to migrate your database to UTF-8; see Character Set Migration and the Database Migration Assistant for Unicode Guide. In any case you should consider your data lost, or at least corrupted.
However, if your client application was configured incorrectly, then in certain circumstances it is possible to insert unsupported characters; see If we have US7ASCII characterset why does it let us store non-ascii characters?.
In such case you can try to repair your data as this:
ALTER TABLE PROJECTINFO ADD NAME_CN NVARCHAR2(100);
UPDATE PROJECTINFO SET NAME_CN = UTL_I18N.RAW_TO_NCHAR(UTL_I18N.STRING_TO_RAW(NAME), 'ZHS16CGB231280');
ALTER TABLE PROJECTINFO DROP COLUMN NAME;
ALTER TABLE PROJECTINFO RENAME COLUMN NAME_CN TO NAME;
select DIRNAME from MILLENNIUM.PROJECTINFO where NAME = '中文版测试';
but it may not work for all of your data.
Hence a (not recommended) workaround for your problem could be
select DIRNAME
from MILLENNIUM.PROJECTINFO
where UTL_I18N.RAW_TO_NCHAR(UTL_I18N.STRING_TO_RAW(NAME), 'ZHS16CGB231280') = '中文版测试';
You cannot take Chinese characters, insert them into a column that is bound to the WE8ISO8859P1 character set, and then ever select them again as Chinese characters. You lost information on your insert, and that lost information cannot be reconstituted.
In your case, if the NAME column were defined as NVARCHAR2, you could do an AL16UTF16-to-AL16UTF16 comparison in a subsequent SELECT. Or, even better, you would not need to convert and compare with AL16UTF16 at all, if your client tool is up to the task.
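The garbled name in the question is in fact recoverable mojibake, which is why the UTL_I18N round trip above can work at all. A sketch of the same round trip in Python, using gbk as a stand-in for Oracle's ZHS16CGB231280 and latin-1 for the Western single-byte interpretation:

```python
original = "中文版测试"

# The GBK bytes of the Chinese name, misread as Western single-byte text,
# produce exactly the string seen in the question:
mojibake = original.encode("gbk").decode("latin-1")
print(mojibake)  # ÖÐÎÄ°æ²âÊÔ

# The repair mirrors UTL_I18N.STRING_TO_RAW + RAW_TO_NCHAR: recover the
# raw bytes, then decode them with the Chinese character set:
assert mojibake.encode("latin-1").decode("gbk") == original
```

This only works when the client happened to smuggle the original GBK bytes into the column untouched; if the database genuinely replaced them with ¿, nothing can be recovered.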