I have a table with two varchar columns, first_name and last_name.
Now I need to convert these columns to nvarchar in order to support UTF-8.
I chose the nvarchar data type in SSMS for these columns, but when I try to enter some UTF-8 data, my symbols are converted to question marks. For example, if I input йцукен (Ukrainian), it is converted to ??????.
What is the problem and how to fix it?
Thanks.
When you want to insert nvarchar literals into the database table, you must use the N'..' prefix.
So use
INSERT INTO dbo.YourTable(First_Name)
VALUES(N'йцукен')
so that this string will be treated as a Unicode string.
Without the N'..' notation, the literal is a non-Unicode (varchar) string, so any character that the database's default code page cannot represent is converted to ? before it reaches the column.
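The loss can be reproduced outside SQL Server. The following is a minimal Python sketch, not T-SQL: it assumes the database's default code page is Windows-1252 (an assumption, the real default depends on the database collation) and uses Python's cp1252 codec as a stand-in for the server's conversion. The helper name to_code_page is hypothetical.

```python
def to_code_page(text: str, code_page: str = "cp1252") -> str:
    """Simulate treating text as a non-Unicode string in the given code page.

    Characters the code page cannot represent are replaced with '?',
    which is what a literal without the N prefix goes through.
    """
    return text.encode(code_page, errors="replace").decode(code_page)


print(to_code_page("йцукен"))  # Cyrillic is not in cp1252, so every letter is lost
print(to_code_page("Smith"))   # plain Latin letters survive unchanged
```

Once the characters have been replaced, the original text cannot be recovered from the stored value, which is why the fix has to happen at insert time.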
Related
I have many of my database columns defined as VARCHAR(255) and now would like to input unicode characters (especially the € sign).
Inserting data doesn't show the unicode symbols, but if I update my insert statement to use NVARCHAR parameters, it works. But why? I didn't change the column definition from VARCHAR to NVARCHAR.
To support these symbols, should I just change all parameters from VARCHAR to NVARCHAR, or should I also update the column definitions to NVARCHAR(255)?
The VARCHAR data type doesn't hold arbitrary Unicode characters; it can only store characters that exist in the code page of the column's collation. You must change the VARCHAR(255) data type to NVARCHAR(255) in order to hold Unicode characters.
Also update the column definitions to NVARCHAR(255).
Use NVARCHAR or change your database character set (export, reinstall with a character set that supports the euro sign like WE8ISO8859P15 or AL32UTF8, and import).
Due to this I have a table full of varchar values (e.g., 戦艦å¸å›½) that I need to convert to proper Unicode nvarchar values (e.g., 戦艦帝国). How can I do that within T-SQL?
I am trying to insert Ḩāfiz̧ Moghul into a SQL database column with nvarchar(100) set as the datatype.
For some reason it is replacing the first letter with a ?
How do I fix this?
If you want to insert Unicode characters as string literals in a SQL statement, you must prefix the string with an N character:
INSERT INTO dbo.YourTable(UnicodeColumn)
VALUES (N'Ḩāfiz̧ Moghul');
If you omit the N prefix, the string will be converted to non-Unicode Varchar before being inserted.
I was a little surprised that I was able to store a Ukrainian string in a varchar column.
My table is:
create table delete_collation
(
text1 varchar(100) collate SQL_Ukrainian_CP1251_CI_AS
)
and using this query I am able to insert:
insert into delete_collation
values(N'використовується для вирішення квитки')
but when I remove the N prefix, the select statement shows ??????.
Is it okay or am I missing something in understanding unicode and non-unicode with collate?
From MSDN:
Prefix Unicode character string constants with the letter N. Without
the N prefix, the string is converted to the default code page of the
database. This default code page may not recognize certain characters.
UPDATE:
Please see these similar questions:
What is the meaning of the prefix N in T-SQL statements?
Cyrillic symbols in SQL code are not correctly after insert
sql server 2012 express do not understand Russian letters
To expand on MegaTron's answer:
Using collate SQL_Ukrainian_CP1251_CI_AS, SQL Server is able to store Ukrainian characters in a varchar column by using code page 1251.
However, when you specify a string without the N prefix, that string is first converted to the database's default non-Unicode code page, and only afterwards to the column's code page. If the default code page has no Cyrillic letters, they are already lost at that first step, and that is why you see ??????.
So it is completely fine to use varchar and collate as you do, but you must always include the N prefix when sending strings to the database, to avoid the intermediate conversion to the default (non-Ukrainian) code page.
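The two-step conversion described above can be sketched in Python. This is an illustration under stated assumptions, not SQL Server's actual implementation: the database default code page is assumed to be Windows-1252, Python's cp1251 codec stands in for the column collation's code page, and the function name store is hypothetical.

```python
DEFAULT_CODE_PAGE = "cp1252"  # assumed database default; the real one depends on the DB collation
COLUMN_CODE_PAGE = "cp1251"   # code page behind SQL_Ukrainian_CP1251_CI_AS


def store(text: str, unicode_literal: bool) -> str:
    """Return what a varchar column with a CP1251 collation would hold.

    unicode_literal=True models N'...': the text goes straight to the column.
    unicode_literal=False models '...': the text passes through the default
    code page first, where unsupported characters become '?'.
    """
    if not unicode_literal:
        text = text.encode(DEFAULT_CODE_PAGE, errors="replace").decode(DEFAULT_CODE_PAGE)
    stored_bytes = text.encode(COLUMN_CODE_PAGE, errors="replace")
    return stored_bytes.decode(COLUMN_CODE_PAGE)


print(store("вирішення", unicode_literal=True))   # survives: cp1251 has every letter
print(store("вирішення", unicode_literal=False))  # question marks: lost in the cp1252 step
```

The column itself is capable of holding the text in both cases; only the path the literal takes to reach it differs.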
Below is my code sample.
DECLARE @a TABLE (a VARCHAR(20));
INSERT @a
(a)
VALUES ('中');
SELECT *
FROM @a;
I'm using SQL Server Management Studio to run it. My question is: why can I insert non-ASCII characters into a VARCHAR column and get them back correctly? As I understand it, the VARCHAR type is only for ASCII characters and NVARCHAR is for Unicode characters. Can anyone help explain this? I'm on Windows 7 with SQL Server 2014 Developer Edition.
The codepage used to store the varchar data varies by DB collation.
https://msdn.microsoft.com/en-us/library/ms189617.aspx
Varchar stores characters as bytes in the collation's code page, so your database may use a collation whose code page includes that character, or you may have gotten lucky with where your character falls in the code set.
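Whether a character fits in a varchar column therefore depends on the code page, which can be checked with a short Python sketch. It assumes, purely for illustration, that the database in the question uses a Simplified Chinese collation (code page 936); the question does not state the collation. The helper name fits is hypothetical, and Python's codecs stand in for SQL Server's code pages.

```python
def fits(text: str, code_page: str) -> bool:
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


for code_page in ("ascii", "cp1252", "cp936"):
    print(code_page, fits("中", code_page))

# In a double-byte code page such as 936, one character can take two bytes.
print(len("中".encode("cp936")))
```

Under that assumption the character round-trips through VARCHAR without being ASCII, while the same insert on a database with a Latin code page would store a question mark.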
You can find the ASCII and Extended ASCII character tables at the link below.
I don't believe '中' is an ASCII character.
www.asciitable.com