Kurdish Sorani Letters sql server - sql-server

I am trying to create a database containing Kurdish Sorani Letters.
My Database fields has to be varchar cause of project is started that vay.
First I create database with Arabic_CI_AS
I can store all arabic letters on varchar fields but when it comes to kurdish letters for example
ڕۆ these special letters are show like ?? on the table after entering data, I think my collation is wrong. Have anybody got and idea for collation ?

With that collation, no, you need to use nvarchar and always prefix such strings with the N prefix:
CREATE TABLE dbo.floo
(
UseNPrefix bit,
a varchar(32) collate Arabic_CI_AS,
b nvarchar(32) collate Arabic_CI_AS
);
INSERT dbo.floo(UseNPrefix,a,b) VALUES(0,'ڕۆ','ڕۆ');
INSERT dbo.floo(UseNPrefix,a,b) VALUES(1,N'ڕۆ',N'ڕۆ');
SELECT * FROM dbo.floo;
Output:
UseNPrefix
a
b
False
??
??
True
??
ڕۆ
Example db<>fiddle
In SQL Server 2019, you can use a different SC + UTF-8 collation with varchar, but you will still need to prefix string literals with N to prevent data from being lost:
CREATE TABLE dbo.floo
(
UseNPrefix bit,
a varchar(32) collate Arabic_100_CI_AS_KS_SC_UTF8,
b nvarchar(32) collate Arabic_100_CI_AS_KS_SC_UTF8
);
INSERT dbo.floo(UseNPrefix,a,b) VALUES(0,'ڕۆ','ڕۆ');
INSERT dbo.floo(UseNPrefix,a,b) VALUES(1,N'ڕۆ',N'ڕۆ');
SELECT * FROM dbo.floo;
Output:
UseNPrefix
a
b
False
??
??
True
ڕۆ
ڕۆ
Example db<>fiddle
Basically, even if you are on SQL Server 2019, your requirements of "I need to store Sorani" and "I can't change the table" are incompatible. You will need to either change the data type of the column or at least change the collation, and you will need to adjust any code that expects to pass this data to SQL Server without an N prefix on strings.

Related

Why can I store an Ukrainian string in a varchar column?

I got a little surprised as I was able to store an Ukrainian string in a varchar column .
My table is:
create table delete_collation
(
text1 varchar(100) collate SQL_Ukrainian_CP1251_CI_AS
)
and using this query I am able to insert:
insert into delete_collation
values(N'використовується для вирішення квитки')
but when I am removing 'N' it is showing ?????? in the select statement.
Is it okay or am I missing something in understanding unicode and non-unicode with collate?
From MSDN:
Prefix Unicode character string constants with the letter N. Without
the N prefix, the string is converted to the default code page of the
database. This default code page may not recognize certain characters.
UPDATE:
Please see a similar questions::
What is the meaning of the prefix N in T-SQL statements?
Cyrillic symbols in SQL code are not correctly after insert
sql server 2012 express do not understand Russian letters
To expand on MegaTron's answer:
Using collate SQL_Ukrainian_CP1251_CI_AS, SQL server is able to store ukrainian characters in a varchar column by using CodePage 1251.
However, when you specify a string without the N prefix, that string will be converted to the default non-unicode codepage before it is sent to the database, and that is why you see ??????.
So it is completely fine to use varchar and collate as you do, but you must always include the N prefix when sending strings to the database, to avoid the intermediate conversion to default (non-ukrainian) codepage.

'LIKE' keyword is not working in sql server 2008

I have bulk-data in SQL-Server table. One of the fields contains following data :
'(اے انسان!) کیا تو نہیں جانتا)'
Tried:
SELECT * from Ayyat where Data like '%انسان%' ;
but it is showing no-result.
Plese use N before string if language is not a english:
SELECT * from Ayyat where Data like N'%انسان%' ;
If you're storing urdu, arabic or other language except english in your database first you should convert your database into another format and then also table.
first you've to convert your database charset and also collation alter your database
ALTER DATABASE database_name CHARACTER SET utf8 COLLATE utf8_general_ci
then convert your table
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci
after this execute normal your query
SELECT * FROM table_name WHERE Datas LIKE '%انسان%'
Note: if you not convert your database and table other languages and special characters will be changed into question marks.

Inserting arabic characters into sql server database from grails

I have a grails application that is intended to insert arabic characters from interface to sql server database. First I did it with field type varchar but it appears ???????. Then I found that the field type should be nvarchar but with, application simply fails to insert any row by giving the following error,
java.sql.SQLException: Cannot insert the value NULL into column 'id', table 'smsCampaigner.dbo.message'; column does not allow nulls. INSERT fails.
I know that this is constraints error that id cannot be null but this was not the case before changing the field type. Now I want to know whether we can specify fro config.groovy which encoding should be used in database ? any other type of help will be appreciated.
here you go..
when you create the domain class use String type for that column and that's it. you could insert any character it will accept.
for example TestDomain.groovy
class TestDomain {
String column1
}
In mysql you can set up the collation of the database/table/column to use utf8_unicode_ci and if I am not wrong that is valid to save arabic chars. Not sure how to do it with SQL Server but it should be something similar.
Change the collation of the database and tables and columns to Arabic_CI_AS
In Sql Server management studio (SSMS), Execute :
ALTER DATABASE DB06 COLLATE Arabic_CI_AS
where DB06 is the name of the database.
right-click on table, then click design, then select the desired column, in its properties, choose the collation to Arabic_CI_AS
by the way, the error :
Cannot insert the value NULL into column
is not related to Arabic, this is because you are trying to insert NULL in the column that does not allow NULL.

2 different collations conflict when merging tables with Sql Server?

I have DB1 which has a Hebrew collation
I also have DB2 which has latin general collation.
I was asked to merge a table (write a query) between DB1.dbo.tbl1 and DB2.dbo.tbl2
I could write in the wuqery
insert into ...SELECT Col1 COLLATE Latin1_General_CI_AS...
But I'm sick of doing it.
I want to make both dbs/tables to the same collation so I don't have to write every time COLLATE...
The question is -
Should I convert latin->hebrew or Hebrew->latin ?
we need to store everything from everything. ( and all our text column are nvarachr(x))
And if so , How do I do it.
If you are using Unicode data types in resulted database - nvarchar(x), then you are to omit COLLATE in INSERT. SQL Server will convert data from your source collation to Unicode automatically. So you should not convert anything if you are inserting to nvarchar column.

text encodings in .net, sql server processing

I have an application that gets terms from a DB to run as a list of string terms. The DB table was set up with nvarchar for that column to include all foreign characters. Now in some cases where characters like ä will come through clearly when getting the terms from the DB and even show that way in the table.
When importing japanese or arabic characters, all I see are ????????.
Now I have tried converting it using different methods, first converting it into utf8 encoding and then back and also secondly using the httputility.htmlencode which works perfectly when it is these characters but then converts quotes and other stuff which I dont need it to do.
Now I accused the db designer that he needs to do something on his part but am I wrong in that the DB should display all these characters and make it easy to just query it and add to my ssearch list. If not is there a consistent way of getting all international characters to display correctly in SQL and VB.net
I know when I have read from text files I just used the Microsoft.visualbasic.textfieldparser reader tool with encoding set to utf8 and this would not be an issue.
If the database field is nvarchar, then it will store data correctly. As you have seen.
Somewhere before it gets to the database, the data is being lost or changed to varchar: stored procedure, parameters, file encoding, ODBC translation etc.
DECLARE #foo nvarchar(100), #foo2 varchar(100)
--with arabic and japanese and proper N literal
SELECT #foo = N'العربي 日本語', #foo2 = N'العربي 日本語'
SELECT #foo, #foo2 -- gives العربي 日本語
--now a varchar literal
SELECT #foo = 'العربي 日本語', #foo2 = 'العربي 日本語'
SELECT #foo, #foo2 --gives ?????? ???
--from my Swiss German keyboard. These are part of my code page.
SELECT #foo = 'öéäàüè', #foo2 = 'öéäàüè'
SELECT #foo, #foo2 --gives ?????? ???
So, apologise to the nice DB monkey... :-)
Always try to use NVARCHAR or NTEXT to store foreign charactesr.
you cannot store UNICODE in varchar ot text datatype.
Also put a N before string value
like
UPDATE [USER]
SET Name = N'日本語'
WHERE ID = XXXX;

Resources