SQL Server nchar column data equality - sql-server

I have a strange problem. In my SQL Server database there is a table containing an nchar(8) column. I have inserted several rows into it with different Unicode data in the nchar(8) column.
Now when I run a query like this:
select *
from table
where ncharColumnName = N'㜲㤵㠱㠷㔳'
it returns a row whose nchar(8) column contains '㤱㔱㄰〴㐰' as its Unicode data.
How does SQL Server compare Unicode data?

You should configure your column's (or the entire database's) collation so that the equality and LIKE operators work as expected: Simplified Chinese, or whatever matches your data.
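For example, a collation can be forced per query, or changed permanently on the column (a sketch; the table and column names are placeholders, and Chinese_PRC_CI_AS is just one plausible choice for Simplified Chinese data):

```sql
-- Force a specific collation for one comparison (names are hypothetical)
SELECT *
FROM dbo.MyTable
WHERE ncharColumn = N'㜲㤵㠱㠷㔳' COLLATE Chinese_PRC_CI_AS;

-- Or change the column's collation permanently
ALTER TABLE dbo.MyTable
ALTER COLUMN ncharColumn nchar(8) COLLATE Chinese_PRC_CI_AS;
```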

Related

DB collation VS Column collation when INSERTing

I've created two demo databases:
Server Collation - Hebrew_CI_AS
DB1 Collation - Hebrew_CI_AS
DB2 Collation - Latin1_General_CS_AS.
In DB2 I have one column with the Hebrew_CI_AS collation. I'm trying to insert Hebrew text into that column. The data type is nvarchar(250).
This is the sample script:
INSERT INTO [Table] (HebCol)
VALUES('1בדיקה')
When I run this on DB1, everything works fine.
On DB2, although the column has the Hebrew collation, I get question marks instead of the Hebrew text.
Why is the result different if the column collation is identical?
P.S.: I cannot add N before the text; in the real world an app is doing the inserts.
When using literal strings, the collation used is that of the database, not the destination column. As the collation of the database you are inserting into is Latin1_General_CS_AS, most of the characters in the literal string '1בדיקה' are outside that collation's code page; thus you get ? for those characters, as they are unknown.
As such, there are only two solutions to stop the ? appearing in the column:
1. Fix your application and define your literal string(s) as nvarchar, not varchar; you are, after all, storing an nvarchar, so it makes sense to pass a literal nvarchar.
2. Change the collation of your database to match your other database: Hebrew_CI_AS.
Technically there is a third, which is to use a UTF-8 collation if you are on SQL Server 2019, but such collations come with caveats that I think are out of scope for this question.
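The difference is easy to see in a quick test (a sketch; run it in a database whose default collation is Latin1_General_CS_AS):

```sql
DECLARE @v varchar(250)  = '1בדיקה';   -- varchar literal: the Hebrew characters fall outside the code page and become '?'
DECLARE @n nvarchar(250) = N'1בדיקה';  -- the N prefix makes it an nvarchar literal, so the text survives
SELECT @v AS without_N, @n AS with_N;
```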

PostgreSQL VARCHAR column formatting a UNIQUEIDENTIFIER value coming from a SQL Server column

There is a transfer happening between two databases.
The first one is SQL Server; the second one is PostgreSQL.
I have a column that has a UNIQUEIDENTIFIER in SQL Server, and it sends data to a VARCHAR column in PostgreSQL. The way the code is implemented expects that column to be a varchar/string.
The issue is that the data gets transferred to that column, but has some formatting issues.
The SQL Server UNIQUEIDENTIFIER value: 27E66FD9-79B8-4342-92A9-3CA87E497E69
The Postgresql VARCHAR value: b'27E66FD9-79B8-4342-92A9-3CA87E497E69'
Obviously, I don't want the extra b'' in there. Is there a way to change this in the database without modifying the data type?
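The b'...' wrapper suggests a Python bytes value was stringified on the way in, so the cleaner fix is to decode the bytes to text in the transfer code; but already-stored values can be cleaned in place with a one-off UPDATE (a sketch; my_table and guid_col are hypothetical names):

```sql
-- Strip a leading b' and trailing ' from affected rows (PostgreSQL)
UPDATE my_table
SET    guid_col = regexp_replace(guid_col, '^b''(.*)''$', '\1')
WHERE  guid_col LIKE 'b''%''';
```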

Full Text not indexing varbinary column (with html)

I have a table with HTML data that I want to search using a full-text index via the html filter.
So I created an index:
CREATE FULLTEXT CATALOG myCatalog AS DEFAULT
CREATE FULLTEXT INDEX ON myTable (Body TYPE COLUMN Filetype)
KEY INDEX PK_myTable
Body is a varbinary(max) column with HTML. The Filetype column is a computed column that returns '.html'.
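Such a type column can be defined roughly like this (a sketch; the constant just tells the indexer which filter to apply to each row's Body value):

```sql
-- Hypothetical definition of the computed type column
ALTER TABLE myTable
ADD Filetype AS CAST('.html' AS nvarchar(10));
```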
No results are being returned.
I verified that the .html filter is installed. The full-text index is also installed properly and works fine if I convert the column to nvarchar and create just a "plain text" index (not html).
No errors in the SQL log or FTS log.
The keywords table is just empty!
SELECT *
FROM sys.dm_fts_index_keywords
(DB_ID('myDatabase'), OBJECT_ID('myTable'))
All it returns is the "END OF FILE" symbol.
It says "document count 35", which means the documents were processed, but no keywords were extracted.
P.S.: I have SQL Server 2012 Express Edition (with all advanced features, including full text). Can this be the reason? But again, the "plain" full-text search works just fine!
P.P.S.: I asked my coworker to test this on SQL Express 2016: same result. Tried on our production "Enterprise" edition server: same.
UPDATE
OK, it turns out the full-text index DOES NOT support Unicode in varbinary columns. When I converted the column to non-Unicode (by converting it to nvarchar, then to varchar, and then back to varbinary), it started working.
Anyone knows any workarounds?
OK, so it turns out the full-text index DOES support Unicode data in varbinary, but pay attention to this:
If your varbinary column is created from nvarchar, be sure to include the 0xFFFE Unicode signature (the UTF-16 LE byte order mark) at the beginning.
For example, I'm using a computed column for full text index, so I had to change my computed column to this:
alter table myTable
add FTS_Body as 0xFFFE + (CAST(HtmlBody as VARBINARY(MAX)))
-- HtmlBody is my nvarchar column that contains the HTML

String variable is truncated when inserted into an nvarchar(max) column

I take XML from a web service and it's very long (over 1 million characters). I put the XML in an SSIS variable.
I want to put the raw XML from the variable into a SQL Server 2012 table. The table column is nvarchar(max).
From the SQL task I use a simple
Insert (xml) values (#variable)
However, when I look at the column length in SQL Server, only 500k characters are there!
Why is this?
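Before assuming data loss, it is worth checking how the length is being measured: SSMS grid and text output truncate long values for display, so the stored length should be checked server-side (a sketch; the table and column names are placeholders):

```sql
SELECT LEN(xmlCol)        AS char_count,  -- characters actually stored
       DATALENGTH(xmlCol) AS byte_count   -- bytes: nvarchar uses 2 per character
FROM dbo.myTable;
```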

2 different collations conflict when merging tables with SQL Server?

I have DB1, which has a Hebrew collation.
I also have DB2, which has the Latin1_General collation.
I was asked to merge a table (write a query) between DB1.dbo.tbl1 and DB2.dbo.tbl2.
I could write in the query
insert into ...SELECT Col1 COLLATE Latin1_General_CI_AS...
But I'm sick of doing it.
I want to make both DBs/tables use the same collation so I don't have to write COLLATE every time.
The question is:
Should I convert Latin -> Hebrew or Hebrew -> Latin?
We need to store everything from everywhere (and all our text columns are nvarchar(x)).
And if so, how do I do it?
If you are using Unicode data types (nvarchar(x)) in the resulting database, then you can omit COLLATE in the INSERT. SQL Server will convert data from your source collation to Unicode automatically, so you should not need to convert anything if you are inserting into an nvarchar column.
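So a cross-database insert into an nvarchar column needs no COLLATE clause at all (a sketch; the database, table, and column names are placeholders):

```sql
INSERT INTO DB1.dbo.tbl1 (Col1)
SELECT Col1          -- implicitly converted from the source collation to Unicode
FROM DB2.dbo.tbl2;
```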
