Eastern Character Set Causes Problems For SQL Server 2012 - sql-server

I have a table with contents:
internalid foreignWord
1 បរិស្ថាន
2 ការអភិវឌ្ឍសហគមន៍
And its schema:
CREATE TABLE [dbo].[CE_testTable](
[internalid] [int] IDENTITY(1,1) NOT NULL,
[foreignWord] [nvarchar](50) NOT NULL
)
If I run:
SELECT TOP 1000 [internalid] ,[foreignWord] FROM CE_testTable where foreignWord = N'ការអភិវឌ្ឍសហគមន៍'
I get:
internalid foreignWord
1 បរិស្ថាន
2 ការអភិវឌ្ឍសហគមន៍
That is both rows. It should have returned only the row with "ការអភិវឌ្ឍសហគមន៍", which is "community development" in Cambodian.
The column is NVARCHAR and the literal in the WHERE clause has the N prefix. Any ideas?

Change the collation to Latin1_General_100_CI_AS.
You can specify collation for each column when you create the tables.
If you don't specify collation the columns will have the same collation that the database has.
CREATE TABLE [dbo].[CE_testTable](
[internalid] [int] IDENTITY(1,1) NOT NULL,
[foreignWord] [nvarchar](50) collate Latin1_General_100_CI_AS NOT NULL
)
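If the table definition cannot be changed, the same idea can be tried per query. The following is a minimal sketch rather than a tested fix: it assumes the table lives in the dbo schema, first inspects the collation currently assigned to each column, and then applies Latin1_General_100_CI_AS to the single comparison with an explicit COLLATE clause.

```sql
-- Inspect the collation currently used by each column of the table.
SELECT name, collation_name
FROM sys.columns
WHERE object_id = OBJECT_ID(N'dbo.CE_testTable');

-- Apply the version-100 collation to this one comparison only.
-- An explicit COLLATE takes precedence over the column's own collation.
SELECT internalid, foreignWord
FROM dbo.CE_testTable
WHERE foreignWord = N'ការអភិវឌ្ឍសហគមន៍' COLLATE Latin1_General_100_CI_AS;
```

A per-query COLLATE is likely to prevent an index seek on foreignWord, so changing the column collation is usually the better choice for large tables.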

Try the query without the N
SELECT [internalid],
[foreignWord]
FROM CE_testTable
WHERE foreignWord = 'ការអភិវឌ្ឍសហគមន៍'

It seems like SQL Server can't do what I'm asking of it.
The comment from Erland Sommarskog explains my situation. The data is stored correctly, and I can see the rows in the table, but comparisons may fail. They did. So it's a design-level problem. I can't have different collations on the table, so I can't compare. For me this is not a problem: only a PK was erroring, and I can work around that.

Related

Collate setting on a column in SQL Server table not working

I have a table in SQL Server 2008 database hosted on a shared web hosting. I cannot change the collation of the database because I don't have permissions.
When I created the table, I set the collation for the columns that I want but it doesn't do anything and I still see ???? when I query the table. I tried nvarchar as well and it didn't work.
The table:
CREATE TABLE [dbo].[T_Client]
(
[ClientID] [int] IDENTITY(1,1) NOT NULL,
[ClientName] [varchar](200) collate Hebrew_CI_AI null ,
[Address] [varchar](200) collate Hebrew_CI_AI null
)
You must ensure that the data is passed all the way to SQL Server using a format with compatible code points. Since you don't have Hebrew as your database or instance collation a varchar variable can't be used to store the data. So this
declare @d varchar(100) = 'שלום לך עולם' collate Hebrew_CI_AI
select @d
outputs
???? ?? ????
In this scenario you have to pass the value to the database as NVARCHAR
declare @d nvarchar(100) = N'שלום לך עולם' collate Hebrew_CI_AI
select @d
You could use a varchar column with a Hebrew collation to store the data, but you should just use an nvarchar column. Still use the collation to produce the desired sorting and comparison semantics.
The problem is your INSERT/UPDATE statements. Unless you pass those values as nvarchar, characters outside the database's collation will be lost. This means you need to declare your parameters as nvarchar. I would therefore suggest leaving the column collation alone, changing the columns to nvarchar, and using nvarchar throughout your code.
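The advice above can be sketched as follows. This is an illustration under assumptions, not the asker's actual schema: the table name T_Client_Demo is hypothetical, and the second INSERT is included only to contrast a literal with and without the N prefix.

```sql
-- Hypothetical demo table; the name T_Client_Demo is not from the original post.
CREATE TABLE dbo.T_Client_Demo
(
    ClientID   int IDENTITY(1,1) NOT NULL,
    ClientName nvarchar(200) COLLATE Hebrew_CI_AI NULL
);

-- The N prefix keeps the literal as Unicode all the way into the column.
INSERT INTO dbo.T_Client_Demo (ClientName) VALUES (N'שלום לך עולם');

-- Without the N prefix the literal is first converted through the code page
-- of the database's default collation. On a database whose default collation
-- is not Hebrew, the Hebrew characters are expected to be lost at that point,
-- even though the target column is nvarchar.
INSERT INTO dbo.T_Client_Demo (ClientName) VALUES ('שלום לך עולם');

SELECT ClientID, ClientName FROM dbo.T_Client_Demo;
```

The same rule applies to parameters sent from application code: they need to be typed as nvarchar, not varchar.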

SQLServer 2005 does not do the right thing

I have a problem with a LIKE query.
When I execute the SQL command, it does not do what I want.
My query:
select * from tbUsers where nUserID like N'%p%';
It does not return any results, although I know that '%p%' finds any values that have 'p' in any position.
My code to create the table:
Create table tbUsers(
iIDUser int identity(1,1) not null primary key,
nUserID nvarchar(50) null,
nPassWord nvarchar(50) null,
dDate datetime null,
nName nvarchar(50) null
)
INSERT INTO tbUsers(nUserID,nPassword,nName) VALUES('phuc','123456', 'Phuc Nguyen')
INSERT INTO tbUsers(nUserID,nPassword) VALUES('ngocanh','123456')
INSERT INTO tbUsers(nUserID,nPassword) VALUES('long','123456')
INSERT INTO tbUsers(nUserID,nPassword) VALUES('long%ngocanh','123456')
INSERT INTO tbUsers(nUserID,nPassword) VALUES('phuc nguyen','123456')
Please help me. Thank you.
Your problem may be the collation. If you need the Vietnamese collation for some reason, you can specify a different collation in the query itself, like this:
select *
from tbUsers
where nUserID collate SQL_Latin1_General_CP1_CI_AS like N'%p%';
If not, my recommendation is to re-create the database with the collation SQL_Latin1_General_CP1_CI_AS, since this query will be slow.
Also take into consideration that, if you have an index on the user column, a pattern with a leading % (as in '%p%') prevents an index seek. A pattern with only a trailing % (as in 'p%') can use the index. Look at the execution plan to check this.
If you want to stay with the Vietnamese collation, consider changing the collation only on the columns that need this type of functionality. This will help with performance.
To change the collation of a column, use
ALTER TABLE MyTable ALTER COLUMN Column1 [TYPE] COLLATE [NewCollation]
See this question for more details:
How to set collation of a column with SQL?
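Filled in for the table from the question, the template might look like the sketch below. It assumes tbUsers is in the dbo schema and that nUserID keeps its original type and nullability. ALTER COLUMN is rejected if the column is referenced by an index or constraint, so those would need to be dropped and re-created around the change.

```sql
-- Check which collation the column uses now.
SELECT name, collation_name
FROM sys.columns
WHERE object_id = OBJECT_ID(N'dbo.tbUsers')
  AND name = N'nUserID';

-- Change only this column. The data type and nullability must be restated.
ALTER TABLE dbo.tbUsers
    ALTER COLUMN nUserID nvarchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL;
```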
Since you are using Vietnamese collation you are not getting back the rows. You can specify the collation in your query quite easily though and it will return the rows you are looking for.
select *
from tbUsers
where nUserID collate SQL_Latin1_General_CP1_CI_AS like N'%p%';

How to update a table if a column exists in SQL Server?

I have a table MyTable created by
CREATE TABLE MyTable
(
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[Type] [int] NOT NULL,
[CreatedDate] [datetime] NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
)
I want to check if a column exists in my table, and if it does, I want to copy the data to a different column, then drop the old column, like this:
IF (SELECT COLUMNPROPERTY(OBJECT_ID('MyTable'), 'Timestamp', 'Precision')) IS NOT NULL
BEGIN
UPDATE [dbo].[MyTable]
SET [CreatedDate] = [Timestamp]
ALTER TABLE [dbo].[MyTable]
DROP COLUMN [Timestamp]
END
GO
However, when I try to run this I get an error:
Invalid column name 'Timestamp'
How can I accomplish what I'm trying to do?
This is a compilation issue.
If the table doesn't exist when the batch is compiled, everything works, because the statements referencing the table are subject to deferred compilation. For a preexisting table, however, you will hit this problem: SQL Server tries to compile all statements and balks at the nonexistent column.
You can push the code into a child batch so it is only compiled if that branch is hit.
IF (SELECT COLUMNPROPERTY(OBJECT_ID('MyTable'), 'Timestamp', 'Precision')) IS NOT NULL
BEGIN
EXEC('
UPDATE [dbo].[MyTable]
SET [CreatedDate] = [Timestamp]
ALTER TABLE [dbo].[MyTable]
DROP COLUMN [Timestamp]
')
END
GO
If you are just trying to rename the column
EXEC sys.sp_rename 'dbo.MyTable.[TimeStamp]' , 'CreatedDate', 'COLUMN'
Would be easier though (from a position where the CreatedDate column doesn't exist).
You have to first create the [Timestamp] column with an ALTER TABLE statement.
Then the rest should run.
EDIT based on comment (I know this info is duplicated elsewhere on SO, but I couldn't find it):
Ok, the IF condition in SQL Server unfortunately does not allow you to ignore code that does not parse. What is happening is that SQL Server is looking at your command, and parsing every statement to make sure it is valid.
When it does this, SQL Server isn't smart enough to figure out that the invalid statement (the UPDATE that requires the presence of [TimeStamp]) will not be reached if there is no [TimeStamp].
In other words, you can't write a SQL command that expects a column that doesn't exist EVEN IF you nest that command in an IF condition that won't be reached. SQL Server will parse the entire statement and not allow it to run BEFORE it tests the IF condition.
A commonly used workaround for this is dynamic SQL, which SQL Server can't pre-parse, so it won't try.
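A minimal sketch of that workaround follows. It swaps in COL_LENGTH for the existence check (it returns NULL when the column does not exist) and sys.sp_executesql for EXEC; both are substitutions for illustration, not what the answers above used.

```sql
-- COL_LENGTH returns NULL when the table or column does not exist.
IF COL_LENGTH(N'dbo.MyTable', N'Timestamp') IS NOT NULL
BEGIN
    -- Each string is compiled only when it is executed, so the outer batch
    -- compiles even if [Timestamp] is missing.
    EXEC sys.sp_executesql
        N'UPDATE dbo.MyTable SET CreatedDate = [Timestamp];';

    EXEC sys.sp_executesql
        N'ALTER TABLE dbo.MyTable DROP COLUMN [Timestamp];';
END;
GO
```

Because the check and the dynamic statements live in one batch, the script can be run repeatedly: on the second run the IF is false and nothing is executed.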

Querying a varbinary column in SQL Server

I have some issues with querying varbinary columns using the contains predicate (it only works on nvarchar/varchar but on the msdn documentation it is specified that it works on image/varbinary also)
I have this table
CREATE TABLE [dbo].[Documents]
(
[id] [int] IDENTITY(1,1) NOT NULL,
[title] [nvarchar](100) NOT NULL,
[doctype] [nchar](4) NOT NULL,
[docexcerpt] [nvarchar](1000) NOT NULL,
[doccontent] [varbinary](max) NOT NULL,
CONSTRAINT [PK_Documents]
PRIMARY KEY CLUSTERED ([id] ASC)
)
doctype - document type (format)
docexcerpt - small fragment of the document
doccontent - whole document stored in varbinary
Code:
INSERT INTO dbo.Documents (title, doctype, docexcerpt, doccontent)
SELECT
N'Columnstore Indices and Batch Processing',
N'docx',
N'You should use a columnstore index on your fact tables, putting all columns of a fact table in a columnstore index. In addition to fact tables, very large dimensions could benefit from columnstore indices as well. Do not use columnstore indices for small dimensions. ',
bulkcolumn
FROM
OPENROWSET(BULK 'myUrl', SINGLE_BLOB) AS doc;
I have installed the Microsoft Office 2010 Filter Packs and registered them in SQL Server and checked if what I need (.docx) is installed using
SELECT document_type, path
FROM sys.fulltext_document_types;
The .docx type is listed in the output.
My issue is that my CONTAINS query on doccontent doesn't return anything.
As an observation, I have created a fulltext catalog and index on my table using the following code(s), making both docexcerpt and doccontent index-able columns
--fulltext index
CREATE FULLTEXT INDEX ON dbo.Documents
(
docexcerpt Language 1033,
doccontent TYPE COLUMN doctype Language 1033
STATISTICAL_SEMANTICS
)
KEY INDEX PK_Documents
ON DocumentsFtCatalog
WITH STOPLIST = SQLStopList,
SEARCH PROPERTY LIST = WordSearchPropertyList,
CHANGE_TRACKING AUTO;
I'm not sure what I am doing wrong or missing. I'd appreciate any help. Thanks
I've managed to 'solve' the mystery. I forgot that I had to re-insert my documents into the table (after editing them) for my queries to work properly. Can't believe I've been so numb.
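The query from the question was not preserved, so the following is only a sketch of what a check might look like. The search term 'columnstore' is an assumption taken from the sample document title. The first statement reports whether the catalog is still populating, and the second searches the binary column.

```sql
-- PopulateStatus 0 means the catalog is idle (no population in progress).
SELECT FULLTEXTCATALOGPROPERTY(N'DocumentsFtCatalog', 'PopulateStatus')
       AS PopulateStatus;

-- Search the binary document body. The doctype column named in
-- TYPE COLUMN tells the full-text engine which filter to use for each row.
SELECT id, title
FROM dbo.Documents
WHERE CONTAINS(doccontent, N'columnstore');
```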

Sql server query using function and view is slower

I have a table with a xml column named Data:
CREATE TABLE [dbo].[Users](
[UserId] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](max) NOT NULL,
[LastName] [nvarchar](max) NOT NULL,
[Email] [nvarchar](250) NOT NULL,
[Password] [nvarchar](max) NULL,
[UserName] [nvarchar](250) NOT NULL,
[LanguageId] [int] NOT NULL,
[Data] [xml] NULL,
[IsDeleted] [bit] NOT NULL,...
In the Data column there's this xml
<data>
<RRN>...</RRN>
<DateOfBirth>...</DateOfBirth>
<Gender>...</Gender>
</data>
Now, executing this query:
SELECT UserId FROM Users
WHERE Data.value('(/data/RRN)[1]', 'nvarchar(max)') = @RRN
after clearing the cache takes (if I execute it a couple of times after each other) 910, 739, 630, 635, ... ms.
Now, a db specialist told me that adding a function and a view and changing the query would make it much faster to search for a user with a given RRN. Instead, these are the results when I execute with the changes from the db specialist: 2584, 2342, 2322, 2383, ...
This is the added function:
CREATE FUNCTION dbo.fn_Users_RRN(@data xml)
RETURNS nvarchar(100)
WITH SCHEMABINDING
AS
BEGIN
RETURN @data.value('(/data/RRN)[1]', 'varchar(max)');
END;
The added view:
CREATE VIEW vwi_Users
WITH SCHEMABINDING
AS
SELECT UserId, dbo.fn_Users_RRN(Data) AS RRN from dbo.Users
Indexes:
CREATE UNIQUE CLUSTERED INDEX cx_vwi_Users ON vwi_Users(UserId)
CREATE NONCLUSTERED INDEX cx_vwi_Users__RRN ON vwi_Users(RRN)
And then the changed query:
SELECT UserId FROM Users
WHERE dbo.fn_Users_RRN(Data) = @RRN
Why is the solution with a function and a view going slower?
The point of the view was to pre-compute the XML value into a regular column. To use that precomputed value through the index on the view, shouldn't you actually query the view?
SELECT
UserId
FROM vwi_Users
WHERE RRN= '59021626919-61861855-S_FA1E11'
Also, make the index this:
CREATE NONCLUSTERED INDEX cx_vwi_Users__RRN ON vwi_Users(RRN) INCLUDE (UserId)
This is called a covering index, since all columns needed by the query are in the index.
Have you tried adding that function result to your table (not a view) as a persisted computed column?
ALTER TABLE dbo.Users
ADD RRN AS dbo.fn_Users_RRN(Data) PERSISTED
Doing so will extract that piece of information from the XML, store it in a computed, always up-to-date column, and the persisted flag makes it physically stored along side the other columns in your table.
If this works (the PERSISTED flag is a bit iffy in terms of all the limitations it has), then you should see nearly the same performance as querying any other string field on your table... and if the computed column is PERSISTED, you can even put an index on it if you feel the need for that.
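Once such a column exists, indexing and querying it might look like the sketch below. It assumes the persisted computed column is named RRN; the index name IX_Users_RRN is hypothetical. The sample value is the one used in another answer.

```sql
-- Assumes dbo.Users already has a persisted computed column named RRN
-- defined as dbo.fn_Users_RRN(Data).
CREATE NONCLUSTERED INDEX IX_Users_RRN
    ON dbo.Users (RRN);

-- Filter on the stored column rather than calling the function per row.
-- The variable type matches the function's nvarchar(100) return type.
DECLARE @RRN nvarchar(100) = N'59021626919-61861855-S_FA1E11';

SELECT UserId
FROM dbo.Users
WHERE RRN = @RRN;
```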
Check the query execution plan and confirm whether or not the new query is even using the view. If the query doesn't use the view, that's the problem.
How does this query fare?
SELECT UserId FROM vwi_Users
WHERE RRN = '59021626919-61861855-S_FA1E11'
I see you're freely mixing nvarchar and varchar. Don't do that! It can cause implicit conversions that force a full index scan (eeeeevil).
Scalar functions tend to perform very poorly in SQL Server. I'm not sure why if you make it a persisted computed column and index it, it doesn't have identical performance to a normal indexed-column, but it may be due to the UDF being called even though you think it's no longer needed to be called once the data is computed.
I think you know this from another answer, but your final query is wrongly calling the scalar UDF on every row (defeating the point of persisting the computation):
SELECT UserId FROM Users
WHERE dbo.fn_Users_RRN(Data) = @RRN
It should be
SELECT UserId FROM vwi_Users
WHERE RRN = @RRN
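One more thing worth checking, depending on the SQL Server edition: outside Enterprise edition the optimizer does not consider the indexes on an indexed view unless the query asks for them with the NOEXPAND hint. A minimal sketch, assuming the view was created in the dbo schema:

```sql
-- The variable type matches the nvarchar(100) returned by the function,
-- which avoids an implicit conversion on the indexed column.
DECLARE @RRN nvarchar(100) = N'59021626919-61861855-S_FA1E11';

-- NOEXPAND tells the optimizer to treat the view as a table with its own
-- indexes instead of expanding it back into the base query.
SELECT UserId
FROM dbo.vwi_Users WITH (NOEXPAND)
WHERE RRN = @RRN;
```

The execution plan should then show a seek on cx_vwi_Users__RRN rather than a scan of Users with the function evaluated on every row.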

Resources