SQL error on inserting UTF8 string into SQL Server 2008 table - sql-server

I'm having problems inserting strings that contain UTF-8 encoded Chinese characters and punctuation into a SQL Server 2008 table (default installation) from my Delphi 7 application, using the Zeosdb native SQL Server library.
I remember having problems inserting UTF-8 strings into SQL Server in the past, even with PHP and other methods, so I believe this problem is not unique to Zeosdb.
It doesn't happen all the time: some UTF-8 encoded strings are inserted successfully and some are not. I can't figure out what it is in the string that causes the failure.
Table schema:
CREATE TABLE [dbo].[incominglog](
[number] [varchar](50) NULL,
[keyword] [varchar](1000) NULL,
[message] [varchar](1000) NULL,
[messagepart1] [varchar](1000) NULL,
[datetime] [varchar](50) NULL,
[recipient] [varchar](50) NULL
) ON [PRIMARY]
SQL statement template:
INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient])
VALUES('{N}', '{KEYWORD}', '{M}', '{M1}', '{TIMESTAMP}', '{NAME}')
The parameters {KEYWORD}, {M} and {M1} can contain UTF-8 strings.
For example, the following statement will return an error:
Incorrect syntax near 'é¢'. Unclosed quotation mark after the character string '全力克æœå››ç§å±é™©','2013-06-19 17:07:28','')'.
INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient])
VALUES('+6590621005', '题', '题 [全力克æœå››ç§å±é™© åšå†³æ‰«é™¤ä½œé£Žä¹‹å¼Š]', '[全力克æœå››ç§å±é™©','2013-06-19 17:07:28', '')
Note: Please ignore the actual characters as the utf8 encoding is lost after copy and paste.
I've also tried using NVARCHAR instead of VARCHAR:
CREATE TABLE [dbo].[incominglog](
[number] [varchar](50) NULL,
[keyword] [nvarchar](max) NULL,
[message] [nvarchar](max) NULL,
[messagepart1] [nvarchar](max) NULL,
[datetime] [varchar](50) NULL,
[recipient] [varchar](50) NULL
) ON [PRIMARY]
And also tried amending the SQL statement into:
INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('{N}',N'{KEYWORD}',N'{M}',N'{M1}','{TIMESTAMP}','{NAME}')
They don't work either. I would appreciate any pointer. Thanks.
EDITED: As indicated by marc_s below, the N prefix must be outside the single quotes. It is correct in my actual test, the initial statement is a typo, which I've corrected.
The test with the N prefix also returned an error:
Incorrect syntax near '原标é¢'. Unclosed quotation mark after the
character string '全力克�四��险','2013-06-19
21:22:08','')'.
The SQL statement:
INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('+6590621005',N'原标题',N'原标题 [全力克�四��险 �决扫除作风之弊]',N'[全力克�四��险','2013-06-19','')
.
.
REPLY TO gbn's Answer: I've tried using parameterized SQL but still encountering "Unclosed quotation mark after the character string" error.
For the new test, I used a simplified SQL statement:
INSERT INTO INCOMINGLOG ([Keyword],[Message]) VALUES(:KEYWORD,:M)
The error returned for the above statement:
Incorrect syntax near '原标é¢'. Unclosed quotation mark after the
character string '')'.
For info, the values of KEYWORD and M are:
KEYWORD:原标题
M:原标题 [
.
.
.
Further tests on 20th June: Parameterized SQL queries didn't work, so I tried a different approach and attempted to isolate the character that caused the error. After some trial and error, I managed to identify the problematic character.
The following character produces an error: 题
SQL Statement: INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题')
Interestingly, note that the string in the returned error text contains a "?" character which didn't exist in the original statement.
Error: Unclosed quotation mark after the character string '�)'. Incorrect syntax near '�)'.
If I were to place some latin characters immediately after the culprit character, there will be no error. For example, INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题Ok') works ok. Note: It doesn't work with all characters.
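For reference, the fragment 'é¢' in the error messages above is what the UTF-8 bytes of 题 look like when they are displayed as single-byte text. The following is a minimal Python sketch of that observation, not a diagnosis of the Delphi/Zeosdb code; Python and the choice of Windows-1252 as the display code page are assumptions made purely for illustration.

```python
# Illustration only: look at the UTF-8 bytes of the character from the failing statement.
ch = "题"  # U+9898
utf8_bytes = ch.encode("utf-8")
print(utf8_bytes.hex())  # e9a298

# Assumption: the three bytes are being displayed as Windows-1252 text.
as_cp1252 = utf8_bytes.decode("cp1252")
print(as_cp1252)  # starts with the same 'é¢' seen in the error message
```

If this is what is happening, the server is receiving three separate single-byte characters rather than one Chinese character, which would suggest the string is never converted to the encoding the driver or column expects.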

There are ' characters in the UTF-8 which abnormally terminate the SQL.
Classic SQL injection.
Use proper parametrisation, not string concatenation basically.
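As a rough sketch of what parametrisation means in general: the values travel separately from the SQL text, so quote characters inside them cannot end a string literal. The example below uses Python's built-in sqlite3 module only as a stand-in so that it runs anywhere; it is not SQL Server or Zeosdb, and the cut-down table and sample message are assumptions for illustration. Note that the question reports the Zeosdb parameterised attempt still failing, so this shows the technique, not a confirmed fix for that driver.

```python
import sqlite3

# Stand-in database: sqlite3 is used only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incominglog (keyword TEXT, message TEXT)")

keyword = "原标题"
message = "原标题 [it's quoted]"  # deliberately contains a single quote

# The values are bound as parameters and never spliced into the SQL text.
conn.execute(
    "INSERT INTO incominglog (keyword, message) VALUES (?, ?)",
    (keyword, message),
)

rows = conn.execute("SELECT keyword, message FROM incominglog").fetchall()
print(rows)
```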
Edit, after Question updates...
Without the Delphi code, I don't think we can help you
All SQL side code works. For example, this works in SSMS
DECLARE @t TABLE ([Keyword] nvarchar(100) COLLATE Chinese_PRC_CI_AS);
INSERT INTO @t ([Keyword]) VALUES('题');
INSERT INTO @t ([Keyword]) VALUES(N'题');
SELECT * FROM @t T;
Something is missing that we need in order to help fix this.
Also see
How to store UTF-8 bytes from a C# String in a SQL Server 2000 TEXT column
Having trouble with UTF-8 storing in NVarChar in SQL Server 2008
Write utf-8 to a sql server Text field using ADO.Net and maintain the UTF-8 bytes

Related

"CREATE TABLE" statement gives error: unexpected ")" after datatype

Given this SQL:
CREATE TABLE dbo.dtproperties
( [id] int NOT NULL
, [objectid] int NULL
, [property] varchar(64) NOT NULL
, [value] varchar(255) NULL
, [uvalue] nvarchar(510) NULL
, [lvalue] image(16) NULL
, [version] int NOT NULL
)
When I run this in an actual shell or through an online SQL syntax checker (sql-format.com), it expects a closing parenthesis at 8,20, which is located in the whitespace between int and NOT NULL.
I'm not actually writing this SQL by hand - it comes from a structure dump from a Ruby on Rails ActiveRecord database connection. The legacy database it's connected to dumps fine, but syntactically does not check out. I've had to programmatically wrap all column names in square brackets because of how often reserved keywords are used as column names. So whatever this issue might be, ideally I'll be able to solve it programmatically.
image(16) should be just image, as the image datatype has never taken a length/size argument.
Ideally you want to change all image datatypes to varbinary(max), as image won't be supported in future versions.
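Since the question asks for a programmatic fix, one possible approach is a small text rewrite over the dumped DDL. The sketch below is in Python rather than Ruby and is only an illustration: the helper name `fix_image_columns` is hypothetical, and it assumes the type always appears unbracketed as `image(<digits>)`, as in the dump shown in the question.

```python
import re

# Matches an IMAGE type that carries an (invalid) length, e.g. "image(16)".
_IMAGE_WITH_SIZE = re.compile(r"\bimage\s*\(\s*\d+\s*\)", re.IGNORECASE)

def fix_image_columns(ddl: str) -> str:
    """Hypothetical helper: replace image(n) with varbinary(max) in DDL text."""
    return _IMAGE_WITH_SIZE.sub("varbinary(max)", ddl)

line = ", [lvalue] image(16) NULL"
print(fix_image_columns(line))  # , [lvalue] varbinary(max) NULL
```

Other column definitions such as varchar(64) are left untouched because the pattern only matches the image keyword followed by a parenthesised number.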

SSIS Ole DB Source as Stored procedure from linked server with parameters

I am using stored procedures from a linked server, and I want to use one of them in an OLE DB Source.
I wrote the following query, which works in SSIS.
select ID,
LAST_NAME_ENG,
LAST_NAME_G,
FST_NAME_ENG,
FST_NAME_G,
BIRTHDATE
from openquery (linkedserver,
'exec [linkedserver].get_records @SESSION_ID = 12 , @SYSTEM = ''oCRM'', @ENTITY_NAME = ''CLIENT''
WITH RESULT SETS (([ID] [int] NOT NULL,
[LAST_NAME_ENG] [varchar](50) NOT NULL,
[LAST_NAME_G] [varchar](50) NOT NULL,
[FST_NAME_ENG] [varchar](50) NOT NULL,
[FST_NAME_G] [varchar](50) NOT NULL,
[BIRTHDATE] [date] NOT NULL))');
I can use it in an SSIS OLE DB Source and successfully get the required data. But the next step is where the problem is:
I need to pass a value for @SESSION_ID from SSIS instead of the hard-coded 12, and I cannot find the right way to do it.
A lot of the advice I found suggests using dynamic SQL: construct the full query string with the required parameter values and then exec it. But if I do that, SSIS cannot get the column metadata from the dynamic query.
Is there a way to solve this?
Any ideas will be helpful.
Thank you.
With regards, Yuriy.
Create a string variable, say SQL_Query. In the variable definition, set EvaluateAsExpression and define the expression as "your SQL statement ... @SESSION_ID = " + [User::Session_ID_Variable] + " rest of SQL statement", where Session_ID_Variable holds the value you want to pass. If Session_ID_Variable is not a string, you have to cast it to a string with (DT_WSTR, length). The result in SQL_Query will be your target SQL statement.
Then, in the OLE DB Source, set the SQL command to come from a variable and select [User::SQL_Query].
The stored procedure has to return a result set of the same format in all cases. A stored procedure that returns no result set will make the data source fail.
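The string assembly that the SSIS expression performs can be mimicked outside SSIS to check what the final statement looks like. The following Python sketch is only an illustration of that concatenation, not SSIS expression syntax; the helper name `build_get_records_query` is hypothetical, and the column list and WITH RESULT SETS clause from the question are shortened.

```python
def build_get_records_query(session_id: int) -> str:
    """Hypothetical helper: return the OPENQUERY text with the session id spliced in."""
    sid = int(session_id)  # raises ValueError for non-numeric text
    inner = (
        "exec [linkedserver].get_records "
        f"@SESSION_ID = {sid}, @SYSTEM = ''oCRM'', @ENTITY_NAME = ''CLIENT''"
    )
    # Column list and WITH RESULT SETS clause are omitted in this sketch.
    return f"select ID from openquery (linkedserver, '{inner}')"

print(build_get_records_query(12))
```

Converting the value to an integer before splicing it in plays the same role as the (DT_WSTR, length) cast on a numeric SSIS variable: only a number can end up inside the statement text.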

How can I bulk load this file into SQL Server 2012?

I'm trying to import some data from the popular GeoNames site into SQL Server. It's a tab delimited text file. I didn't think there would be a problem but whatever I do, I just get an error message which says:
The bulk load failed. The column is too long in the data file for row 1, column 4. Verify that the field terminator and row terminator are specified correctly.
This is the file I'm trying to import:
http://download.geonames.org/export/dump/admin2Codes.txt
...and this is my table:
CREATE TABLE [Admin2Codes](
[code] [VARCHAR](20) NOT NULL,
[name] [NVARCHAR](200) NOT NULL,
[asciiname] [NVARCHAR](200) NOT NULL,
[geonameId] [INT] NOT NULL
)
I can't spot what the problem is. It works if I only have one row in the file, but as soon as there's more than one row, it fails. The line endings in the file appear to be \n and that matches my SQL:
BULK INSERT dbo.Admin2Codes FROM 'D:\admin2codes.txt'
WITH(
DATAFILETYPE = 'widechar',
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n'
)
GO
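It looks like the data uses a bare line feed as the row terminator, as is usual on UNIX, instead of the carriage return plus line feed used on Windows. Try this instead:
ROWTERMINATOR = '''+CHAR(10)+'''
Note that this quoting only makes sense when the BULK INSERT statement is itself built as a dynamic SQL string and run with EXEC. A commonly used alternative in a static statement is the hex form, ROWTERMINATOR = '0x0a'.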
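Before changing the terminator, it can help to confirm which line endings the file really contains. The following Python sketch counts them in a byte string; the helper name `count_line_endings` and the sample rows are made up for illustration, and the counting assumes a single-byte or UTF-8 encoded file (in a UTF-16 file the terminators are two bytes each).

```python
def count_line_endings(data: bytes) -> dict:
    """Hypothetical helper: count CRLF and bare LF terminators in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf  # every CRLF also contains one LF
    return {"crlf": crlf, "lf_only": lf_only}

# Made-up sample rows in the same tab-delimited shape as the table.
sample = b"XX.01\tName A\tName A\t1\n" b"XX.02\tName B\tName B\t2\n"
print(count_line_endings(sample))  # {'crlf': 0, 'lf_only': 2}
```

For a real file, the bytes can be read with `open(path, "rb").read()` and passed to the same function.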

SSIS Lookup transformation including whitespace

For some reason, the SSIS Lookup transformation seems to be checking the cache for a NCHAR(128) value instead of a NVARCHAR(128) value. This results in a whole bunch of appended whitespace on the value being looked up and causes the lookup to fail to find a match.
On the Lookup transformation, I configured it to have No Cache so that it always goes to the database so I could trace with SQL Profiler and see what it was looking up. This is what it captured (notice the whitespace ending at the single quote on the second last line - requires horizontal scrolling):
exec sp_executesql N'
select *
from (
SELECT SurrogateKey, NaturalKey, SomeInt
FROM Dim_SomeDimensionTable
) [refTable]
where [refTable].[NaturalKey] = @P1
and [refTable].[SomeInt] = @P2'
,N'@P1 nchar(128)
,@P2 smallint'
,N'VALUE '
,8
Here's the destination table's schema:
CREATE TABLE [dbo].[dim_SomeDimensionTable] (
[SurrogateKey] [int] IDENTITY(1,1) NOT NULL,
[NaturalKey] [nvarchar](128) NOT NULL,
[SomeInt] [smallint] NOT NULL
)
What I am trying to figure out is why SSIS is checking the NaturalKey value as NCHAR(128) and how I can get it to perform the lookup as NVARCHAR(128) without the whitespace.
Things I've tried:
I have LTRIM() and RTRIM() on the SQL Server source query.
Before the Lookup, I have used a Derived Column transformation to add a new column with the original value TRIM()'d (this trimmed column is the one I'm passing to the Lookup transformation).
Before and after the Lookup, I multicasted the rows and sent them to a unicode Flat File Destination and there was no white space in either case.
Before the lookup, I looked at the metadata on the data flow path and it shows the value as having data type DT_WSTR with length 128.
Any ideas would be greatly appreciated!
It doesn't make any difference.
You need to look elsewhere for the source of your problem (perhaps the column has a case sensitive collation for example).
Trailing white space is only significant to SQL Server in LIKE comparisons, not = comparisons as documented here.
SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2,
<Comparison Predicate>, General rules #3) on how to compare strings
with spaces. The ANSI standard requires padding for the character
strings used in comparisons so that their lengths match before
comparing them. The padding directly affects the semantics of WHERE
and HAVING clause predicates and other Transact-SQL string
comparisons. For example, Transact-SQL considers the strings 'abc' and
'abc ' to be equivalent for most comparison operations.
The only exception to this rule is the LIKE predicate...
You can also easily see this by running the below.
USE tempdb;
CREATE TABLE [dbo].[Dim_SomeDimensionTable] (
[SurrogateKey] [int] IDENTITY(1,1) NOT NULL,
[NaturalKey] [nvarchar](128) NOT NULL,
[SomeInt] [smallint] NOT NULL
)
INSERT INTO [dbo].[Dim_SomeDimensionTable] VALUES ('VALUE',8)
exec sp_executesql N'
select *
from (
SELECT SurrogateKey, NaturalKey, SomeInt
FROM Dim_SomeDimensionTable
) [refTable]
where [refTable].[NaturalKey] = @P1
and [refTable].[SomeInt] = @P2'
,N'@P1 nchar(128)
,@P2 smallint'
,N'VALUE '
,8
Which returns the single row
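The padding rule quoted above can be mimicked in a few lines to see why the nchar(128) parameter still matches. This Python sketch is only an emulation of the rule as described, not SQL Server's implementation; the function name `sql_equals` is hypothetical, and collation and case sensitivity are ignored.

```python
def sql_equals(a: str, b: str) -> bool:
    """Hypothetical emulation of '=': pad the shorter string with spaces, then compare."""
    width = max(len(a), len(b))
    return a.ljust(width) == b.ljust(width)

stored = "VALUE"                 # nvarchar value in the table
parameter = "VALUE".ljust(128)   # the same value padded as nchar(128)

print(sql_equals(stored, parameter))  # True
print(stored == parameter)            # False: a plain comparison sees the padding
```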

SQL Server 6.5 deadlocks with Spanish letters ú and ü

We're running SQL 6.5 through ADO and we have the oddest problem.
This statement will start generating deadlocks
insert clinical_notes ( NOTE_ID, CLIENT, MBR_ID, EPISODE, NOTE_DATE_TIME,
NOTE_TEXT, DEI, CARE_MGR, RELATED_EVT_ID, SERIES, EAP_CASE, TRIAGE, CATEGORY,
APPOINTMENT, PROVIDER_ID, PROVIDER_NAME )
VALUES ( 'NTPR3178042', 'HUMANA/PR', '999999999_001', 'EPPR915347',
'03-28-2011 11:25', 'We use á, é, í, ó, ú and ü (this is the least one we
use, but there''s a few words with it, like the city: Mayagüez).', 'APK', 'APK',
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL )
The trigger is the characters ú and ü, whenever they appear in the NOTE_TEXT column.
NOTE_TEXT is a text column.
There are indexes on
UNC_not_id
NT_CT_MBR_NDX
NT_REL_EVT_NDX
NT_SERIES_NDX
idx_clinical_notes_date_time
nt_ep_idx
NOTE_ID is the primary key.
What happens is after we issue this statement, if we issue an identical one, but with a new NOTE_ID value, we receive the deadlock.
As mentioned, this only happens when ú or ü is in NOTE_TEXT.
This is a test server and there is generally only one session accessing this table when the error occurs.
I'm sure it has something to do with character sets and such, but for the life of me I can't work it out.
Is the column (var)char-based or n(var)char-based? Are the values using Unicode code points above 255, or single-byte codes of 255 or below (250 and 252 for ú and ü)?
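The two code values mentioned here are easy to check. This small Python sketch is only an illustration; it uses Latin-1 as the single-byte encoding, which is an assumption about the character set in play, not something stated in the question.

```python
# Print the code point and the single-byte Latin-1 encoding of each character.
for ch in "úü":
    print(f"U+{ord(ch):04X}", ord(ch), ch.encode("latin-1"))
```

Both characters fit in one byte under that assumption, so they can be stored in a (var)char or text column without needing an n(var)char type.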
Try changing the column to a binary collation, just to see if that helps (it may shed light on the issue). I do NOT know if this works in SQL 2000 (though I can check on Monday), but you can try this to find what collations are available on your server:
SELECT * FROM ::fn_helpcollations()
Latin General BIN should be in there somewhere.
Assuming you find a collation to try, you change the collation like so:
ALTER TABLE TableName ALTER COLUMN ColumnName varchar(8000) NOT NULL COLLATE Collation_Name_Here
Script out your table to learn the collation it's using now so you can set it back if that doesn't work or causes problems. Or use a backup. :)
One additional note is that if you're using unicode you do need an N before literal strings, for example:
SELECT N'String'
