Insert and select Chinese words from an Oracle database

I am trying to insert Chinese words into an Oracle database, but they fail to display properly. Here are my database settings:
SQL> select * FROM nls_database_parameters where parameter='NLS_NCHAR_CHARACTERSET';

PARAMETER                      VALUE
------------------------------ ---------
NLS_NCHAR_CHARACTERSET         AL16UTF16

SQL> select * FROM nls_database_parameters where parameter='NLS_CHARACTERSET';

PARAMETER                      VALUE
------------------------------ ---------
NLS_CHARACTERSET               AL32UTF8
The Chinese characters show up as ".." when I copy/paste them into SQL*Plus to run the query.
The version I am using is Oracle Database 11g Release 11.2.0.3.0 - 64bit.
Any hints?

There can be several issues:
1) The characters cannot be stored in the database.
2) The characters may be converted to something else, depending on the client language settings.
3) The tool you are working with cannot display the characters.
--
1) Check the byte values of the character as it is stored in the table:
select dump(column) from table where ...
The dump function shows the byte values, so you can check whether the correct bytes are stored in the column (see the sketch after this list).
2) Check the NLS_LANG setting on your client.
3) If the correct bytes are stored in the table and they are not being converted to something else due to an NLS_LANG mismatch between server and client, then maybe your tool cannot display the characters correctly. I'm pretty sure SQL*Plus has a very limited interface; maybe SQL Developer can display a wider range of characters.
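A minimal sketch of checks 1) and 2); the table and column names (projects, name) are hypothetical, and the expected bytes assume an AL32UTF8 database:

-- 1) Inspect the stored bytes; DUMP format 1016 prints hex plus the character set name
SELECT DUMP(name, 1016) FROM projects WHERE id = 1;
-- A correctly stored '中' should show something like:
-- Typ=1 Len=3 CharacterSet=AL32UTF8: e4,b8,ad

-- 2) On a Windows client, set NLS_LANG to match the terminal before starting SQL*Plus:
-- set NLS_LANG=SIMPLIFIED CHINESE_CHINA.ZHS16GBK    (GBK console, chcp 936)
-- set NLS_LANG=AMERICAN_AMERICA.AL32UTF8            (UTF-8 terminal, chcp 65001)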

Related

characters appearing incorrectly even with Unicode source and destination (SSIS)

I am having a codepage Unicode/non-Unicode problem and need expertise to understand it.
In SSIS I am reading data in from a UTF-8 encoded text file. The data types are all DT_WSTR (Unicode string). The destination is NVARCHAR, which is also Unicode.
Non-standard characters such as Ú are not being encoded correctly (appearing as a black box or question mark).
If the character appears correctly in the input file, the source is set to DT_WSTR, and the destination is nvarchar, why is the character not rendering correctly?
I have tried setting the codepage of the source column to 65001, but in SSIS it is only possible to change the codepage on a STR (non-Unicode) type.
I'd appreciate any help in understanding why all-Unicode fields still can't store a Unicode value correctly.
Update from the OP's comments
It seems my output is OK if I use Unicode types end to end (the input is DT_WSTR, the destination column is nvarchar, and when extracting again to text, the output column is DT_WSTR). The only issue is SQL Server Management Studio, which does not seem to be able to render Unicode characters correctly in the results of a query, whether the output is set to grid or text. This is a red herring, and the process overall works without issue if it is ignored.
Trying to figure out the issue
There is no problem importing Unicode characters from flat files to a SQL Server destination; the only things you have to do are set the flat file encoding to Unicode and make the destination columns NVARCHAR. Based on your question, it looks like you have met these requirements, so I can say that:
Unicode characters are imported successfully into SQL Server, but for some reason SQL Server Management Studio cannot show Unicode characters in grid results. To check that the data was imported correctly, change the result view to Results To Text:
Go to Tools >> Options >> Query Results >> Results To Text
In the second reference link I provided, they mention that:
If you use SSMS for your queries, change the output type from "Grid" to "Text", because depending on the font the grid can't show Unicode.
Or you can try changing the grid results font (on my machine I use the Tahoma font and it shows Unicode characters normally).
Experiments
You can perform the following test (taken from the links below)
SET NOCOUNT ON;

CREATE TABLE #test
( id  int IDENTITY(1, 2) NOT NULL PRIMARY KEY
, Uni nvarchar(20) NULL);

INSERT INTO #test (Uni) VALUES (N'DE: äöüßÖÜÄ');
INSERT INTO #test (Uni) VALUES (N'PL: śćźłę');
INSERT INTO #test (Uni) VALUES (N'JAP: 言も言わずに');
INSERT INTO #test (Uni) VALUES (N'CHN: 玉王瓜瓦甘生用田由疋');

SELECT * FROM #test;
GO
DROP TABLE #test;
Try the query above using both the Results to Grid and Results to Text options.
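A font-independent way to verify what was actually stored (a small sketch; run it before the DROP TABLE #test line): UNICODE() returns the code point of a character, and converting to varbinary shows the raw UCS-2 bytes held in the nvarchar column, regardless of how the grid renders them.

SELECT id,
       Uni,
       UNICODE(SUBSTRING(Uni, 6, 1)) AS codepoint_of_6th_char,
       CONVERT(varbinary(40), Uni)   AS stored_bytes
FROM #test;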
References
SQL Server 2012 not showing unicode character in results
sql server 2008 not showing and inserting unicode characters!
Import UTF-8 Unicode Special Characters with SQL Server Integration Services
Microsoft SQL Server Management Studio - query result as text

Missing nvarchar columns when reading SQL Server database table from Oracle

I have a SQL Server database with a table that has a column of the nvarchar(4000) data type. When I try to read the data from Oracle through a dblink, I don't see the nvarchar(4000) column. All the other columns' data is displayed properly.
Can anyone help me to find the issue here and how to fix it?
Appendix A-1 ...

ODBC              Oracle    Comment
SQL_WCHAR         NCHAR     -
SQL_WVARCHAR      NVARCHAR  -
SQL_WLONGVARCHAR  LONG      only if the Oracle DB character set is Unicode; otherwise not supported
Commonly nvarchar(max) is mapped to SQL_WLONGVARCHAR, and this data type can only be mapped to Oracle if the Oracle database character set is Unicode.
To check the database character set, please execute:
select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
UPDATE
NLS_CHARACTERSET needs to be a Unicode character set, for example AL32UTF8. (Do this only if you know what you are doing, or ask your DBA to do it.)
The NCHAR character set isn't used here, as the mapping is to Oracle LONG, which uses the normal database character set.
A second solution would be to create, on the SQL Server side, a view that splits the nvarchar(max) into several nvarchar(xxx) columns, then select from the view and concatenate the content again in Oracle, as sketched below. (If you have problems changing the character set to Unicode, this approach is the best way to go.)
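A hypothetical sketch of that view; the table dbo.Docs, key DocId, column Body, and the dblink name mssql_link are assumptions. Each slice stays within the nvarchar(4000) limit that the gateway can map:

CREATE VIEW dbo.Docs_ForOracle AS
SELECT DocId,
       CAST(SUBSTRING(Body,    1, 4000) AS nvarchar(4000)) AS Body1,
       CAST(SUBSTRING(Body, 4001, 4000) AS nvarchar(4000)) AS Body2
FROM dbo.Docs;

Then on the Oracle side, concatenate the pieces again:

SELECT "DocId", "Body1" || "Body2" AS body FROM "Docs_ForOracle"@mssql_link;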

character encoding for SQL Server to Oracle linked server

This linked server worked fine before we upgraded from SQL Server 2005 to 2008R2, but now it throws this error when querying from certain tables (it still works for other tables):
"linked server "PROD" reported an error. The provider did not give any information about the error...Cannot fetch a row from OLE DB provider "OraOLEDB.Oracle" for linked server "PROD".
I can narrow the problem to one row and when I run this query for that row I get a different error:
select * from openquery( PROD, 'SELECT ID, NAME FROM ITEMS WHERE ID = 5437')
Error:
OLE DB provider "OraOLEDB.Oracle" for linked server "PROD" returned message "01".
OLE DB provider "OraOLEDB.Oracle" for linked server "PROD" returned message "ORA-29275: partial multibyte character".
And I can query the offending NAME column as a DUMP, like this:
select * from openquery( PROD, 'SELECT DUMP(NAME) FROM ITEMS WHERE ID = 5437')
Which returns:
Typ=1 Len=16: 77,73,88,84,69,67,79,32,68,69,32,84,73,68,65,193
then rebuild using SELECT CHAR(77) + CHAR(73) + ..., and I get "MIXTECO DE TIDAÁ". The bottom line, it seems, is that CHAR(193) in the Oracle data is causing my query to fail. But how can I fix it?
Oracle (https://forums.oracle.com/forums/thread.jspa?threadID=551784) provides this mysterious clue:
ORA-29275: partial multibyte character
Cause: The requested read operation could not complete because a partial multibyte character was found at the end of the input.
Action: Ensure that the complete multibyte character is sent from the remote server and retry the operation. Or read the partial multibyte character as RAW.
However, I don't know how to "Ensure..." and I don't know how to "read... as RAW".
SQL Server is a 64-bit version running on a 64-bit Windows Server 2008 R2 system and has the 64-bit Oracle 11gR2 client installed.
column in SQL: NAME nvarchar(60) NULL
column in Oracle: NAME varchar2(60)
In SQL, sp_helpsort returns:
Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data
In Oracle, the NLS_CHARACTERSET is: AL32UTF8
Any help re: why this is not working or how to get it working? Let me know if you need further info.
The byte 193 stored in the Oracle database is not valid in the UTF-8 character set. UTF-8 encodes the first 128 characters (0-127) using a single byte, but anything beyond 7-bit ASCII requires two or more bytes of storage. Whatever application inserted this data appears to be doing so incorrectly, most likely because it is misconfigured to bypass the character set conversion that is supposed to happen when data is transferred between the client and the database.
What language/ framework/ API is the application that inserted the data into the Oracle database using? What is the client NLS_LANG parameter?
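A quick way to see why a lone byte 193 trips ORA-29275 (a sketch, assuming an AL32UTF8 database): 'Á' is the single byte C1 in WE8MSWIN1252, but the two-byte sequence C3,81 in UTF-8, so a bare C1 is by definition a partial multibyte character:

SELECT DUMP('Á', 16)                                      AS utf8_bytes,  -- Typ=96 Len=2: c3,81
       DUMP(CONVERT('Á', 'WE8MSWIN1252', 'AL32UTF8'), 16) AS cp1252_byte  -- Typ=96 Len=1: c1
FROM dual;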

How can I insert binary file data into a binary SQL field using a simple insert statement?

I have a SQL Server 2000 with a table containing an image column.
How do I insert the binary data of a file into that column by specifying the path of the file?
CREATE TABLE Files
(
    FileId int,
    FileData image
)
I believe this would be somewhere close:
INSERT INTO Files
(FileId, FileData)
SELECT 1, * FROM OPENROWSET(BULK N'C:\Image.jpg', SINGLE_BLOB) rs
Something to note: the above runs in SQL Server 2005 and SQL Server 2008 with the data type as varbinary(max). It was not tested with image as the data type.
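If you are on 2005 or later anyway, a hedged variant using varbinary(max) and the documented BulkColumn name (the table FilesMax is hypothetical):

CREATE TABLE FilesMax (FileId int, FileData varbinary(max));

INSERT INTO FilesMax (FileId, FileData)
SELECT 1, rs.BulkColumn
FROM OPENROWSET(BULK N'C:\Image.jpg', SINGLE_BLOB) rs;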
If you mean using a literal, you simply have to create a binary string:
insert into Files (FileId, FileData) values (1, 0x010203040506)
And you will have a record with a six-byte value in the FileData field.
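A quick sanity check (assuming the Files table above): the literal stores exactly six bytes:

SELECT FileId, DATALENGTH(FileData) AS bytes_stored FROM Files;
-- returns 1, 6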
You indicate in the comments that you want to just specify the file name, which you can't do with SQL Server 2000 (or any other version that I am aware of).
You would need a CLR stored procedure to do this in SQL Server 2005/2008 or an extended stored procedure (but I'd avoid that at all costs unless you have to) which takes the filename and then inserts the data (or returns the byte string, but that can possibly be quite long).
In regards to the question of only being able to get data in via a stored procedure or query, I would say the answer is yes. If you give SQL Server the ability to read files from the file system, what happens when you aren't connected through Windows Authentication? Which user is used to determine the rights? And if you are running the service as an admin (God forbid), you can end up with an elevation of rights, which shouldn't be allowed.

How to convert Chinese characters to AL16UTF16 or WE8ISO8859P1?

I have inserted some Chinese characters into the database. (The column name is NAME; the data type is VARCHAR2.)
My project name is 中文版测试 and I need to select the project by this name.
But.
In the Oracle database, 中文版测试 is stored under the name ÖÐÎÄ°æ²âÊÔ (if I understand correctly, my database character set is WE8ISO8859P1).
I want to convert these characters from the database (ÖÐÎÄ°æ²âÊÔ) to Chinese characters (中文版测试), or to the same values, so I can compare them.
I tried this:
select DIRNAME from MILLENNIUM.PROJECTINFO where UPPER(convert(NAME, 'AL32UTF8', 'we8iso8859p1')) = UPPER(convert('中文版测试', 'WE8MSWIN1252', 'AL32UTF8'));
I need to compare the values from Oracle with the name of the project.
Oracle settings:
NLS_CHARACTERSET WE8ISO8859P1 0
NLS_NCHAR_CHARACTERSET AL16UTF16 0
As Michael O'Neill already pointed out, it is not possible to store Chinese characters in the character set WE8ISO8859P1. All unsupported characters are automatically replaced by ¿ (or some other placeholder).
BTW, WE8ISO8859P1 is different from WE8MSWIN1252 (see What is the exact difference between Windows-1252(1/3/4) and ISO-8859-1?), so your conversion does not work anyway.
The solution is to change the data type of column NAME to NVARCHAR2 or to migrate your database to UTF-8; see Character Set Migration and the Database Migration Assistant for Unicode Guide. In any case you should consider your data lost, or at least corrupted.
However, in case your client application was configured wrongly, then in certain circumstances it is possible to insert unsupported characters; see If we have US7ASCII characterset why does it let us store non-ascii characters?.
In such a case you can try to repair your data like this:
ALTER TABLE PROJECTINFO ADD NAME_CN NVARCHAR2(100);
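-- Reinterpret the stored bytes as ZHS16CGB231280 (GB2312) and convert them to Unicode NVARCHAR2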
UPDATE PROJECTINFO SET NAME_CN = UTL_I18N.RAW_TO_NCHAR(UTL_I18N.STRING_TO_RAW(NAME), 'ZHS16CGB231280');
ALTER TABLE PROJECTINFO DROP COLUMN NAME;
ALTER TABLE PROJECTINFO RENAME COLUMN NAME_CN TO NAME;
select DIRNAME from MILLENNIUM.PROJECTINFO where NAME = '中文版测试';
but it may not work for all of your data.
Hence a (not recommended) workaround for your problem could be
select DIRNAME
from MILLENNIUM.PROJECTINFO
where UTL_I18N.RAW_TO_NCHAR(UTL_I18N.STRING_TO_RAW(NAME), 'ZHS16CGB231280') = '中文版测试';
You cannot take Chinese characters, insert them into a column that is bound by the WE8ISO8859P1 character set, and then ever select them again as Chinese characters. You lost information on your insert, and that lost information cannot be reconstituted.
In your case, if the NAME column were defined as NVARCHAR2, you could do an AL16UTF16-to-AL16UTF16 comparison in a subsequent SELECT, as sketched below. Or, even better, you would not need to convert and compare with AL16UTF16 at all, if your client tool is up to the task.
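A sketch of that comparison, assuming NAME has been converted to NVARCHAR2 (for example via the repair steps above) and the client passes Unicode through; an N-quoted literal then compares directly, with no CONVERT gymnastics:

SELECT DIRNAME
FROM MILLENNIUM.PROJECTINFO
WHERE NAME = N'中文版测试';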
