How to properly insert special characters like ¤ into an Oracle database

I have an Oracle database. I have to execute some insert scripts that fill an NVARCHAR2 column. The insert statements include special characters such as ¤:
Insert into myTable (column1, column2) values(1, 'ABC-XYZ-SET-00985203-INS-01_2_10 | (100-BASE¤BASE ART¤3) | (100-SHELL¤SHELL ART¤1) |');
When I run the query, every special character is replaced with the ¿ symbol.
Database NLS parameters are as follows:
NLS_NCHAR_CHARACTERSET UTF8
NLS_CHARACTERSET WE8ISO8859P1
The special character is repeated several times in the string.
What should I do to insert the ¤ character correctly?

First of all, you also need to set NLS_LANG on your client to a value that supports Unicode characters.
Then you can use N'...' (NCHAR) literals: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Literals.html#GUID-192417E8-A79D-4A1D-9879-68272D925707
Insert into myTable (column1, column2)
values(1, N'ABC-XYZ-SET-00985203-INS-01_2_10 | (100-BASE¤BASE ART¤3) | (100-SHELL¤SHELL ART¤1) |');
Notice the N before the opening single quote.

The safest way to insert non-ASCII characters might be to use the UNISTR function. Using the Unicode encoding value instead of the actual character is less convenient, but also less likely to be misinterpreted by whatever future programs run your code.
Insert into myTable (column1, column2) values
(
1,
'ABC-XYZ-SET-00985203-INS-01_2_10 | (100-BASE'||unistr('\00A4')||'BASE ART'||unistr('\00A4')||
'3) | (100-SHELL'||unistr('\00A4')||'SHELL ART'||unistr('\00A4')||'1) |'
);
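Writing the UNISTR escapes by hand gets tedious when the character is repeated many times. A minimal Python sketch that generates the expression from a string is shown below. The helper name to_unistr is hypothetical, and the sketch assumes the usual UNISTR escape form: a backslash followed by four hex digits per UTF-16 code unit, with a literal backslash written as two backslashes.

```python
def to_unistr(text: str) -> str:
    """Build an Oracle UNISTR('...') expression for text (hypothetical helper)."""
    parts = []
    for ch in text:
        if ch == "\\":
            parts.append("\\\\")   # backslash is the UNISTR escape character
        elif ch == "'":
            parts.append("''")     # double single quotes inside a SQL literal
        elif ord(ch) < 128:
            parts.append(ch)       # plain ASCII passes through unchanged
        else:
            # One \XXXX escape per UTF-16 code unit
            # (two escapes for characters outside the BMP).
            encoded = ch.encode("utf-16-be")
            for i in range(0, len(encoded), 2):
                unit = int.from_bytes(encoded[i:i + 2], "big")
                parts.append("\\%04X" % unit)
    return "UNISTR('" + "".join(parts) + "')"


print(to_unistr("(100-BASE¤BASE ART¤3)"))
# prints: UNISTR('(100-BASE\00A4BASE ART\00A43)')
```

The generated expression can be pasted in place of the string literal in the INSERT statement. Because each escape is exactly four hex digits, a digit that follows it (the 3 above) is read as ordinary text.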


How to use a conditional SELECT statement on data containing the "\t" character in Cassandra CQL

I insert data into a Cassandra table using the DataStax library and the Spring framework.
I use the separator character "\t" to concatenate two strings:
this.pid + "\t" + this.cid;
If I select the inserted data, I can see that it was saved:
select * from table1;
pid | cid | key | value | update_time
-----+-----------+------+-------+--------------------------
1 | data1 | key1 | 01\tdata1 | 2019-xx-xx
2 | data2 | key2 | 02\tdata2 | 2019-xx-xx
But I cannot select that data with a SELECT statement:
select * from table1 where pid=1 AND cid='data1' AND key='key1' AND value='1\tdata1';
Should I escape the '\t' character?
The table schema is below:
CREATE TABLE table1 (
pid int,
cid text,
key text,
value text,
update_time timestamp,
PRIMARY KEY (pid, cid, key, value)
)
There is a difference between the data inserted via CQL and the data inserted via Java.
I use macOS; when I select in the Mac console, the "\t" characters are shown in different colors.
[Screenshot: data inserted using Java]
[Screenshot: data inserted using CQL]
You already answered yourself in one of the comments, but for posterity I want to explain what's going on:
First, you are using the cqlsh tool to run the SELECT. This tool tries to be friendly by converting non-printable characters into the traditional Unix representation (see, e.g., "cat -v"), so a tab character is printed as "\t". But the character actually stored in the database is a tab (ASCII value 9), not the two characters "\t".
Second, CQL itself does not support these escape sequences. Things like \t or \011 have no special meaning in CQL, and "\t" is simply the two characters backslash and t. You need to put an actual tab character in the query string. In Java code (or other modern languages) that is easy: you can write \t in a string constant and Java (not Cassandra) converts it into an actual tab inside the query string. In cqlsh, nobody does this conversion for you, so you need to type the tab yourself. Because cqlsh handles the Tab key specially, press Control-V and then Tab to insert a literal tab. I see you already discovered this solution yourself.
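The difference between a real tab and the two characters backslash and t can be checked on the client side. The sketch below uses Python rather than Java, on the assumption that ordinary string literals treat \t the same way in both languages; the variable names are illustrative only.

```python
# An ordinary literal: the language converts \t into one real tab (ASCII 9).
with_tab = "01\tdata1"

# A raw literal: backslash and t stay as two separate characters,
# which is what CQL sees when "\t" is typed into a query as-is.
with_backslash_t = r"01\tdata1"

print(len(with_tab))          # 8 characters
print(len(with_backslash_t))  # 9 characters
print(with_tab == "01" + chr(9) + "data1")  # True
print(with_tab == with_backslash_t)         # False
```

Only the first string matches what the Java client stored, which is why a query typed with a literal backslash and t finds nothing.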
You are using pid as a string, but it is an integer. Try the query below and it will work:
select * from table1 where pid=1 AND cid='data1' AND key='key1' AND value='01\tdata1';

UTF-8 characters get saved as ?? on insert, but get saved correctly on update

I have a table on MS SQLServer with an nVarchar column. I am saving a UTF-8 character using an insert statement. It gets saved as ???. If I update the same column using the same value via an update statement, it gets saved correctly.
Any hint on what would be the issue here? The collation used is : SQL_Latin1_General_CP1_CI_AS
Show your insert statement. There is - quite probably - an N missing:
DECLARE @v NVARCHAR(100)='Some Hindi from Wikipedia मानक हिन्दी';
SELECT @v;
Result: Some Hindi from Wikipedia ???? ??????
SET @v=N'Some Hindi from Wikipedia मानक हिन्दी';
SELECT @v;
Result: Some Hindi from Wikipedia मानक हिन्दी
The N in front of the string literal tells SQL Server to interpret the content as Unicode (to be exact: as UCS-2). Otherwise it is treated as 1-byte-encoded extended ASCII, which cannot represent all characters.
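The loss can be reproduced outside the database. The following Python sketch is an analogy, not SQL Server's actual code path: it assumes that a literal without N is squeezed through code page 1252 (the code page of SQL_Latin1_General_CP1_CI_AS), with every unmappable character replaced by a question mark.

```python
text = "Some Hindi from Wikipedia मानक हिन्दी"

# Without N: the literal goes through a single-byte code page first.
# Devanagari has no mapping in cp1252, so each code point becomes "?".
narrowed = text.encode("cp1252", errors="replace").decode("cp1252")

# With N: the literal is kept as UTF-16 code units, so nothing is lost.
wide = text.encode("utf-16-le").decode("utf-16-le")

print(narrowed)      # Some Hindi from Wikipedia ???? ??????
print(wide == text)  # True
```

The question marks are real characters by the time the value reaches the NVARCHAR column, so the column type alone cannot bring the original text back.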

SQL Server Linked Server to PostgreSQL Turkish Character Issue

I have added a PostgreSQL linked server to my SQL Server with help from this blog post. My problem is when I use the query below, I am having problems with Turkish characters.
Query on Microsoft SQL Server 2012:
SELECT *
FROM OpenQuery(CARGO, 'SELECT taxno AS ACCOUNTNUM, title AS NAME FROM view_company');
Actual results:
MUSTAFA ÞAHÝNALP
Expected results:
MUSTAFA ŞAHİNALP
The problem is that the source encoding is 8-bit Extended ASCII using Code Page 1254 -- Windows Latin 5 (Turkish). If you follow that link, you will see the Latin5 chart of characters to values. The value of the Ş character -- "Latin Capital Letter S with Cedilla" -- is 222 (Decimal) / DE (Hex). Your local server (i.e. SQL Server) has a default Collation of SQL_Latin1_General_CP1_CI_AS which is also 8-bit Extended ASCII, but using Code Page 1252 -- Windows Latin 1 (ANSI). If you follow that link, you will see the Latin1 chart that shows the Þ character -- "Latin Capital Letter Thorn" -- also having a value of 222 (Decimal) / DE (Hex). This is why your characters are getting translated in that manner.
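The same mismatch can be reproduced in a few lines of Python, used here only as a neutral tool for decoding bytes. The codec names cp1254 and cp1252 are Python's names for the two Windows code pages mentioned above.

```python
expected = "MUSTAFA ŞAHİNALP"

# Bytes as the Turkish source produces them (code page 1254).
raw_bytes = expected.encode("cp1254")

# The same bytes interpreted with code page 1252, as the local server does.
misread = raw_bytes.decode("cp1252")

print(hex(raw_bytes[8]))  # 0xde, the byte for Ş in code page 1254
print(misread)            # MUSTAFA ÞAHÝNALP
```

The bytes never change; only the code page used to interpret them does.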
There are a few things you can try:
Use sp_serveroption to set the following two options:
EXEC sp_serveroption @server=N'linked_server_name',
@optname='use remote collation',
@optvalue=N'true';
EXEC sp_serveroption @server=N'linked_server_name',
@optname='collation name',
@optvalue=N'Turkish_100_CI_AS';
Not sure if that will work with PostgreSQL as the remote system, but it's worth trying at least. Please note that this requires that all remote column collations be set to this particular value: Turkish / Code Page 1254.
Force the Collation per each column:
SELECT [ACCOUNTNUM], [NAME] COLLATE Turkish_100_CI_AS
FROM OPENQUERY(CARGO, 'SELECT taxno AS ACCOUNTNUM, title AS NAME FROM view_company');
Convert the string values (just the ones with character mapping issues) to VARBINARY and insert into a temporary table where the column is set to the proper Collation:
CREATE TABLE #Temp ([AccountNum] INT, [Name] VARCHAR(100) COLLATE Turkish_100_CI_AS);
INSERT INTO #Temp ([AccountNum], [Name])
SELECT [ACCOUNTNUM], CONVERT(VARBINARY(100), [NAME])
FROM OPENQUERY(CARGO, 'SELECT taxno AS ACCOUNTNUM, title AS NAME FROM view_company');
SELECT * FROM #Temp;
This approach will first convert the incoming characters into their binary / hex representation (e.g. Ş --> 0xDE), and then, upon inserting 0xDE into the VARCHAR column in the temp table, it will translate 0xDE into the expected character of that value for Code Page 1254 (since that is the Collation of that column). The result will be Ş instead of Þ.
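As a rough analogy for this binary round trip, the Python sketch below takes the garbled text back to its bytes using code page 1252 and reinterprets those bytes as code page 1254. It mirrors the idea of the VARBINARY step, not SQL Server's internals.

```python
garbled = "MUSTAFA ÞAHÝNALP"

# Step 1: back to raw bytes, analogous to CONVERT(VARBINARY(100), [NAME]).
raw_bytes = garbled.encode("cp1252")

# Step 2: interpret the bytes with the Turkish code page, analogous to
# inserting into a VARCHAR column with a Turkish collation.
repaired = raw_bytes.decode("cp1254")

print(repaired)  # MUSTAFA ŞAHİNALP
```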
UPDATE
Option # 1 worked for the O.P.

Multi-language support

We have developed a site that needs to display text in English, Polish, Slovak and Czech. However, when the text is entered into the database, any accented letters are changed to English letters.
After searching around on forums, I have found that it is possible to put an 'N' in front of a string which contains accented characters. For example:
INSERT INTO Table_Name (Col1, Col2) VALUES (N'Value1', N'Value2')
However, the site has already been fully developed so at this stage, going through all of the INSERT and UPDATE queries in the site would be a very long and tedious process.
I was wondering if there is any other, much quicker, way of doing what I am trying to do?
The database is MSSQL and the columns being inserted into are already nvarchar(n).
There isn't any quick solution.
The updates and inserts are wrong and need to be fixed.
If they were parameterized queries, you could have simply made sure they were using the NVarChar database type and you would not have a problem.
Since they are dynamic strings, you will need to ensure that you add the unicode specifier (N) in front of each text field you are inserting/updating.
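For comparison, this is the general shape of a parameterized insert. The sketch uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not SQL Server, and the sample values are made up. With a SQL Server driver the placeholder syntax and the way the NVarChar parameter type is declared would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_Name (Col1 TEXT, Col2 TEXT)")

# The values travel separately from the SQL text, so no N'...' prefix
# (and no quote escaping) is needed in the statement itself.
values = ("Zażółć gęślą jaźń", "Příliš žluťoučký kůň")
conn.execute("INSERT INTO Table_Name (Col1, Col2) VALUES (?, ?)", values)

row = conn.execute("SELECT Col1, Col2 FROM Table_Name").fetchone()
print(row == values)  # True
conn.close()
```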
Topic-starter wrote:
"text in English, Polish, Slovak and Czech. However, when the text is entered into the database, any accented letters are changed to English letters" After searching around on forums, I have found that it is possible to put an 'N' in front of a string which contains accented characters. For example:
INSERT INTO Table_Name (Col1, Col2) VALUES (N'Value1', N'Value2')
"The collation for the database as a whole is Latin1_General_CI_AS"
I do not see how SQL Server could cause this, since Latin1_General_CI_AS handles European "non-English" letters:
--on database with collation Latin1_General_CI_AS
declare @test_multilanguage_eu table
(
c1 char(12),
c2 nchar(12)
)
INSERT INTO @test_multilanguage_eu VALUES ('éÉâÂàÀëËçæà', 'éÉâÂàÀëËçæà')
SELECT c1, cast(c1 as binary(4)) as c1bin, c2, cast(c2 as binary(4)) as c2bin
FROM @test_multilanguage_eu
outputs:
c1 c1bin c2 c2bin
------------ ---------- ------------ ----------
éÉâÂàÀëËçæà 0xE9C9E2C2 éÉâÂàÀëËçæà 0xE900C900
(1 row(s) affected)
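The two binary columns can be cross-checked outside SQL Server. In the Python sketch below, the assumption is that the char column holds code page 1252 bytes and the nchar column holds UTF-16 little-endian code units; binary(4) keeps only the first four bytes of each.

```python
sample = "éÉâÂ"

# char column: one byte per character in code page 1252.
print(sample.encode("cp1252").hex().upper())         # E9C9E2C2

# nchar column: two bytes per character, so four bytes cover only "éÉ".
print(sample[:2].encode("utf-16-le").hex().upper())  # E900C900
```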
I believe you simply have to check the checkboxes under Control Panel --> Regional and Language Options --> Advanced tab --> Code page conversion tables, and make sure that you render in the same code page as you store.
Converting to Unicode from the encodings used by clients would, it seems to me, lead to problems when rendering back to web clients.
I believe that most European collation designators use codepage 1252 [1], [2].
Update:
SELECT
COLLATIONPROPERTY('Latin1_General_CI_AS' , 'CodePage')
outputs 1252
[1] http://msdn.microsoft.com/en-us/library/ms174596.aspx
[2] Windows 1252, http://msdn.microsoft.com/en-us/goglobal/cc305145.aspx

How to show Eastern letters (Chinese characters) in SQL Server / SQL Reporting Services?

I need to insert Chinese characters into my database, but they always show up as ????.
Example:
Insert this record.
微波室外单元-Apple
Then it became ???
Result:
??????-Apple
I really need help. Thanks in advance.
I am using MSSQL Server 2008
Make sure you specify a Unicode string with a capital N when you insert, like:
INSERT INTO Table1 (Col1) SELECT N'微波室外单元-Apple' AS [Col1]
and that Table1 (Col1) is an NVARCHAR data type.
Make sure the column you're inserting to is nchar, nvarchar, or ntext. If you insert a Unicode string into an ANSI column, you really will get question marks in the data.
Also, be careful to check that when you pull the data back out you're not just seeing a client display problem but are actually getting the question marks back:
SELECT Unicode(YourColumn), YourColumn FROM YourTable
Note that the Unicode function returns the code of only the first character in the string.
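The idea behind this check can be sketched in Python, where ord plays the role of the Unicode function. The helper name first_code_point is hypothetical. If the stored value really contains question marks, the first code is 63; if the data is intact and only the display is wrong, the code is that of the original character.

```python
def first_code_point(value: str) -> int:
    """Return the code of the first character only (hypothetical helper)."""
    return ord(value[0])


stored_correctly = "微波室外单元-Apple"
stored_as_question_marks = "??????-Apple"

print(first_code_point(stored_as_question_marks))  # 63, a literal "?"
print(first_code_point(stored_correctly) > 127)    # True, a non-ASCII character
```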
Once you've determined whether the column is really storing the data correctly, post back and we'll help you more.
Try adding the appropriate languages to your Windows locale settings. You will have to make sure your development machine is set to display non-Unicode characters in the appropriate language.
And of course you need to use NVARCHAR for foreign-language fields.
Make sure that the database encoding is one that supports these characters. UTF-8 is the de facto standard encoding: it is ASCII-compatible and can represent every Unicode character (code points up to U+10FFFF, i.e. 1,114,111).
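Both properties are easy to check. The short Python sketch below is illustrative only: it confirms the highest code point value, shows that ASCII text has identical bytes in UTF-8, and shows that a Chinese character takes several bytes.

```python
import sys

# The highest code point is U+10FFFF, i.e. 1114111.
print(sys.maxunicode == 0x10FFFF == 1114111)  # True

# ASCII text is byte-for-byte identical in UTF-8.
print("Apple".encode("utf-8") == "Apple".encode("ascii"))  # True

# A Chinese character from the Basic Multilingual Plane takes three bytes.
print(len("微".encode("utf-8")))  # 3
```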
SELECT 'UPDATE table SET msg=UNISTR('''||ASCIISTR(msg)||''') WHERE id='''||id||''';' FROM table WHERE id = '123344556';
