Sybase/HPUX to MSSQL/Linux - sql-server

We have a Sybase (15.5) server running on HPUX, and I want to migrate the data to MSSQL 2017 (CU1) on RHEL 7.3.
I'm exporting and importing the data via bcp with the '-c' (character) option.
Everything seems fine except for Hebrew characters: 'א' is originally stored as byte value 224 (Sybase is using iso_1), but after the import it has become byte value 133 (MSSQL uses the SQL_Latin1_General_CP1255_CS_AS collation).
Does anyone have a clue about this issue?

Well, after successfully testing the DirectConnect ODBC drivers, it appears that this is a limitation of the MS drivers.
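One way to see where 133 might come from (an assumption, the thread does not confirm it): byte 224 is 'à' in ISO 8859-1, and 'à' sits at position 133 in the OEM code pages CP437 and CP850. A conversion that reads the file as Latin-1 text and re-encodes it to an OEM code page would therefore produce exactly this change. A minimal Python sketch of that arithmetic:

```python
# Hypothetical reconstruction of the 224 -> 133 change; which code pages
# the real bcp import path uses is an assumption.
raw = bytes([224])                        # the byte exported from Sybase

as_hebrew = raw.decode("cp1255")          # what the byte is meant to be
as_latin1 = raw.decode("iso8859-1")       # what a Latin-1 reader sees
oem_byte = as_latin1.encode("cp850")[0]   # re-encoded to an OEM code page

print(repr(as_hebrew), repr(as_latin1), oem_byte)
```

If this is what happens, the fix would be to stop the client from applying any ANSI-to-OEM style translation to the data file, rather than to change the collation.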

Related

Configure charset for ODBC Driver 17 for SQL Server

I'm running a Windows application on Linux under Wine. It accesses a SQL Server using the ODBC Driver 17 for SQL Server for Linux.
It runs fine, except that varchar values containing non-ASCII characters are displayed incorrectly. The nvarchar fields (Unicode strings) have no problem.
Example:
select rtrim('Presentación ')
Returns: PresentaciÃ³n
My database has the encoding for varchars defined as iso8859-1, and Wine seems to use the cp1252 code page.
My guess is that the ODBC driver for Linux retrieves the data correctly and converts it to UTF-8, which works fine (I can see the values correctly if I run my queries directly through isql). But when those strings are passed to my application under Wine, they are presumably treated as cp1252, and that is when I see them displayed incorrectly.
Has anyone had the same problem? What could I try?
Thank you.
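The guess in this question can be checked offline. A minimal Python sketch, assuming the driver hands over UTF-8 bytes and the application reads them as cp1252 (neither is confirmed in the thread); the helper names are hypothetical:

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Simulate an application that reads UTF-8 bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")


def undo_misread(garbled: str) -> str:
    """Reverse the misreading; only valid if no byte was lost on the way."""
    return garbled.encode("cp1252").decode("utf-8")


garbled = misread_utf8_as_cp1252("Presentación")
print(garbled)
print(undo_misread(garbled))
```

If the garbled string matches what the application displays, the mismatch really is UTF-8 versus cp1252, and the thing to change is the encoding the driver uses for narrow (non-Unicode) strings, not the data.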

RODBC in Ubuntu truncates text strings to 255 characters

I am using RODBC on Ubuntu 16.04, and I am porting my Windows-based R project/package to this Linux environment. sqlQuery returns only the first 255 characters of a text string from an MS SQL Server database. I have found many references to this issue, and I changed the column type in the database to nvarchar(3500), which was supposed to solve it. This was not a problem in the Windows environment. I cannot get around the 255-character limit, even though many people say that changing the column type to nvarchar(4000) or smaller solves it. I've tried many things, including cast(...as nvarchar(1000)), to no avail.
Where am I going wrong?
I was using FreeTDS. I switched to native MS SQL Server drivers, and this fixed the issue. I do not know where the problem lies, but replacing FreeTDS with the MS drivers for SQL server did the trick.

Character set mismatch on Linux with ODBC to SQL Server

I've got a funny issue trying to insert non-ASCII characters into a SQL Server database, using the Microsoft ODBC driver for Linux. The problem is it seems to be assuming different character sets when sending and receiving data. For info, the server collation is set to Latin1_General_CI_AS (I'm only trying to insert European accent characters).
Testing with tsql (which came with FreeTDS), everything is fine. On startup, it outputs the following:
locale is "en_GB.utf8"
locale charset is "UTF-8"
using default charset "UTF-8"
I can both insert and select a non-ASCII value into a table.
However, using my own utility which uses the ODBC API, it's not working. When I do a select query, the data comes back in UTF-8 character set as desired. However if I insert UTF-8 characters, they get corrupted.
SQL > update test set a = 'Béthune';
Running SQL: update test set a = 'Béthune'
Query executed OK: 1 affected rows
SQL > select * from test;
Running SQL: select * from test
+------------+
| a |
+------------+
| BÃ©thune |
+------------+
If I instead insert the data encoded in ISO-8859-1, then that works correctly, however the select query will still return it encoded in UTF-8!
I've already got the locale set to en_GB.utf8, and a client charset of UTF-8 in the database connection details. Aargh!
FWIW I seem to be getting the same problem whether I use the FreeTDS driver or the official Microsoft driver.
EDIT: Just realised one relevant point, which is that in this test program, it isn't using a prepared statement with bound variables. In other words, the update SQL is passed directly into the SQLPrepare call. Something in ODBC is definitely doing an iconv translation, but evidently not to the correct character set!
#0 0x0000003d4c41f850 in iconv () from /lib64/libc.so.6
#1 0x0000003d4d83fd94 in ?? () from /usr/lib64/libodbc.so.2
#2 0x0000003d4d820465 in SQLPrepare () from /usr/lib64/libodbc.so.2
I'll try compiling my own UnixODBC to see better what's going on.
EDIT 2: I've built UnixODBC from source to debug what it's doing, and the problem is nl_langinfo(CODESET) reports back ISO-8859-1. That is strange, since the man page for it says it's the same string you get from locale charmap, which returns UTF-8. I'm guessing that's the problem but still not sure how to solve.
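The asymmetry described above is consistent with the driver manager converting the SQL text from whatever codeset nl_langinfo(CODESET) reports. A small Python simulation, assuming (as a simplification, not a description of the actual unixODBC code) that the statement bytes are decoded with that codeset on the way in and that results come back encoded as UTF-8; the function names are hypothetical:

```python
def store(statement_bytes: bytes, assumed_codeset: str) -> str:
    """Decode the SQL text the way the driver manager is assumed to."""
    return statement_bytes.decode(assumed_codeset)


def fetch(stored: str) -> bytes:
    """Results are assumed to come back encoded as UTF-8."""
    return stored.encode("utf-8")


name = "Béthune"

# UTF-8 bytes decoded as ISO-8859-1: the accented letter becomes two characters.
bad = fetch(store(name.encode("utf-8"), "iso8859-1")).decode("utf-8")

# ISO-8859-1 bytes decoded as ISO-8859-1: the round trip is clean.
good = fetch(store(name.encode("iso8859-1"), "iso8859-1")).decode("utf-8")

print(bad, good)
```

Under these assumptions the UTF-8 insert is stored with one extra character, while the ISO-8859-1 insert survives and still comes back as UTF-8, which matches the behaviour reported in the question.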
A colleague at work has just figured out the solution for FreeTDS at least.
For a direct driver connection (SQLDriverConnect()), adding ClientCharset=UTF-8;ServerCharset=CP1252; to the connection string fixed the problem.
For a connection via the driver manager (SQLConnect()), I can add these lines to the connection settings in odbc.ini:
client charset = UTF-8
server charset = CP1252
Can't yet figure out a solution using the Microsoft driver ...
A solution for Microsoft ODBC Driver might be to set a proper value into the LANG environment variable.
Make sure you have your required locale installed and configured. Also make sure that the LANG environment variable is set correctly for the user you are running your application under. This might be tricky for daemons. For example to make it work for PHP with Apache2 I had to add export LANG=en_US.utf8 into /etc/apache2/envvars.
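To see what a given process actually inherits, it can help to print the locale settings from inside that process. A minimal Python sketch (Unix only); the function name codeset_report is hypothetical, and note that the Python interpreter applies some locale settings itself at startup, so the result may differ from a C program that never calls setlocale():

```python
import locale
import os


def codeset_report() -> dict:
    """Collect the locale settings that charset conversion may depend on."""
    report = {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }
    try:
        # Adopt the environment's locale, like setlocale(LC_ALL, "") in C.
        locale.setlocale(locale.LC_ALL, "")
        report["setlocale_ok"] = True
    except locale.Error:
        # Raised when the environment names a locale that is not installed.
        report["setlocale_ok"] = False
    # Mirrors nl_langinfo(CODESET) in C.
    report["codeset"] = locale.nl_langinfo(locale.CODESET)
    return report


print(codeset_report())
```

Run it as the same user and from the same launcher (init script, Apache envvars, and so on) as the application. A missing LANG or a failed setlocale points at the environment rather than at the driver.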

Sybase ASE 12.5 database : arabic data shown in latin letters

Good day,
I have a Sybase ASE 12.5 database on a Windows NT server.
The database's default character set is CP850.
I'm trying to connect to it using "TOAD for Sybase", which is on my Windows 7 machine.
Whatever character set I choose for TOAD (utf8, cp1256..), the data is shown in Latin letters instead of Arabic.
I tried disabling the "server character set conversion" and disabling the client-side conversion, but still no luck.
Any ideas how to solve this?
CP850 is the character set for Western Europe, so that would explain the Latin letters. If the character set used by the client does not match the one used by the server, then it defaults to English.
You need to change the character set of the server to match what you wish to use for the client, or install the UTF character set in the server to allow Unicode use.
The Sybase ASE documentation explains the details of character sets.
The problem was in the server itself; it was corrupted during cloning.
Thanks for all the answers.

SQL Server 2000 charset issues

Once again with the charset issues when talking to DBs :)
I have two environments running Zend Server. Both of them communicate with a SQL Server 2000 using the mssql extension. Neither has any value set for the charset in the extension's settings. For one it works, and for the other it returns data in the wrong encoding.
The problem was noticed when this data was being inserted into a MySQL database, which complained with SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF6m' for column 'cust_lastname' at row 1.
I tried using SET NAMES utf8 to get the SQL Server connection to return the correct data, but it complains that NAMES is not a recognized SET statement. Looking around, most people even recommend using this, but it doesn't seem to be part of SQL Server 2000 :)
So, what should I do? How do I, WITHOUT fiddling with the SQL Server database/tables, tell it to send me the data in UTF-8?
EDIT:
Some more info...
SQL Server uses the Finnish_Swedish_CI_AS collation
MySQL has every table in UTF-8 format and uses utf8_unicode_ci
I didn't find a good solution and ended up converting to and from utf8 in my application. If this is encapsulated within a class it doesn't riddle the code. But a way to actually tell the SQL server which encoding to use during communication would be better.
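The "convert in the application, wrapped in a class" approach could look roughly like the following. This is a sketch in Python rather than the PHP the question uses. The class name LegacyTextCodec and the sample surname are hypothetical, and cp1252 is an assumption based on the Finnish_Swedish_CI_AS collation:

```python
class LegacyTextCodec:
    """Convert between a single-byte server code page and UTF-8."""

    def __init__(self, server_codepage: str = "cp1252"):
        # cp1252 is an assumption derived from the collation name.
        self.server_codepage = server_codepage

    def to_utf8(self, raw: bytes) -> bytes:
        """Bytes read from SQL Server -> UTF-8 bytes for MySQL."""
        return raw.decode(self.server_codepage).encode("utf-8")

    def from_utf8(self, utf8: bytes) -> bytes:
        """UTF-8 bytes -> bytes in the server code page."""
        return utf8.decode("utf-8").encode(self.server_codepage)


codec = LegacyTextCodec()
converted = codec.to_utf8(b"Sj\xf6str\xf6m")  # hypothetical value with byte 0xF6
print(converted.decode("utf-8"))
```

Keeping both directions in one place means the rest of the code only ever sees UTF-8, and the code page can be changed in a single spot if the assumption turns out to be wrong.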
