How to keep trailing space from source to destination using SSIS - sql-server

Environment
Source: Oracle database via OLE DB
Destination: SQL Server 2019 via OLE DB
Tools: SSIS Visual Studio 2019
Problem
The source has values with a trailing space (e.g. "12345 "), but after loading into the target database the space is gone (e.g. "12345").
I want to keep all spaces from the source data and load them unchanged into the target table, but I cannot find a setting or any other way to keep them.

The problem was the data type on the target table.
Oracle uses the VARCHAR2 and CHAR data types, and SSIS maps them to Unicode strings (DT_WSTR).
At first I declared the character column as VARCHAR in SQL Server and found the trailing space missing after the ETL run.
Changing the column's data type in SQL Server from VARCHAR to NVARCHAR resolved the problem.

Related

How to change DT_STR to DT_WSTR by default in SSIS for Oracle source

We have an SSIS package on one virtual machine (call it VM1) that pulls data from an Oracle source. The column's data type in Oracle is VARCHAR2; SSIS reads it as DT_WSTR and stores the data in an NVARCHAR column.
When I open the same package on a different virtual machine (call it VM2), SSIS reads the column as DT_STR and the package fails with a conversion error during the validation phase. I also get the warning pasted below when I click on Columns in the Oracle source of the Data Flow Task.
Warning - Cannot retrieve the column code page info from the OLE DB
provider. If the component supports the "DefaultCodePage" property,
the code page from that property will be used. Change the value of
the property if the current string code page values are incorrect. If
the component does not support the property, the code page from the
component's locale ID will be used.
We have Oracle Java (JDK) and the Oracle client installed on both VM1 and VM2.
Both VMs run Windows 7, and the SSIS packages were built with Visual Studio 2013 on both.
I have had to deal with similar datatype issues between Oracle and SSIS. And with SSIS being so finicky about datatypes, I had to find a solution to implement on the Oracle side.
Before I explain my answer, I should mention that I use the Attunity Connectors for Oracle from Microsoft. I highly recommend using these connectors over the default connectors that Microsoft and Oracle provide.
So, with that said, I have found two techniques that seem to work to pull data over in the correct encoding. SSIS is really bad at reading and translating metadata from the Oracle system, but explicitly CASTing to a VARCHAR2, even if the column is already a VARCHAR2, seems to be enough of a hint that SSIS knows that column will be a DT_STR type. In all of my Oracle Source tasks, I use a SQL Command rather than just choosing the table (it's a best practice), and that allows me to add in the CAST to the query. For a VARCHAR2 column, I'd do something like this:
SELECT CAST("PO Number" AS VARCHAR2(30)) AS "PONumber" FROM TABLE1
This will usually be enough. But sometimes it won't be, because Oracle allows some unusual characters in a VARCHAR2 column. If you see the error [Oracle Source [2345]] Error: OCI error encountered. ORA-29275: partial multibyte character even after explicitly CASTing your column to VARCHAR2, this is due to a code page mismatch. To correct it, you can CONVERT the character encoding of the string like this:
SELECT CONVERT("PO Number",'AL32UTF8','WE8MSWIN1252') AS "PONumber" FROM TABLE1
AL32UTF8 is Oracle's UTF-8 (Unicode) character set, and WE8MSWIN1252 is Oracle's name for Windows code page 1252, the default on Western European Windows systems. Code page 1252 is an 8-bit extension of ASCII, not ASCII itself.
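To see why a code page mismatch matters, here is a minimal Python sketch. It is not Oracle or SSIS code: Python's "utf-8" and "cp1252" codecs are used as stand-ins for AL32UTF8 and WE8MSWIN1252, which is an assumption made purely for illustration, and decodes_as_utf8 is a hypothetical helper. It shows the same bytes read under two encodings, and a truncated multibyte sequence failing to decode, which is roughly the situation ORA-29275 reports.

```python
# Stand-ins (assumption): "utf-8" ~ AL32UTF8, "cp1252" ~ WE8MSWIN1252.
text = "\u0102"  # LATIN CAPITAL LETTER A WITH BREVE, two bytes in UTF-8

utf8_bytes = text.encode("utf-8")

# Reading UTF-8 bytes as if they were cp1252 yields two unrelated characters.
misread = utf8_bytes.decode("cp1252")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a complete, valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Cutting the sequence in half leaves a partial multibyte character.
partial_ok = decodes_as_utf8(utf8_bytes[:1])

if __name__ == "__main__":
    print(utf8_bytes, repr(misread), partial_ok)
```

The point of the sketch is only that both sides must agree on the encoding of the bytes; which Oracle character sets are actually in play has to be checked on the database and the client.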

SSIS Oracle Source not pulling through Unicode characters correctly

Problem
I'm creating an SSIS package to move data from Oracle to SQL Server but am having trouble with certain Unicode characters.
The source Oracle column is NVARCHAR2. An example character is U+0102, but the problem also applies to other characters. The data will be migrated to an NVARCHAR column in SQL Server, but the issue seems to arise at extraction: when I preview the source in SSIS, the characters in question show as inverted question marks, e.g.
DEMO¿DEMO
Setup
I'm using the Attunity Oracle Source task/connection, as I couldn't get the OLE DB connection working
Oracle Database has NLS_NCHAR_CHARACTERSET AL16UTF16
Things I've tried
Changing the DefaultCodePage value in Advanced settings of the Source task to 1252, 65001, 1200, 1201
Converting the source column in the SQL command text in various ways, e.g. Convert(SOURCE_COLUMN,'AL32UTF8')
Using UTL_RAW.Convert_To_Raw in the SQL Command text. This generates the correct binary values (as DT_BYTES in SSIS), but I couldn't then transform it back into a DT_WSTR using either Data Conversion or Derived Column.
I've tested extracting the same characters from a SQL Server database and they are appearing correctly in the SSIS preview window just to rule out an SSIS issue.
I'm using the SQL Command access mode as per below:
SELECT SOURCE_COLUMN
FROM SOURCE_TABLE;
Any help greatly appreciated.
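For what it's worth, the symptom is consistent with the character being forced through a single-byte code page somewhere between Oracle and SSIS. The Python sketch below illustrates that effect only; it is not a diagnosis of the Attunity connector. "cp1252" stands in for a Windows-1252 client character set (an assumption), and Python substitutes "?" where the preview above shows "¿".

```python
value = "DEMO\u0102DEMO"  # U+0102 has no slot in Windows-1252

# A single-byte code page cannot hold U+0102, so it is replaced on the way through.
lossy = value.encode("cp1252", errors="replace").decode("cp1252")

# A Unicode encoding such as UTF-16 round-trips the value unchanged.
lossless = value.encode("utf-16-le").decode("utf-16-le")

if __name__ == "__main__":
    print(lossy)
    print(lossless == value)
```

If that is what is happening, the replacement occurs before SSIS sees the data, which would explain why changing settings on the SSIS side alone has no visible effect.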

SSIS cannot convert between unicode and non-unicode string data types

I am new to SSIS, and have been asked to migrate data from a Postgres database to SQL. There are no data transformation requirements. But I have hit an early snag.
I've created a System DSN on my machine for an ODBC driver to connect to the Postgres database, and it connects fine.
In SSIS, I created an ODBC Data Provider connection, and used the DSN I created. Connects well. I then create an ADO.Net data source, and select my source data connection. I select the table I want this flow to pull from, press Preview, and I see the data. All good.
I then create an OLE DB Destination component. I can't use an ADO.Net Destination, as one of my fields is a 'GEOMETRIC' type, and SSIS seems to have an issue with that. OLE DB allows it.
I connect the Source to the Destination, and then open the Destination item. I select the Destination table, select 'Mappings' and all is mapped perfectly.
I close that screen, but I see a red cross next to my Destination component.
"Column "pid" cannot convert between unicode and non-unicode string
data types."
This seems to be the case with ALL VARCHAR fields in my database. Do I need to create a Data Conversion or something for every table, and every VARCHAR in my database? Or is there a way to fix this?
To convert between Unicode and non-Unicode string data types, use the Data Conversion transformation and, for every problematic Unicode source column, add a converted column of type string [DT_STR].

Converting FoxPro Date type to SQL Server 2005 DateTime using SSIS

When using SSIS in SQL Server 2005 to convert a FoxPro database to a SQL Server database, if the given FoxPro database has a date type, SSIS assumes it is an integer type. The only way to convert it to a dateTime type is to manually select this type. However, that is not practical to do for over 100 tables.
Thus, I have been using a workaround in which I use DTS on SQL Server 2000 which converts it to a smallDateTime, then make a backup, then a restore into SQL Server 2005.
This workaround is starting to be a little annoying.
So, my question is: Is there any way to set up SSIS so that whenever it encounters a date type, it automatically assumes it should be converted to a dateTime in SQL Server and applies that rule across the board?
Update
To be specific, if I use the import/export wizard in SSIS, I get the following error:
Column information for the source and the destination data could not be retrieved, or the data types of source columns were not mapped correctly to those available on the destination provider.
Followed by a list of a given table's date columns.
If I manually set each one to a dateTime, it imports fine.
But I do not wish to do this for a hundred tables.
You could write a small FoxPro program that loops through your list of tables and writes a SQL INSERT INTO statement for each record to a .sql file, which you could then open in, or paste into, SQL Server Management Studio and execute. That way you control the date formats, so you can choose ones that SQL Server's date type fields accept.
Something similar could be done in C#.
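As a rough sketch of the same idea in Python (the answer mentions FoxPro and C#; Python is swapped in here only for brevity), the snippet below renders one row as an INSERT statement with dates written in formats SQL Server reads the same way under any language setting. Reading the FoxPro tables is not shown: the rows are assumed to be already loaded as tuples, and sql_literal, build_insert, and the table and column names are all hypothetical.

```python
from datetime import date, datetime


def sql_literal(value) -> str:
    """Render a Python value as a T-SQL literal (only a few types handled)."""
    if value is None:
        return "NULL"
    # datetime must be tested first because it is a subclass of date.
    if isinstance(value, datetime):
        # ISO 8601 with a 'T' separator does not depend on DATEFORMAT.
        return "'" + value.isoformat(timespec="seconds") + "'"
    if isinstance(value, date):
        # Unseparated YYYYMMDD is likewise read the same under any setting.
        return "'" + value.isoformat().replace("-", "") + "'"
    if isinstance(value, (int, float)):
        return str(value)
    # Everything else is treated as a string; double embedded single quotes.
    return "'" + str(value).replace("'", "''") + "'"


def build_insert(table: str, columns: list, row: tuple) -> str:
    """Build one INSERT statement for a single row."""
    cols = ", ".join(f"[{c}]" for c in columns)
    vals = ", ".join(sql_literal(v) for v in row)
    return f"INSERT INTO [{table}] ({cols}) VALUES ({vals});"


if __name__ == "__main__":
    # Hypothetical table and row, for illustration only.
    print(build_insert("orders", ["id", "customer", "placed"],
                       (1, "O'Brien", date(2005, 3, 9))))
```

Writing the statements to a .sql file is then a matter of looping over tables and rows; for large tables a bulk-load path would likely be faster than row-by-row INSERTs.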

Character set issues with Oracle Gateways, SQL Server, and Application Express

I am migrating from an Oracle database on VMS that accesses data on SQL Server using heterogeneous services (over ODBC) to Oracle on AIX accessing the SQL Server via Oracle Gateways (dg4msql). The Oracle VMS database used the WE8ISO8859P1 character set. The AIX database uses WE8MSWIN1252. The SQL Server database uses "Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data" according to sp_helpsort. The SQL Server database uses nchar/nvarchar for all string columns.
In Application Express, extra characters are appearing in some cases, for example 123 shows up as %001%002%003. In sqlplus, things look ok but if I use Oracle functions like initcap, I see what appear as spaces between each letter of a string when I query the sql server database (using a database link). This did not occur under the old configuration.
I'm assuming the issue is that an nchar has extra bytes in it and the character set in Oracle can't convert it. It appears that the ODBC solution didn't support nchars so must have just cast them back to char and they showed up ok. I only need to view the sql server data so I'm open to any solution such as casting, but I haven't found anything that works.
Any ideas on how to deal with this? Should I be using a different character set in Oracle and if so, does that apply to all schemas since I only care about one of them.
Update: I think I can simplify this question. SQL Server table uses nchar. select dump(column) from table returns Typ=1 Len=6: 0,67,0,79,0,88 when the value is 'COX' whether I select from a remote link to sql server, cast the literal 'COX' to an nvarchar, or copy into an Oracle table as an nvarchar. But when I select the column itself it appears with extra spaces only when selecting from the remote sql server link. I can't understand why dump would return the same thing but not using dump would show different values. Any help is appreciated.
There is an incompatibility between Oracle Gateways and nchar on that particular version of SQL Server. The solution was to create views on the SQL Server side casting the nchars to varchars. Then I could select from the views via gateways and it handled the character sets correctly.
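The DUMP output quoted in the update matches big-endian UTF-16, and the stray characters look like what appears when those bytes are read one byte per character. The Python sketch below is illustrative only: that the gateway hands the bytes to a single-byte character set is an assumption, not something this thread confirms, and "cp1252" stands in for WE8MSWIN1252.

```python
from urllib.parse import quote

# The bytes reported by DUMP() in the question for the value 'COX'.
dumped = bytes([0, 67, 0, 79, 0, 88])

# Read as big-endian UTF-16, the bytes are exactly the expected text.
as_utf16 = dumped.decode("utf-16-be")

# Read one byte per character, every letter is preceded by a NUL character.
as_single_byte = dumped.decode("cp1252")

# Percent-encoding the UTF-16 bytes of '123' yields a %00 before each digit.
percent_encoded = quote("123".encode("utf-16-be"))

if __name__ == "__main__":
    print(as_utf16, repr(as_single_byte), percent_encoded)
```

Casting to varchar in a SQL Server view, as described above, sidesteps this by delivering single-byte data to the gateway in the first place.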
You might be interested in the Oracle NLS Lang FAQ
