Creating an SSIS package with a SQL query to Oracle - sql-server

I have about had it with SQL Server 2012 64-bit!
I am creating an SSIS package that runs a SQL query against Oracle and writes the results to a flat file. I am using the Oracle OLEDB source and a Flat File Destination for the output. Everything works fine locally, but when the package is put on the server and run through SQL Agent, I keep getting the Unicode to non-Unicode conversion errors!
The latest drivers are on the server and the 11g client is on my development machine. The data types shown in each step are DT_STR.
I have the exact same source writing to an OLEDB destination just fine. I don't want to have to write these rows to a table and then pull them back out just to get this to work. Any solutions? And please, no "just add this" responses.
I have tried a Data Conversion, but I get the same result. Please supply DETAILED answers, as in "go here and change this to this". Pictures never hurt. Thanks

The short answer is that you need to convert code pages, not data types.
The long answer follows:
Step 0: If you're not already using them, I highly recommend that you switch to the Attunity Connectors instead of the stock Oracle OLEDB connector. You can download them for SQL2012 from Microsoft at: https://www.microsoft.com/en-us/download/details.aspx?id=29283
Step 1: Using the Attunity Oracle Source, you can specify a SQL Command as the data access mode, instead of just pointing at a table.
Step 2: You need to determine the exact character set the Oracle server is using, and the exact code page your SQL Server machine is using. In Oracle's naming, a server using the UTF-8 character set is most likely AL32UTF8, and a Windows Server using the default ANSI-1252 code page corresponds to WE8MSWIN1252.
Step 3: Write your SQL query and CONVERT the character set of all the columns on the Oracle side. Make sure you use double quotes around the Oracle column names. Oracle documents the arguments as CONVERT(char, dest_char_set, source_char_set), so check that the order below matches the direction you need in your environment. It should look something like this:
SELECT
CONVERT("Data Source Code",'AL32UTF8','WE8MSWIN1252') AS DataSourceCode
,CONVERT("Order#",'AL32UTF8','WE8MSWIN1252') AS OrderNumber
,CONVERT("Invoice#",'AL32UTF8','WE8MSWIN1252') AS InvoiceNumber
,CONVERT("Item#",'AL32UTF8','WE8MSWIN1252') AS ItemNumber
,CONVERT("Order Line Type",'AL32UTF8','WE8MSWIN1252') AS OrderLineType
,CONVERT("Order Status",'AL32UTF8','WE8MSWIN1252') AS OrderStatus
,CONVERT("Order Date",'AL32UTF8','WE8MSWIN1252') AS OrderDate
,CONVERT("Invoice Date",'AL32UTF8','WE8MSWIN1252') AS InvoiceDate
,CONVERT("Ship To Cust#",'AL32UTF8','WE8MSWIN1252') AS ShipToCustNumber
,CONVERT("Billing Account #",'AL32UTF8','WE8MSWIN1252') AS BillingAccountNumber
,CONVERT("Sold Qty",'AL32UTF8','WE8MSWIN1252') AS SoldQty
,CONVERT("Unit Price",'AL32UTF8','WE8MSWIN1252') AS UnitPrice
,CONVERT("Sales Amount",'AL32UTF8','WE8MSWIN1252') AS SalesAmount
,CONVERT("Handling Amount",'AL32UTF8','WE8MSWIN1252') AS HandlingAmount
,CONVERT("Freight Amount",'AL32UTF8','WE8MSWIN1252') AS FreightAmount
FROM MYORACLEDB.DIGITAL_SALES_FEED
WHERE "Invoice Date" >= TO_DATE('2018/08/01', 'yyyy/mm/dd')
Step 4: Use this query text as the SQL command text in the Oracle Source configuration window.
Fun Fact: Oracle will return the column names in ALL CAPS, regardless of your AS ColumnName format.
Step 5 (optional): All columns will be returned as strings. You might want to put a Data Conversion transformation in your Data Flow, but if you're just dumping your data into a flat file, you might not care about the data types. I have decimal/numeric and date values in my data set, so I do a conversion before inserting into SQL Server.
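Typing the CONVERT wrapper by hand for every column gets tedious, so here is a minimal Python sketch that generates the query text from Step 3. The helper name `build_convert_query` and its parameters are hypothetical, made up for illustration. The character set names are passed through in the same order as the query above; swap them if your conversion runs the other way.

```python
# Hypothetical helper: builds the Step 3 query text for a list of columns.
# Oracle documents the call as CONVERT(char, dest_char_set, source_char_set).
ITEM_TEMPLATE = "CONVERT(\"{name}\",'{dest}','{source}') AS {alias}"


def build_convert_query(table, columns, dest_charset, source_charset, where=None):
    """Return a SELECT that wraps every (column, alias) pair in CONVERT()."""
    if not columns:
        raise ValueError("at least one column is required")
    items = [
        ITEM_TEMPLATE.format(name=name, dest=dest_charset,
                             source=source_charset, alias=alias)
        for name, alias in columns
    ]
    # Leading commas, matching the layout of the query in the answer.
    lines = ["SELECT", items[0]] + ["," + item for item in items[1:]]
    lines.append("FROM " + table)
    if where:
        lines.append("WHERE " + where)
    return "\n".join(lines)


query = build_convert_query(
    table="MYORACLEDB.DIGITAL_SALES_FEED",
    columns=[("Order#", "OrderNumber"), ("Invoice Date", "InvoiceDate")],
    dest_charset="AL32UTF8",
    source_charset="WE8MSWIN1252",
    where="\"Invoice Date\" >= TO_DATE('2018/08/01', 'yyyy/mm/dd')",
)
print(query)
```

The printed text can be pasted into the SQL command text box of the Oracle Source. The sketch does no escaping of the column names, so it is only suitable for names you control.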

Related

On SSMS, ODBC linked server queries show unicode text data

In SSMS I'm connected to an InterSystems Cache database through an ODBC driver and a linked server. I fetch data using a SQL query like
SELECT Text FROM OPENQUERY([ODBC_CACHE_DB],'SELECT TOP 100 Text FROM cls.Actions')
SSMS returns results, but it shows ? in place of Arabic characters, like
"18:29:00 [Mohamad] ????? ??? ?? ??? ??? ?????? ????? ? 18:30:30 [Customer] Hi Sirius is jai"
How can I get the Arabic text?
Note: I can read and write Arabic text when using the nvarchar data type.
I had a similar issue. My setup was a linked server between an MSSQL 2012 cluster and InterSystems Cache 2009.x using the MS OLE ODBC provider.
My observations below:
Convert/Cast on the column to the nvarchar datatype did not work: SSMS still showed ????.
When using 3rd Party DB management tools such as Database.net and WinSQL, I was able to see the correct characters.
Playing around with the ODBC driver's Unicode SQL Types function only intermittently helped show the correct characters.
The solution:
Enable the Unicode SQL Types function on the ODBC driver.
Change the test SQL query that is being executed on the InterSystems Cache db each time you run it. If you keep executing the same query, the output is cached for some time (not sure how long exactly), which is why the setting appears to work only intermittently.
In my case, the SQL Server cluster was not under my control, and it took me a few days to try the different variations.
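To illustrate why casting to nvarchar afterwards did not help, here is a small Python sketch. It assumes (this is not confirmed by the thread) that the question marks come from the text being pushed through a non-Unicode code page somewhere between the driver and SSMS. Code page 1252 is used purely as an example.

```python
# Five Arabic letters, written as escapes to avoid any editor encoding issues.
arabic = "\u0645\u0631\u062d\u0628\u0627"

# A single-byte Western code page has no Arabic letters, so each one is
# replaced with "?" when the text is forced into it.
lossy = arabic.encode("cp1252", errors="replace")
print(lossy)

# Converting the damaged bytes back to a Unicode string cannot restore them.
recovered = lossy.decode("cp1252")
print(recovered == arabic)

# A Unicode encoding round-trips the same text without loss.
round_trip = arabic.encode("utf-16-le").decode("utf-16-le")
print(round_trip == arabic)
```

Once the characters have been replaced, no cast on the SQL Server side can bring them back, which is consistent with the fix having to happen at the ODBC driver level.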

SSIS - SQL Server datetimeoffset(0) destination column recognized as DT_WSTR

We get data delivered to us in a flat file. A date column we want to store in a destination column called DWValidFrom has the following format:
2017-02-06T22:07:09Z
In SSIS, using a Flat File Connection Manager, I set the datatype of said column to DT_DBTIMESTAMPOFFSET. The data displays correctly on the Columns and Preview pages of the Connection Manager.
In SQL Server, I created the destination table, and defined the DWValidFrom column as datetimeoffset(0):
[DWValidFrom] [datetimeoffset](0) NOT NULL,
When I attempt to set the mappings in the OLE DB Destination object, which has been set to the SQL Server table in question, SSIS won't have it, and throws the following error:
The OLE DB provider used by the OLE DB adapter cannot convert between types "DT_DBTIMESTAMPOFFSET" and "DT_WSTR" for "DWValidFrom".
Suspecting something off with my regional settings, I issued the following query in Management Studio to ensure the format of the date wouldn't change:
SELECT CAST('2017-02-06T22:07:09Z' AS datetimeoffset(0))
This yielded the following result:
2017-02-06 22:07:09 +00:00
Why is SSIS not recognizing the column's proper data type? I do not have any other conversions or expressions set, so I'm confused as to why SSIS won't allow me to push a valid datetimeoffset.
We're using SQL Server 2014, Visual Studio 2015.
Thanks.
This sounds like the OLEDB destination metadata is out of sync with the changes you made on the flat file connection manager. The quickest fix would be to recreate the OLEDB destination, but don't do that quite yet.
SSIS is not going to like that standard ISO format for the date. If you remove the "T" in the middle and the "Z" at the end, it will be OK, i.e.
2017-02-06 22:07:09
Because of this conversion issue in SSIS, the connection manager will probably fail to convert the string to datetimeoffset. So you will need to configure the column as a string and then fix its value in a derived column:
(DT_DBTIMESTAMPOFFSET, 0) REPLACE(REPLACE( [DWValidFrom] , "T", " " ), "Z", "")
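For anyone who wants to check the string manipulation outside SSIS, the same two replacements can be sketched in Python. The function names are hypothetical, and treating the stripped value as UTC is an assumption based on the trailing "Z" in the sample.

```python
from datetime import datetime, timezone


def normalize_iso_utc(value):
    """Mirror the derived column: swap "T" for a space and drop the "Z"."""
    return value.replace("T", " ").replace("Z", "")


def parse_as_utc(value):
    """Parse the normalized text and attach a +00:00 offset (assumed UTC)."""
    naive = datetime.strptime(normalize_iso_utc(value), "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)


sample = "2017-02-06T22:07:09Z"
print(normalize_iso_utc(sample))
print(parse_as_utc(sample).isoformat())
```

Note that the replacement approach only works because the values contain no other "T" or "Z" characters, which holds for purely numeric timestamps like the one in the question.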
Hope that helps,
m
The issue seemed to be that the OLEDB destination does not recognize datetimeoffset as a valid column format. Despite everything working in SQL Server and SSIS pushing a datetime that would be perfectly valid, the OLEDB destination wouldn't have any of it.
I considered using a SQL Server destination, but because the target server is a different server than the one we develop on, that wasn't an option either.
The fix for us was to instead format the columns using datetime as the datatype. This causes us to lose the timezone info, but because all of the dates were UTC, we really don't miss any data.
Quick Answer: Set DataTypeCompatibility to 0
I noticed in Connection Manager for my SQL Server Native Client 11.0 (OLEDB) connection, clicking on "All", then under the SQLNCLI11.1 section there's a value DataTypeCompatibility which was set to "80". 80 is code for SQL Server 2000 compatibility, well before they introduced TimeStampOffset (or in my case DT_DBDATE and DT_DBTIME2 types). I tried setting compatibility to 130, then 100, but "Test Connection" failed.
At https://learn.microsoft.com/en-us/sql/relational-databases/native-client/applications/using-connection-string-keywords-with-sql-server-native-client?view=sql-server-2017 there's a table specifying information about this value:
Keyword: DataTypeCompatibility
Initialization property: SSPROP_INIT_DATATYPECOMPATIBILITY
Description: Specifies the mode of data type handling to use. Recognized values are "0" for provider data types and "80" for SQL Server 2000 data types.
Changing the value to 0, then refreshing all of my connections using the OLEDB connection manager seems to have done the trick - now all my database's types are recognized rather than forcing it to nvarchar/DT_WSTR
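If you have many packages to fix, the same change can be made on the connection string text. Below is a rough Python sketch; the helper `set_connection_keyword` and the server and database names in the sample string are hypothetical, and the simple split on ";" does not handle quoted values that contain semicolons.

```python
def set_connection_keyword(connection_string, keyword, value):
    """Set keyword=value in a ';'-separated connection string.

    Replaces an existing entry (case-insensitive match) or appends a new one.
    """
    parts = [part.strip() for part in connection_string.split(";") if part.strip()]
    updated = []
    found = False
    for part in parts:
        key, _, _old_value = part.partition("=")
        if key.strip().lower() == keyword.lower():
            updated.append(key.strip() + "=" + value)
            found = True
        else:
            updated.append(part)
    if not found:
        updated.append(keyword + "=" + value)
    return ";".join(updated) + ";"


# Sample string with made-up server and database names.
original = ("Provider=SQLNCLI11.1;Data Source=MYSERVER;Initial Catalog=MyDW;"
            "Integrated Security=SSPI;DataTypeCompatibility=80;")
fixed = set_connection_keyword(original, "DataTypeCompatibility", "0")
print(fixed)
```

After changing the value, the components that use the connection still need their metadata refreshed, as described above.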

SSIS cannot convert between unicode and non-unicode

I have an SSIS package which works perfectly on various servers but doesn't on 2 newly added servers.
It uses OLEDB (64-bit, I've verified that) to get data from an ORACLE DB table with 3 VARCHAR2 columns and upserts the rows to a SQL 2012 table with 3 nvarchar columns of the same size.
The package was working flawlessly on all the servers we deployed it to. We recently added 2 new servers, with hopefully the same software and the same versions (I personally checked the likely major culprits: ODAC, dtexec version, tnsnames), but the package doesn't work there. It gives us the famous error:
Description: Column "C_CODE" cannot convert between unicode and non-unicode string data types.
End Error
Error: 2016-08-26 08:33:26.20
Code: 0xC02020F6
Source: Upsert Lines OLE DB Source [87]
Description: Column "S_DESC" cannot convert between unicode and non-unicode string data types.
End Error
Warning: 2016-08-26 08:33:26.20
Code: 0x800470C8
Source: Upsert Lines OLE DB Source [87]
Description: The external columns for OLE DB Source are out of synchronization with the data source columns. The external column "C_CODE" needs to be updated.
The external column "S_DESC" needs to be updated.
End Warning
The first warning I got was "Cannot read the code page from Oracle server", so I tried the "use the default one every time" setting in combination with the ANSI or UTF-8 code pages (I used this website to get the code pages: https://msdn.microsoft.com/en-us/library/windows/desktop/dd317756%28v=vs.85%29.aspx ).
The only way I made it work was to manually modify the OLE DB Source block in the package by changing the external columns to ANSI strings, which makes it not compilable in the dev environment, and then add a data transform that changes the external columns back to UTF-8 strings.
This way it doesn't work on the other servers but works flawlessly on the "faulty" servers.
I suspect that the package being unable to read the code page from the Oracle server is the culprit, but I couldn't figure out how to fix it.
This problem is not like the millions of questions about this issue you can find everywhere. The error is raised by the OLE DB Source block. I know about the Data Conversion.
Thanks in advance
The issue was the NLS_LANG registry key: it was set to an incorrect value.
I solved it by deleting the key, after which everything worked fine.
I just came across this post (and many others about this issue) and believe I have found the underlying cause. On my servers at least, the 64-bit version of the Windows Oracle drivers has a bug in the install process. I have checked the 32-bit version and this bug is not there, which explains the behaviour.
It appears that the 64-bit version creates the NLS_LANG registry key in the wrong location. By default the 64-bit driver creates it here - "HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE"
I manually created it here - "HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_OraClient12Home1" which cured my issue. For reference, my NLS_LANG key is set to "ENGLISH_UNITED KINGDOM.WE8MSWIN1252". I just created it as a copy of the string in the branch above.
I hope this helps others.
We use 32-bit SSIS with the Oracle instant client, and I needed to add an entry at this registry location:
"Computer\HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\ORACLE\KEY_OraClient12Home1_32bit"
The “NLS_LANG” value is “AMERICAN_AMERICA.WE8MSWIN1252”. This solved my problem.
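Since a wrong NLS_LANG value was the root cause in the answers above, a quick sanity check of the value can save some time. The following Python sketch splits a value of the form LANGUAGE_TERRITORY.CHARSET into its parts. The function name is hypothetical, and the sketch insists on all three parts being present, which is stricter than Oracle itself.

```python
def parse_nls_lang(value):
    """Split 'LANGUAGE_TERRITORY.CHARSET' into (language, territory, charset)."""
    locale_part, dot, charset = value.rpartition(".")
    if not dot or not charset:
        raise ValueError("missing '.CHARSET' part in %r" % value)
    language, underscore, territory = locale_part.partition("_")
    if not underscore or not language or not territory:
        raise ValueError("missing 'LANGUAGE_TERRITORY' part in %r" % value)
    return language, territory, charset


# The two values quoted in the answers above.
for setting in ("ENGLISH_UNITED KINGDOM.WE8MSWIN1252",
                "AMERICAN_AMERICA.WE8MSWIN1252"):
    print(parse_nls_lang(setting))
```

The character set part is the one that matters for the Unicode conversion errors; it should match what the client side actually expects.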

How to fix the embedded text qualifier issue while exporting data to CSV flat file?

###RFC 4180:
RFC 4180 defines Common Format and MIME Type for Comma-Separated Values (CSV) Files. One of the requirements of the RFC 4180 is stated as below. This is the point #7 in the RFC link.
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
###SQL Server 2000:
The DTS Export/Import Wizard in SQL Server 2000 seems to conform to the above-mentioned standard, even though RFC 4180 itself seems to have been published only in October 2005. I am using the SQL Server 2000 version stated below.
Microsoft SQL Server 2000 - 8.00.2039 (Intel X86)
May 3 2005 23:18:38
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 5.0 (Build 2195: Service Pack 4)
###SQL Server 2012:
SQL Server Import and Export Wizard in SQL Server 2012 does not export the data from table to CSV file according to the standard defined in RFC 4180. I am using the below stated SQL Server 2012 version.
Microsoft SQL Server 2012 - 11.0.2316.0 (X64)
Apr 6 2012 03:20:55
Copyright (c) Microsoft Corporation
Enterprise Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) (Hypervisor)
###Issue Simulation:
Here is a sample that I ran in both SQL Server 2000 and SQL Server 2012. I ran the below query to create a table and insert few records. The ItemDesc column has data with double-quotes in it. My intention is to export the data from both these SQL Server versions using their in-built export data wizard and compare the generated CSV files.
CREATE TABLE dbo.ItemInformation(
ItemId nvarchar(20) NOT NULL,
ItemDesc nvarchar(100) NOT NULL
)
GO
INSERT INTO dbo.ItemInformation (ItemId, ItemDesc) VALUES ('100338754', 'Crown Bolt 3/8"-16 x 1" Stainless-Steel Hex Bolt');
INSERT INTO dbo.ItemInformation (ItemId, ItemDesc) VALUES ('202255836', 'Simpson Strong-Tie 5/8" SSTB Anchot Bolt');
INSERT INTO dbo.ItemInformation (ItemId, ItemDesc) VALUES ('100171631', 'Grip-Rite #11 x 1-1/2" Electro-Galvanized Steel Roofing Nails');
INSERT INTO dbo.ItemInformation (ItemId, ItemDesc) VALUES ('202210289', 'Crown Bolt 1/2" x 3" "Zinc-Plated" Universal Clevis Pin');
INSERT INTO dbo.ItemInformation (ItemId, ItemDesc) VALUES ('100136988', 'Tapcon 3/16" x 1-3/4" Climaseal Steel "Flat-Head" Phillips Concrete Anchors (75-Pack)');
INSERT INTO dbo.ItemInformation (ItemId, ItemDesc) VALUES ('203722101', 'KwikTap 3/16" x 2-1/4" "Flat-Head" Concrete Screws (100-Pack)');
GO
On the DTS Export/Import Wizard in SQL Server 2000, I used the below settings to export the data to CSV file. I saved the file under the name SQLServer2000_ItemInformation.csv.
On the SQL Server Import and Export Wizard in SQL Server 2012, I used the below settings to export the data to CSV file. I saved the file under the name SQLServer2012_ItemInformation.csv.
Here is the comparison between the two files using Beyond Compare. The left side contains the file generated by SQL Server 2000 and the right side contains the file generated by SQL Server 2012. You can see that the left-side file from SQL Server 2000 contains additional double-quotes to compensate for the embedded quotes in the data column. This conforms to the standard specified in RFC 4180, but it is clearly missing from the file generated by SQL Server 2012.
###Searches on the web:
I searched for this bug on the web and found the following links. Following are the bug reports on Microsoft Connect. All these issues seem to be related to importing a file but nothing about exporting data. All these bugs have been closed as Fixed.
SSIS flat file parser does not read Column delimiters embedded in text data
Flat File Connection Manager not handling Text Delimiters in CSV Files
Embedded quotes in Flat File Import fails
BUG: Flat File Connection Manager: multiple-character text qualifier does not load all data
The post below on an MSDN blog states that changes have been made in SQL Server 2012 so that the Flat File source supports embedded qualifiers and a variable number of columns per row.
SSIS - What’s New in SQL Server Denali
Another post on MSDN blog states the same under the section Embedded Qualifiers.
Flat File Source Changes in Denali
###Workaround that I know of:
I know a workaround: write a query that replaces every double-quote (") in my column data with two double-quotes ("") so that the exported file ends up with correctly escaped embedded qualifiers. This means exporting from a query instead of pulling the data directly from the table as it is.
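For comparison, this is what the expected RFC 4180 output looks like when produced by a tool that doubles embedded quotes. The sketch uses Python's standard csv module with two of the sample rows from the table above; the function names are made up for illustration.

```python
import csv
import io

rows = [
    ("100338754", 'Crown Bolt 3/8"-16 x 1" Stainless-Steel Hex Bolt'),
    ("202210289", 'Crown Bolt 1/2" x 3" "Zinc-Plated" Universal Clevis Pin'),
]


def to_quoted_csv(records):
    """Write records with every field quoted and embedded quotes doubled."""
    buffer = io.StringIO()
    # QUOTE_ALL encloses every field; doublequote=True is the module default.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerows(records)
    return buffer.getvalue()


def double_embedded_quotes(value):
    """The manual workaround: replace each " with "" before exporting."""
    return value.replace('"', '""')


output = to_quoted_csv(rows)
print(output)
```

The `double_embedded_quotes` function does the same thing as the replace-in-the-query workaround, so its result wrapped in quotes should match what the csv module writes for that field.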
###My questions:
I don't know if this issue has been truly fixed in SQL Server 2012. Has this issue been fixed only for importing files that have embedded text qualifiers and not for exporting data to CSV?
Probably, I am clearly doing something wrong and missing the obvious. Could someone please explain to me what I am doing wrong here?
###Microsoft Connect:
I have submitted a bug report on Microsoft Connect website to get their feedback. Here is the link to the bug report. If you agree that this is a bug, please visit the below link to vote up on Microsoft Connect website.
Embedded text qualifier during export to CSV does not conform to RFC 4180
I wouldn't offer this answer except that you worked so hard to document it and it's been upvoted with no answer after a month. So, here goes. Your only choices appear to be to change the data or change the tool.
Probably, I am clearly doing something wrong and missing the obvious. Could someone please explain to me what I am doing wrong here?
When the tool is broken and the vendor doesn't care, it's a mistake to keep trying. It's time to switch. You put a lot of effort into researching exactly how it's broken and demonstrating that it violates not only the RFC but the behavior of the tool's own prior version. How much more evidence do you need?
CSV is a boat anchor too. If you have the option, you're better off using an ordinary delimited file format. For lots of applications, tab-delimited is good. The best delimiter IMO is '\' because that character has no place in English text. (On the other hand it won't work for data containing Windows pathnames.)
CSV has two problems as an exchange format. First, it's not all that standard; different applications recognize different versions, whatever the RFC may say. Second (and related) is that it doesn't constitute a regular language in CS terms, which is why it can't be parsed with a regular expression. Compare with ^([^\t]*\t)*[^\t]*$ for a tab-delimited line. The practical implication of the complexity of CSV's definition is (see above) the relative dearth of tools to handle CSV files and their tendency to be incompatible, particularly during the wee hours.
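To make the tab-delimited comparison concrete, here is a short Python sketch. It assumes the final character class in the pattern is meant to be [^\t]* (any run of non-tab characters for the last field), and the sample line is made up from the item data in the question.

```python
import re

# Zero or more "field + tab" groups, followed by the last field.
TAB_LINE = re.compile(r"^([^\t]*\t)*[^\t]*$")

line = '100338754\tCrown Bolt 3/8"-16 x 1" Stainless-Steel Hex Bolt'

# The pattern accepts the line, and a plain split recovers the fields.
# Embedded double quotes need no escaping because nothing is quoted.
matched = TAB_LINE.match(line) is not None
fields = line.split("\t")
print(matched, fields)
```

The catch is the one already mentioned for the backslash delimiter: this only works if the delimiter character never appears inside the data.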
If you give CSV and DTS the boot, you have good options, one of which is bcp.exe. It's very fast, and safe because Microsoft hasn't been tempted to update it for years. I don't know much about DTS, but in case you have to use it for automation, IIRC there is a way to invoke external utilities. Beware though, that bcp.exe does not return error status to the shell dependably.
If you're determined to use DTS and to stick with CSV, then really your best remaining option is to write a view that prepares the data appropriately for it. I would, if backed into that corner, create a schema called, say, "DTS2012CSV", so that I could write select * from DTS2012CSV.tablename, giving anyone who cares a fighting chance to understand it (because you'll document it, won't you, in comments in the view text?). If need be, others can copy its technique for other broken extracts.
HTH.
I know this is two years old, but I am also now having this issue, as we need to use SQL Server 2008 for a contract we have (don't ask). After reading through this question, I realized I needed to do the replace suggestion, but when I went to do it in the query, I ran into truncation issues, because using the replace() function in the query itself would convert the text to a varchar(8000) by default.
However, I discovered I could do the same thing using a Derived Column step in between the DB Source and Flat File objects. For example, I have a column named "short_description," that could have quotes in it, so I just used the following function as the expression, and selected "Replace short_description" in the Derived Column:
REPLACE(short_description,"\"","\"\"")
This seems to have solved the issue for me.
Often the first and last names are in the same field and formatted as (Last, First). This needs to be text qualified if you're using Tasks->Export Data right off the database (not via SSIS, where you have more options) and you need to export to CSV as a comma-delimited file.
This will help with your non-null selected fields that need double quoting. Note that QUOTENAME returns NULL for inputs longer than 128 characters, so it only suits short fields such as names.
CASE WHEN NOT PersonName IS NULL AND LEN(PersonName) > 0 THEN QUOTENAME(PersonName, '"') ELSE NULL END as 'PersonName'
Result:
PersonName
"COLLINS, ZACKERY E"

SQL Server 2000 charset issues

Once again with the charset issues when talking to DBs :)
I have two environments running Zend Server. Both of these communicate with a SQL Server 2000 instance using the mssql extension. Neither of them has any value given for the charset in the settings of the extension. For one it works, and for the other it returns data in the wrong encoding.
The problem was noticed when this data was being inserted into a MySQL database and it screamed with SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF6m' for column 'cust_lastname' at row 1.
I tried using SET NAMES utf8 to get the SQL Server connection to return the correct data, but it complains and says that NAMES is not a recognized SET statement. Looking around, most people even recommend using this, but it doesn't seem to be part of SQL Server 2000 :)
So, what should I do? How do I, WITHOUT fiddling with the SQL Server database/tables, tell it to send me the data in UTF-8 encoded format?
EDIT:
Some more info...
SQL Server uses the Finnish_Swedish_CI_AS collation
MySQL has every table in UTF-8 format and uses utf8_unicode_ci
I didn't find a good solution and ended up converting to and from utf8 in my application. If this is encapsulated within a class, it doesn't litter the code. But a way to actually tell the SQL Server which encoding to use during communication would be better.
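The application in the question is PHP, but the conversion itself is easy to show in a few lines of Python. This sketch assumes the bytes coming back from SQL Server are in Windows code page 1252, which would fit both the Finnish_Swedish_CI_AS collation and the '\xF6' byte in the MySQL error. The function names and the sample surname are made up.

```python
def cp1252_to_utf8(raw):
    """Re-encode bytes read from SQL Server (assumed cp1252) as UTF-8."""
    return raw.decode("cp1252").encode("utf-8")


def utf8_to_cp1252(raw):
    """Re-encode UTF-8 bytes as cp1252 before sending them to SQL Server."""
    return raw.decode("utf-8").encode("cp1252")


# 0xF6 is "ö" in cp1252; on its own it is not valid UTF-8,
# which is what MySQL rejected.
from_sql_server = b"Str\xf6m"
for_mysql = cp1252_to_utf8(from_sql_server)
print(for_mysql)
```

Keeping both directions in one small class or module, as suggested above, means the rest of the application only ever sees UTF-8.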

Resources