Problem
I'm creating an SSIS package to move data from Oracle to SQL Server but am having trouble with certain Unicode characters.
The source Oracle column is NVARCHAR2. An example character is U+0102, but the problem applies to other characters as well. These will be migrated to an NVARCHAR column in SQL Server, but the issue seems to occur at the point of extraction: when I preview the source in SSIS, the characters in question show as inverted question marks, e.g.
DEMO¿DEMO
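To make the symptom concrete, here is a small Python illustration (not the SSIS pipeline itself) of why a character like U+0102 degrades once it is forced through a single-byte code page; Python substitutes '?' where Oracle's client conversion typically shows '¿':

```python
# Illustration only: U+0102 (Ă) has no representation in the
# Windows-1252 code page, so any lossy single-byte conversion
# must substitute a replacement character.
text = "DEMO\u0102DEMO"   # 'Ă' is the problem character

# Encoding to cp1252 with replacement turns 'Ă' into '?'.
lossy = text.encode("cp1252", errors="replace")
print(lossy)              # b'DEMO?DEMO'

# A Unicode-capable encoding keeps the character intact,
# which is why the NVARCHAR2 -> NVARCHAR path should be lossless
# if no single-byte code page sits in between.
roundtrip = text.encode("utf-16-le").decode("utf-16-le")
print(roundtrip == text)  # True
```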
Setup
I'm using the Attunity Oracle Source task/connection, as I couldn't get the OLE DB connection working.
Oracle Database has NLS_NCHAR_CHARACTERSET AL16UTF16
Things I've tried
Changing the DefaultCodePage value in the Advanced settings of the Source task to 1252, 65001, 1200, and 1201
Converting the source column in the SQL command text in various ways, e.g. Convert(SOURCE_COLUMN,'AL32UTF8')
Using UTL_RAW.Convert_To_Raw in the SQL command text. This generates the correct binary values (as DT_BYTES in SSIS), but I couldn't then transform them back into DT_WSTR using either Data Conversion or Derived Column.
Just to rule out a general SSIS issue, I've tested extracting the same characters from a SQL Server database, and they appear correctly in the SSIS preview window.
I'm using the SQL Command access mode as per below:
SELECT SOURCE_COLUMN
FROM SOURCE_TABLE;
Any help greatly appreciated.
I have come across an SSIS loading issue where one particular column sometimes produces an unwanted character that I'm not aware of, and the loading of data into SQL Server fails. This data comes from an Oracle database. Data is extracted from the Oracle database in plain-text format on a Solaris platform, then brought across to a Windows platform for loading into the SQL Server database.
Please find attached an image showing the issue highlighted in yellow:
Is there any way to escape this character in the Oracle database during extraction, or to escape it while loading into the SQL Server database via the SSIS package?
I accidentally deleted YEARS of data from a table in SQL Server Management Studio. Unfortunately, the person in this position before me didn't back anything up, and neither did I before I tried to fix an issue. I understand that it cannot be retrieved from SQL Server, but I have all the data I need in a separate file on my desktop. Is there any way to get that data and put it back into the table in SQL Server? Or is there a query I can run to re-insert the data into the table? I'm not sure if I am making any sense :/
You can also use Management Studio without SSIS. Right-click on the database in SSMS and select Tasks -> Import Data. You should then be able to select the type of source (flat file) and the format. The rest of the wizard is pretty self-explanatory.
If it is a flat file like .txt or .csv, or even an Excel file (.xls), you can build an SSIS package and dump the data into a new table. It depends on what kind of data you have in hand.
Is it possible to compress data (programmatically) in SQL Server using zlib compression? It's just one column in a specific table that needs compressing.
I'm not a SQL Server person myself - I've tried to find if there's anything available but not had much joy.
Thanks
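A hedged sketch as a starting point: on SQL Server 2016 and later, the built-in COMPRESS() and DECOMPRESS() functions operate on GZIP streams, which wrap the same DEFLATE algorithm that zlib implements; on older versions, a common workaround is to compress in the application layer and store the bytes in a VARBINARY(MAX) column. A minimal Python illustration of both byte formats:

```python
# Sketch under stated assumptions: GZIP (SQL Server 2016+
# COMPRESS/DECOMPRESS-compatible) vs. raw zlib framing.
# Both use DEFLATE; only the headers differ.
import gzip
import zlib

payload = "some long column value " * 100

# GZIP format: interoperable with SQL Server 2016+ DECOMPRESS()
gz_bytes = gzip.compress(payload.encode("utf-8"))

# Raw zlib format: smaller framing, but the application must
# decompress it itself before use.
zlib_bytes = zlib.compress(payload.encode("utf-8"))

# Round-trips are lossless either way.
assert gzip.decompress(gz_bytes).decode("utf-8") == payload
assert zlib.decompress(zlib_bytes).decode("utf-8") == payload
print(len(payload), len(gz_bytes), len(zlib_bytes))
```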
I have a bunch of UTF-8 encoded flat files that need to be imported into a SQL Server 2008 R2 database. Bulk inserts are unable to identify the delimiters, nor do they seem to accept UTF-8.
I understand that there are a number of articles on how SQL Server 2008 deals with UTF-8 encoding, but I'm looking for more current answers, as most of those articles are old.
Is there anything I can do to get these flat files into the database, either by converting them before the insert or through a process that runs during the insert?
I want to stay away from manually converting each one. Furthermore, the SSIS packages I've attempted to create can read and separate the data; they just can't seem to move it. :(
The flat files are generated by Java. Converting the Java environment from UTF-8 to any other encoding has been unsuccessful.
NOTE
I have no intention of storing UTF-8 data. My delimiter is coming in funky because it's UTF-8. SQL Server cannot read the characters when separating the columns and rows. That's it.
Not true; you simply need to choose code page 65001.
Convert your data file to UTF-16 Little Endian (it must be Little Endian specifically), then
use bcp with the -w option.
Just for reference, in case someone googles this and lands here like me.
I've tried the accepted answer a dozen times with no success. In my case, my data file was a .csv flat file that contained a lot of accented characters, like ç é ã á.
I also noticed that no matter what encoding I chose, the import was made using the 1252 (ANSI - Latin 1) encoding.
So the solution was to convert my .csv file, before importing, from UTF-8 to that very same 1252 (ANSI - Latin 1) encoding. I did the conversion using Notepad++.
After converting it, I did the regular import (through the SSMS Tasks -> Import Data wizard), selecting the 1252 (ANSI - Latin 1) encoding, and everything was imported correctly.
Environment:
SQL Server Web 2016
SQL Server Management Studio v17.9.1
Notepad++ v7.7.1
This also answers the original OP's question:
Is there anything I can do to get these flat files into the database, either by converting them before the insert or through a process that runs during the insert?
Because it didn't work at first, I want to add to Arthur's answer, as mentioned in the comments by live-love:
You should change the string data types to NVARCHAR.
You can do this by selecting Unicode string (DT_WSTR) on the Advanced tab for the specified columns.
Microsoft has always been crap regarding encoding, especially in SQL Server. Here is your solution.