SSIS receive Excel column as DT_IMAGE - sql-server

Good day to you, Experts.
I'm stuck on a problem I'm having with an Excel 97-02 .xls file.
When adding it as a source in SSIS, I'm getting an External Columns Datatype of DT_IMAGE .
The column represents an ID and is numeric only. I can't extract and work with the data because of the DT_IMAGE datatype.
Setting IMEX=1 didn't help.
Thank you in advance.

SSIS reads Excel files through an OLE DB provider, which may not detect the appropriate type for an Excel column.
There are many other questions mentioning similar issues such as:
SSIS Excel Import Forcing Incorrect Column Type
SSIS Excel Data Source - Is it possible to override column data types?
SSIS keeps force changing excel source string to float
As you mentioned in the question, you added ;Extended Properties="IMEX=1" to the connection string with no luck, so I think there are four things you can try:
Sort the column data inside Excel
Change the formatting of the entire column manually
Open the Advanced Editor on the Excel source, go to the output column list, and set the type for each of the columns
Add IMEX=1; MAXROWSTOSCAN=0 to the connection string
If none of the steps above works, save the Excel sheet as a text file and read it with a Flat File Connection Manager.
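As a rough illustration of where IMEX=1 sits in the connection string, here is a minimal Python sketch that assembles one. The function name excel_connection_string is hypothetical, and the provider (Microsoft.Jet.OLEDB.4.0) and the "Excel 8.0" token are assumptions that fit an Excel 97-2003 .xls file; a different provider may be installed on your machine.

```python
def excel_connection_string(path, header=True, imex=True):
    """Assemble an OLE DB connection string for an .xls file.

    Assumption: the Jet 4.0 provider with the "Excel 8.0" format token,
    which is the usual pairing for Excel 97-2003 workbooks.
    """
    # Extended Properties is a single quoted value holding ;-separated options.
    props = ["Excel 8.0", "HDR=" + ("YES" if header else "NO")]
    if imex:
        props.append("IMEX=1")
    joined = ";".join(props)
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;"
        "Data Source=" + path + ";"
        'Extended Properties="' + joined + '";'
    )


if __name__ == "__main__":
    print(excel_connection_string(r"C:\data\ids.xls"))
```

Note that IMEX=1 belongs inside the quoted Extended Properties value, next to the Excel version and HDR options, not as a separate top-level key.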

Related

SSIS Data Flow OLE DB To Excel Nvarchar Size Issue

Hopefully, this is not an ignorant question as I am still working to build SSIS Skills.
I have a package that takes an Excel sheet and loads it into a SQL Server table so that I can run analysis and update statements against the data. I am now looking to load that SQL table back into an Excel sheet. I have made an Excel sheet as a template that replicates the SQL table.
The issue I am now having is that I have a field named "Comment" whose datatype is Nvarchar(MAX) in my SQL table. This column contains NULL values as well. When I try to load these values back into the Excel column, I get an error.
[Excel Destination [28]] Error: An error occurred while setting up a binding for the "Comment" column. The binding status was "DT_NTEXT".
I thought perhaps I could do a Data Conversion to a string with the maximum length (which is 757), but it truncates and errors on that size.
This data came from an excel column so I would think I can load it back to a column.
Thanks for the help!!
Previously I thought that Excel did not allow exporting values longer than 255 characters. After running several experiments, I found that exporting DT_NTEXT values to Excel can be done using SSIS:
Create an Excel file with one dummy row that contains long text values (> 255 characters), then use this file as the destination. If the Excel file already contains data, make sure to add this dummy row directly after the header row, and add ;IMEX=1 to the OLE DB connection string.

SSIS - Excel data shows as scientific notations and Null Values

I have some excel data which contains scientific numbers like 5e+00.
When I view the value in Excel by clicking the edit button, I can see the full value. But when I import the data into a table, the value is loaded as Null. I need to import the data without making any changes in Excel. Please suggest how to do it in SSIS.
I tried importing after changing the format on the Excel side. I want it to be done at the SSIS level without making any changes in Excel.
Data in my Column as
Amounts
15880
5e+19
57892
I expect the output should be like as follows
1588007
500000000019
57892
But I am getting Null value for second item
Please suggest.
In the question above, there are 2 problems:
Numbers are shown in scientific format
Data is replaced by Null values while importing
Scientific Format issue
You mentioned that:
I tried importing after changing the format on the Excel side. I want it to be done at the SSIS level without making any changes in Excel.
Unfortunately, this cannot be done without changing the Excel file, since the only way to solve this issue is to change the Number Format property of the cells. Instead of doing it manually in Excel, you can automate this step by adding a Script Task that uses the Microsoft.Office.Interop.Excel.dll assembly.
You can refer to the following post as an example:
Format excel destination column in ssis script task
But make sure to use:
m_XlWrkSheet.Columns(1).NumberFormat = "0"
To force a Numeric format.
Null Values issue
This issue is caused by the OLE DB provider used to read from Excel files. It occurs when an Excel column contains mixed data types: the OLE DB provider reads the values that match the dominant data type and replaces all other values with Nulls.
You can refer to the following links for more information/workarounds:
Importing Excel Data Seems to Randomly Give Null Values
SQL JOIN on varchar with special characters and leading zeros
Dynamically Creating Excel table through SSIS
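The dominant-type behaviour described above can be pictured with a toy model in Python. This is a deliberately simplified sketch, not the provider's actual algorithm. The function names are hypothetical, the sample size of 8 rows is an assumption based on the commonly cited TypeGuessRows default, and the model ignores IMEX=1. It also assumes the 5e+19 cell is stored as text while the other cells are numeric.

```python
from collections import Counter


def classify(value):
    """Tag a cell value as 'number' or 'text' (toy classification)."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "number"
    return "text"


def simulate_column_import(cells, sample_rows=8):
    """Toy model of a dominant-type guess.

    Looks at the first `sample_rows` non-empty cells, picks the most
    common type, and returns the column with every cell of another
    type replaced by None (the Null seen after the import).
    """
    sample = [c for c in cells[:sample_rows] if c is not None]
    if not sample:
        return list(cells)
    dominant = Counter(classify(c) for c in sample).most_common(1)[0][0]
    return [c if c is not None and classify(c) == dominant else None
            for c in cells]


if __name__ == "__main__":
    # Two numeric cells and one text cell: numbers win, the text is lost.
    print(simulate_column_import([15880, "5e+19", 57892]))
```

In this model the minority value is dropped silently, which matches the symptom in the question: the rows around it load normally and only the odd cell becomes Null.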

Importing Excel Data Seems to Randomly Give Null Values

Using SSIS for Visual Studio 2017 for some excel file imports.
I've created a package with several loop containers that call to specific packages to handle some files. I have an issue with one particular package being executed in that it seemingly randomly decides the data for columns is NULL per excel file. I was/am under the impression that this is part of the registry setting for TypeGuessRows (changed initially to 0 then to 1000 as a test) located at
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel
The reason I think this is that the various files being brought in generally have the same data, but it seems that if the first few rows of a column in the source data contain only numbers, the rows with mixed values are not brought in correctly. All other columns seem fine.
Looking at the source files, all have the same datatype.
I've tried changing the registry TypeGuessRows value and ensured that the output column property was string-based instead of numerical.
The connection string has IMEX=1
So I fixed it. Or at least found a sufficient workaround that should help anyone in my situation. I think it has to do with the cache of SSIS.
I ended up putting a sort function on the problem column so the records getting read as NULL for having a random data type are read first, and not being considered random. I will say, I tried this initially and it didn't work.
Through a little experiment of making a new data flow in the same package I discovered that this solution actually does work, hence me thinking the cache was the issue.
If anyone has any further questions on this, let me know.
This issue is related to the OLE DB provider used to read Excel files. Since Excel is not a database where each column has a specific data type, the OLE DB provider tries to identify the dominant data type in each column and replaces all values that cannot be parsed as that type with NULLs.
There are many articles found online discussing this issue and giving several workarounds (links listed below).
But after using SSIS for years, I can say that the best practice is to convert Excel files to CSV files and read them using Flat File components.
If you cannot convert the Excel files to flat files, you can force the Excel connection manager not to treat the first row as headers by adding HDR=NO to the connection string, and add IMEX=1 so that the OLE DB provider determines the data types starting from the first row (which is the header row, so all strings most of the time). In this case all columns are imported as strings and no values are replaced with NULLs, but you lose the column names and get one additional row (the header row is imported as data).
If you cannot ignore the header row, just add a dummy row that contains dummy string values (example: aaa) after the header row and add IMEX=1 to the connection string.
Helpful links
SSIS Excel Data Import - Mixed data type in Rows
Mixed data types in Excel column
Importing data from Excel having Mixed Data Types in a column (SSIS)
Why SSIS always gets Excel data types wrong, and how to fix it!
EXCEL IN SSIS: FIXING THE WRONG DATA TYPES
IMEX= 1 extended properties in ssis

Finding the column names from source assistant in SSIS

I am creating an SSIS package in which I have to move data from Excel to a table in SQL Server. The Excel file is the source (added through the Source Assistant) in a Data Flow Task.
The number of columns in the Excel file won't change, but the column names will. So I have to find all the column names in the Excel file before inserting the data.
Could you please help me on this?
Solution overview
Exclude the column names in the first row in the Excel connection, and use SQL command as the data access mode
Alias the column names in the output columns so that they match your destination
Add a Script Task before the Data Flow Task that imports the data
Use the Script Task to open the Excel file and get the worksheet name and the header row
Build the query and store it in a variable
In the second task, the Data Flow Task, use the query stored above as the source (note that you have to set the DelayValidation property to True)
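The query-building step can be sketched as follows. This is a minimal Python illustration of the string the Script Task would store in the variable, not the Script Task code itself. The function name build_select and the sample column names are hypothetical. It assumes that, with the first row excluded as column names (HDR=NO), the provider exposes the columns as F1, F2, ... and that columns map to the destination by position.

```python
def build_select(sheet_name, header_cells, destination_columns):
    """Build a SELECT that aliases positional Excel columns (F1, F2, ...)
    to fixed destination column names.

    header_cells is the header row read from the worksheet; it is used
    here only to check that the column count still matches.
    """
    if len(header_cells) != len(destination_columns):
        raise ValueError("column count in the sheet does not match the destination")
    parts = [
        f"[F{i}] AS [{dest}]"
        for i, dest in enumerate(destination_columns, start=1)
    ]
    # Worksheets are addressed as [SheetName$] in the Excel SQL dialect.
    return f"SELECT {', '.join(parts)} FROM [{sheet_name}$]"


if __name__ == "__main__":
    print(build_select("Sheet1", ["Cust No", "Name"], ["CustomerId", "CustomerName"]))
```

Because the header row is no longer treated as column names in this setup, it arrives as the first data row and has to be skipped or filtered out in the data flow.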
Detailed Solution
You can follow my answer at Importing excel files having variable headers; it solves a very similar case.

SSIS Package download from Excel

Using SSIS I am downloading data from an Excel file into a MS SQL database. In this Excel file, there's a column with data looking like 99,99, which goes into a column of type float in the database. After downloading to the database, the data in this column has ignored the comma and it has become 9999.
Any suggestions on how to correctly download the data into my database and keep the comma?
Make sure the locale settings on the SSIS server are consistent with those of the Excel file. Specifically, look at the thousands separator and the decimal separator. If SSIS thinks the comma is a thousands separator, it is ignored.
You could also import this column from Excel as text and add a Derived Column Transformation that changes the , into a . using a string replace function and then casts the result to float.
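The replace-then-cast logic of the second suggestion looks like this in a minimal Python sketch. It only mirrors what the Derived Column expression would do. The function name parse_decimal_comma is hypothetical, and the sketch assumes the text contains no thousands separators.

```python
def parse_decimal_comma(text):
    """Convert a decimal-comma string such as '99,99' to a float.

    Assumption: the comma is the decimal separator and there are no
    thousands separators in the input.
    """
    return float(text.strip().replace(",", "."))


if __name__ == "__main__":
    print(parse_decimal_comma("99,99"))
```

Importing the column as text first is what keeps the comma from being interpreted (and dropped) before the replacement runs.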

Resources