I’m trying to load data from an Excel file into a SQL Server table using SSIS, but it is not loading properly. For example, my Column 1 contains the data below:
Column 1
1
1.1
1.1.1
1.2
1.3
1.4
The output in SQL Server is:
Column 1
1
1.10000000000001
Null
1.2
1.3
1.39999999999999
The expected output should be the same as the source. I have tried the string data type, but it still does not work.
SSIS uses its own data types when it reads data from sources, transforms it, and writes it to destinations.
If a column contains mixed data types, the Excel driver by default reads only the first 8 rows (configured by the TypeGuessRows registry key).
Based on those first 8 rows, the Excel driver tries to guess the data type of each column.
In your case, the Excel column holds both numbers (float) and text. If the first 8 rows contain mostly numbers, the driver may decide that the whole column is of type float. SSIS then skips the text values and imports them as NULL into the destination.
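The guessing behaviour described above can be illustrated with a small Python sketch. This is only a simplified model, not the driver's actual algorithm: the function names `guess_column_type` and `read_column` are hypothetical, and the majority rule is an assumption made for illustration.

```python
from typing import List, Optional, Union

Cell = Union[int, float, str]


def guess_column_type(cells: List[Cell], sample_rows: int = 8) -> str:
    """Guess 'float' or 'text' from the first `sample_rows` cells (simplified model)."""
    sample = cells[:sample_rows]
    numeric = sum(isinstance(c, (int, float)) for c in sample)
    # Assumption: the type seen most often in the sample wins.
    return "float" if numeric > len(sample) - numeric else "text"


def read_column(cells: List[Cell], sample_rows: int = 8) -> List[Optional[Cell]]:
    """Read a column the way the answer describes: non-matching values become None."""
    if guess_column_type(cells, sample_rows) == "float":
        return [float(c) if isinstance(c, (int, float)) else None for c in cells]
    return [str(c) for c in cells]


# The column from the question: one text value among numbers.
column = [1, 1.1, "1.1.1", 1.2, 1.3, 1.4]
print(read_column(column))  # [1.0, 1.1, None, 1.2, 1.3, 1.4]
```

In this model the text value "1.1.1" is lost because the sample is dominated by numbers, which matches the NULL seen in the question's output.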
If you enable the data viewer, you can see how this works.
In your case, you can use a Data Conversion component.
Related
I have some Excel data that contains numbers in scientific notation, such as 5e+00.
When I view the value in Excel by clicking the edit button, I can see the full value. But when I import the data into a table, the value is loaded as NULL. I need to import the data without making any changes in Excel. Please suggest how to do this in SSIS.
I tried importing after changing the format on the Excel side, but I want this done at the SSIS level without any changes in Excel.
The data in my column is:
Amounts
15880
5e+19
57892
I expect the output to be as follows:
1588007
500000000019
57892
But I am getting a NULL value for the second item.
Please suggest.
In the question above, there are 2 problems:
Numbers are shown in scientific format
Data is replaced by Null values while importing
Scientific Format issue
You mentioned that:
I tried imported by changing the format in excel side. I want it to be done in SSIS level without doing any changes in excel
Unfortunately, this cannot be done without changing the Excel file, since the only way to solve this issue is to change the Number Format property of the cells. You can automate this step with a Script Task that uses the Microsoft.Office.Interop.Excel.dll assembly, instead of changing the format manually in Excel.
You can refer to the following post as an example:
Format excel destination column in ssis script task
But make sure to use:
m_XlWrkSheet.Columns(1).NumberFormat = "0"
To force a Numeric format.
Null Values issue
This issue is caused by the OLE DB provider used to read Excel files. It occurs when an Excel column contains mixed data types: the OLE DB provider reads the values that match the dominant data type and replaces all other values with NULLs.
You can refer to the following links for more information/workarounds:
Importing Excel Data Seems to Randomly Give Null Values
SQL JOIN on varchar with special characters and leading zeros
Dynamically Creating Excel table through SSIS
I am using SSIS for Visual Studio 2017 for some Excel file imports.
I've created a package with several loop containers that call specific packages to handle some files. I have an issue with one particular package: seemingly at random, it decides that the data for some columns is NULL, and this varies per Excel file. I am under the impression that this is related to the registry setting TypeGuessRows (changed initially to 0, then to 1000 as a test), located at
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel
I think this because the various files being brought in generally contain the same data, but if the first few rows of a column in the source data contain only numbers, the rows with mixed values are not brought in correctly. All other columns seem fine.
Looking at the source files, all have the same datatype.
I've tried changing the registry TypeGuessRows value and ensured that the output column property was string-based instead of numerical.
The connection string has IMEX=1
So I fixed it. Or at least found a sufficient workaround that should help anyone in my situation. I think it has to do with the cache of SSIS.
I ended up putting a sort function on the problem column so the records getting read as NULL for having a random data type are read first, and not being considered random. I will say, I tried this initially and it didn't work.
Through a little experiment of making a new data flow in the same package I discovered that this solution actually does work, hence me thinking the cache was the issue.
If anyone has any further questions on this, let me know.
This issue is related to the OLE DB provider used to read Excel files. Since Excel is not a database where each column has a specific data type, the OLE DB provider tries to identify the dominant data type in each column and replaces all values that cannot be parsed as that type with NULLs.
There are many articles found online discussing this issue and giving several workarounds (links listed below).
But after using SSIS for years, I can say that the best practice is to convert Excel files to CSV files and read them using Flat File components.
Or, if you cannot convert the Excel files to flat files, you can tell the Excel connection manager not to treat the first row as headers by adding HDR=NO to the connection string, and add IMEX=1 so the OLE DB provider determines the data types from the first row (the header row, which is all strings most of the time). In this case, all columns are imported as strings and no values are replaced with NULLs, but you lose the column headers and get an additional row (the header row is imported as data).
If you cannot ignore the header row, just add a dummy row that contains dummy string values (example: aaa) after the header row and add IMEX=1 to the connection string.
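As a rough illustration of the connection-string options mentioned above, here is a small Python helper that assembles such a string. The helper name `excel_connection_string` and the file name are hypothetical. The provider and `Excel 12.0 Xml` values are the usual ones for .xlsx files read through the ACE provider, so adjust them if your environment differs.

```python
def excel_connection_string(path: str, header: bool = True, imex: bool = True) -> str:
    """Build an ACE OLE DB connection string for an .xlsx file (illustrative helper)."""
    props = ["Excel 12.0 Xml", "HDR=YES" if header else "HDR=NO"]
    if imex:
        # IMEX=1 asks the provider to read mixed-type columns as text.
        props.append("IMEX=1")
    extended = ";".join(props)
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        "Data Source=" + path + ";"
        'Extended Properties="' + extended + '";'
    )


# Hypothetical file name; HDR=NO means the header row is read as data.
print(excel_connection_string("book.xlsx", header=False))
```

With `header=False` the header row arrives as the first data row, so it has to be filtered out later in the data flow.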
Helpful links
SSIS Excel Data Import - Mixed data type in Rows
Mixed data types in Excel column
Importing data from Excel having Mixed Data Types in a column (SSIS)
Why SSIS always gets Excel data types wrong, and how to fix it!
EXCEL IN SSIS: FIXING THE WRONG DATA TYPES
IMEX= 1 extended properties in ssis
I am trying to import "Financial data" from Excel files into a SQL table. The problem I am facing is that my SSIS package is altering decimal values, e.g. -(175.20) from Excel is being loaded as "-175.20000000000005" in SQL.
I am using nvarchar(20) in the destination SQL table. Images attached. What's the best data type for the destination table? I have done a lot of reading and people seem to suggest the decimal data type, but the package throws an error for decimal. Need help please.
I ended up changing the data type to "Currency" in my SQL destination, then added a Data Conversion transformation to convert the DT_R8 column from the Excel source to currency [DT_CY]. This resolved the issue. I could have used decimal or numeric(16,2) in my destination as well, but I went ahead with currency and it worked.
You could use a Derived Column Transformation in your Data Flow Task, with an expression like ROUND([GM],2) (you might need to replace GM with whatever your actual column name is).
You can then go to the Advanced Editor of the Derived Column Transformation and set the data type to decimal with a Scale of 2 on the 'Input and Output Properties' tab (look under 'Derived Column Output').
You'll then be able to use a decimal data type in your SQL Server table.
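The same idea (round the double, then store it as a fixed-scale decimal) can be sketched in Python. This is only an illustration of why the rounding step removes the stray digits, not a model of SSIS itself. The helper name `to_money` is hypothetical, and the rounding mode is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_money(value: float, scale: int = 2) -> Decimal:
    """Convert a binary double to a fixed-scale decimal (illustrative helper)."""
    quantum = Decimal(1).scaleb(-scale)  # Decimal('0.01') when scale is 2
    # repr() gives the shortest text that round-trips the double.
    return Decimal(repr(value)).quantize(quantum, rounding=ROUND_HALF_UP)


# The value from the question, as it arrives from the Excel source (DT_R8).
print(to_money(-175.20000000000005))  # -175.20
```

Once the value is a decimal with a scale of 2, it can be stored in a decimal or numeric column without the floating-point noise.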
I'm using SQL 2008 R2 and Excel in xlsx format. I'm trying to export data to Excel from an XML source. Everything is working fine, except for columns where the data is longer than 255 characters. I get a warning:
truncation may happen due to inserting data from data flow column with a length of 4000 to database column with length 255
I've tried changing the registry and the IMEX value as per the blog. This doesn't seem to help. Also, my connection string (Excel Connection Manager) is
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Extract_8202014.xlsx;Extended Properties="EXCEL 12.0 XML;HDR=YES";
As per MSDN, for memo type columns we should define the data type as longtext if the table is created at run time. I've done the same, but it's not helping.
I have a flat file which is imported into SQL Server via an existing SSIS package. I need to make a change to the package to accommodate a new field in the flat file. The new field is a date field which is in the format dd-mmm-yy (e.g. 25-AUG-11). The date field in the flat file will either be empty (e.g. a space/whitespace) or populated with a date. I don’t have any control over the date format in the flat file.
I need to import the date field in the flat file into an existing SQL Server table and the target field data type is smalldatetime.
I was proposing to import the date as a string into a load table and then convert it to smalldatetime when taking the data from the load table. But is there another way to parse the date format dd-mmm-yy and load it straight into a smalldatetime field, without having to convert from the load table? I can’t quite think how to parse the date format, particularly the month. Any suggestions welcome.
Here is an example that might give you an idea of what you can do. Ideally, in an SSIS package or in any ETL job, you should take into account that data may not be exactly what you would like it to be. You need to take appropriate steps to handle the incorrect or invalid data that might pop up now and then. That's why SSIS comes up with lots of Transformation tasks within Data Flow Task which you can make use of to clean up the data.
In your case, you can make use of Derived Column transformation or Data conversion transformation to achieve your requirements.
The example was created in SSIS 2008 R2. It shows how to read a flat file containing the dates and load into an SQL table.
I created a simple SQL table to import the flat file data.
On the SSIS package, I have a connection manager to SQL and one for Flat file. Flat file connection is configured as shown below.
On the SSIS package, I placed a Data Flow Task on the Control Flow tab. Inside the Data Flow Task, I have a Flat File Source, a Derived Column transformation and an OLE DB Destination. Since the Flat File Source and OLE DB Destination are straightforward, I will leave those out here. The Derived Column transformation creates a new column with the expression (DT_DBDATE)SmallDate. Note that you can also use a Data Conversion transformation to do the same. This new column, SmallDateTimeValue, should be mapped to the database column in the OLE DB Destination.
If you execute this package, it will fail because not all the values in the file are valid.
In your case, it fails because the invalid data is inserted directly into the table: the table throws an exception, which makes the package fail. In this example, the package fails because the default setting on the Derived Column transformation is to fail the component if there is any error. So, let's place a dummy transformation to redirect the error rows. We will use a Multicast transformation for this purpose. It won't really do anything. Ideally, you should redirect the error rows to another table using an OLE DB Destination or another destination component of your choice, so you can analyze the data that causes the errors.
Drag the red arrow from the Derived Column transformation and connect it to the Multicast transformation. This will pop up the Configure Error Output dialog. Change the values under the Error and Truncation columns from Fail component to Redirect row. Any error rows are then redirected to the Multicast transformation and do not get into the table.
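Outside SSIS, the same pattern (convert what parses, redirect what doesn't) can be sketched in a few lines of Python. The function names `parse_small_date` and `split_rows` and the sample values are hypothetical, and the month parsing assumes English month abbreviations.

```python
from datetime import date, datetime
from typing import List, Optional, Tuple


def parse_small_date(raw: str) -> Optional[date]:
    """Parse dd-mmm-yy text such as 25-AUG-11; blank input becomes None (NULL)."""
    text = raw.strip()
    if not text:
        return None
    # %b matches the abbreviated month name case-insensitively (AUG, Aug, aug).
    return datetime.strptime(text, "%d-%b-%y").date()


def split_rows(values: List[str]) -> Tuple[List[Optional[date]], List[str]]:
    """Send parsable rows to `good` and the rest to `errors`, like Redirect row."""
    good: List[Optional[date]] = []
    errors: List[str] = []
    for raw in values:
        try:
            good.append(parse_small_date(raw))
        except ValueError:
            errors.append(raw)
    return good, errors


good, errors = split_rows(["25-AUG-11", " ", "31-FEB-11", "not a date"])
print(good)    # [datetime.date(2011, 8, 25), None]
print(errors)  # ['31-FEB-11', 'not a date']
```

The `errors` list plays the role of the redirected error output: the load continues, and the bad values are kept for later analysis.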
Now, if we execute the package, it will run successfully. Note the number of rows displayed in each direction.
Here is the data that got into the table. Only 2 rows were valid. You can look at the first screenshot that showed the data in the file and you can see only 2 rows were valid.
Hope that gives you an idea to implement your requirement in the SSIS package.
It should load straight into a SMALLDATETIME field as it is. Remember, dates are just numbers in SQL Server, which are presented to the user in the desired date/time format. The SSIS package should read 25-AUG-2011 just fine as a date data type, and insert it into a SMALLDATETIME field without issues.
Was the package throwing an error or something?