SSIS Error importing Excel Date (truncation error) - sql-server

I am sorry to post what seems a very simple issue but I cannot find an answer and I am wasting days (not just hours at this point). I am fairly new to SSIS and it is just kicking my backside.
Background:
Pretty straightforward SSIS Package to import an Excel sheet into a Staging table in SQL Server. Since I do not want to mislead anyone by using the wrong nomenclature, I will refer to the Excel source as Excel and the SQL Server table as the Target Table.
This package HAS worked before. However, it is now failing because of data truncation for a Date Column. The Excel Column has been formatted as a DATE (and I have tried a few different format options within DATE). The target column is also a DATE column (NOT datetime). The data in Excel is predominantly empty cells with a few sporadic values. I think the errors started when the dates started appearing in the data (rather than just blanks).
I have tried using the Advanced Editor for both sides (Excel & Target) and tried numerous data type settings all around but I keep getting the same failure. I suspect that it is now pretty messed up with the various tests that I have done.
I have also tried adding a Data Conversion transform for the Date field “date[DT_DATE]” – that did not work. AND, I have tried creating a Derived Column - first based on the Excel column and then on the Transformed column. All of those attempts have failed.
Questions:
1) What is the best practice for importing Excel data into SQL Server for DATE Columns?
2) Since this is two very mature Microsoft Apps (Excel & SQL Server) working together, it seems like it should be simple. This leads me to believe that I must be missing some basic concepts here. Can anyone set me straight?
3) How do all of you get an Excel date into SQL Server?
4) What is the trick for synchronizing columns after making edits?
Thanks for any insights you can provide. Sorry to bother you all with what seems pretty simple.
David

Personally I don't think there is a best practice for excel dates, it is always a pain for me.
If you can change the Excel file, try formatting the column as 'Text'. It will then import as Unicode text rather than as a date. If you can't, try converting the column to Unicode in a "Data Conversion" task.
After that is done, you would need to use a "Derived Column" task to build the date in the format you want.
example for source MM/dd/yyyy hh:mm:ss
Build to be yyyy-MM-dd
SUBSTRING(datecolumn,7,4)+ "-" + SUBSTRING(datecolumn,1,2)+ "-" +SUBSTRING(datecolumn,4,2)
Might be crude, but saves my sanity.
If the date looks like m/d/yyyy, without a leading zero on single-digit months (January and so on), you will need to pad the value. For the month part, something like this:
RIGHT("0" + SUBSTRING(datecolumn,1,FINDSTRING(datecolumn,"/",1)-1),2)
Good luck
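For anyone who wants to check the rebuild logic outside of SSIS, here is a minimal Python sketch of the same idea: split an M/d/yyyy string, zero-pad the month and day, and reassemble it as yyyy-MM-dd. The function name rebuild_date is made up for illustration, and the sketch assumes month-first input like the expressions above.

```python
def rebuild_date(text: str) -> str:
    """Turn 'M/d/yyyy[ hh:mm:ss]' text into 'yyyy-MM-dd', zero-padding month and day."""
    date_part = text.strip().split(" ")[0]    # drop any time portion
    month, day, year = date_part.split("/")   # assumes month-first input
    return f"{year}-{month.zfill(2)}-{day.zfill(2)}"


print(rebuild_date("1/5/2019"))             # 2019-01-05
print(rebuild_date("12/25/2019 08:30:00"))  # 2019-12-25
```

Splitting on the separator avoids the fixed SUBSTRING offsets, so the same code handles both padded and unpadded values.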

The main problem when importing data from Excel worksheets is that each column in Excel can hold multiple data types or formats, so the same column can contain dates, numbers, and text, or dates in different formats (some formats cannot be converted implicitly to dates in SSIS).
If all date values are stored as dates (not text), the best practice for importing dates from an Excel worksheet is to convert the date column to the number format "0.000000000" (the underlying value is called the serial date-time), either in Excel itself or programmatically using a library like Microsoft.Office.Interop.Excel.
You can refer to this Link but use the following:
xlCells.NumberFormat = "0.0000000"
Then, in the SSIS package, use a Script Component to convert the number back to a date using the DateTime.FromOADate() function.
Assuming that inColumn is the date column with a numeric type, add an output column outColumn of type DT_DBTIMESTAMP or DT_DATE and use the following code:
If Not Row.inColumn_IsNull Then
    Row.OutColumn = DateTime.FromOADate(CDbl(Row.inColumn))
Else
    Row.OutColumn_IsNull = True
End If
Note: Converting the column to a number format discards the display formatting, but the underlying date value is preserved.
To read more about DateTimes in Excel you can refer to this Link
To read more about Date time formats that can be implicitly converted to date in SSIS follow SSIS Source Format Implicit Conversion for Datetime
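To see what the serial-number conversion does, here is a rough Python equivalent of the FromOADate step for non-negative values. The name from_excel_serial is hypothetical. The sketch assumes the 1900 date system and serials of 61 or more (earlier serials are affected by Excel's fictitious 1900-02-29).

```python
from datetime import datetime, timedelta

# Day 0 of the OLE Automation date scale used by Excel serial numbers.
EXCEL_EPOCH = datetime(1899, 12, 30)


def from_excel_serial(serial):
    """Convert a non-negative Excel serial number to a datetime; None stays None."""
    if serial is None:
        return None
    # Whole days are the integer part, time of day is the fraction.
    return EXCEL_EPOCH + timedelta(days=float(serial))


print(from_excel_serial(42507.5))  # 2016-05-17 12:00:00
```

The None branch plays the same role as the _IsNull check in the script component code.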

Related

Using SSIS and T-SQL to convert date to dd.mm.yyyy

When I use T-SQL to convert a datetime into dd.mm.yyyy for a CSV output using SSIS, the file is produced with dd-mm-yyyy hh:mm:ss, which is not what I need.
I am using:
convert(varchar,dbo.[RE-TENANCY].[TNCY-START],104)
which appears correct in SSMS.
Which is the best way to handle the conversion to be output from SSIS?
Not as simple as I thought it would be.
It works for me.
Using your query as a framework for driving the package
SELECT
CONVERT(char(10),CURRENT_TIMESTAMP,104) AS DayMonthYearDate
I explicitly declared a length for our dd.mm.yyyy value and since it's always going to be 10 characters, let's use a data type that reflects that.
Run the query, you can see it correctly produces 13.02.2019
In SSIS, I added an OLE DB Source to the data flow and pasted in my query
I wired up a flat file destination and ran the package. As expected, the string generated by the query entered the data flow and landed in the output file unchanged.
If you're seeing something different, the first thing I'd check is the metadata: double-click the line between your source and the next component and choose Metadata. Look at what is reported for the tenancy start column. If it doesn't indicate DT_STR/DT_WSTR, then SSIS thinks the column is a date type and is applying locale-specific rules to the format. You might also need to check how the column is defined in the flat file connection manager.
The most precise control over the output format of the date can be achieved with T-SQL FORMAT(), available since SQL Server 2012.
It is slower than CONVERT(), which can matter on large result sets, but gives the desired flexibility.
An example:
SELECT TOP 4
name,
FORMAT(create_date, 'dd.MM.yyyy') AS create_date
FROM sys.databases;
name    create_date
------  -----------
master  08.04.2003
tempdb  12.02.2019
model   08.04.2003
msdb    30.04.2016
P.S. Take into account that FORMAT() produces NVARCHAR output, which differs from your initial conversion logic, so an extra cast to VARCHAR(10) may be necessary.
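Both answers rely on the same idea: turn the date into a fixed-width string before it reaches the file writer, so that no locale rules get applied later. A small Python illustration of that idea, with a made-up function name:

```python
from datetime import datetime


def to_day_month_year(value: datetime) -> str:
    """Format a datetime as a 10-character dd.MM.yyyy string, dropping the time."""
    return value.strftime("%d.%m.%Y")


print(to_day_month_year(datetime(2019, 2, 13, 14, 5, 9)))  # 13.02.2019
```

Once the value is a plain string, downstream components have nothing left to reinterpret.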

Date format in Excel file to load to SQL Server

Our business would be providing us a .csv file. One of the columns in the file would be in date format. Now as we know there are many date formats in Excel. The problem is that we need to check whether the date provided is a correct date. It could be in any format like ddmmyyyy, yyyymmdd, dd-mon-yyyy etc basically any format that Excel supports.
We are planning to first load the data in a staging area and the date field would be defined as varchar so that it can accept any data.
Now either using SSIS or via T-SQL, I need to check whether the date provided is actually a date and if it is I need to load it into a different table in YYYYMMDD format.
How do I go about doing the above?
Assuming you have already loaded your Excel data into a SQL Server table as varchar (you can easily do this using SSIS), something like this would work:
SELECT
CASE WHEN ISDATE(YOUR_DATE) = 1 THEN CONVERT(char(8), CONVERT(datetime, YOUR_DATE), 112) ELSE NULL END AS MyDate
FROM
YOUR_TABLE
I don't have access to a SQL Server instance at the moment and can't test the code above, so you may need to adapt to your needs, but this is the general idea.
You can also do further research on ISDATE and CONVERT functions in SQL Server. You should be able to achieve what you need combining them together.
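If the incoming formats are known in advance, another way to think about the validation is to try each expected format in turn and keep the first one that parses. The Python sketch below illustrates that approach. The format list and the name normalize_date are assumptions for illustration, not something from the question. The order of the list decides how ambiguous values are resolved (for example, 20011201 parses as yyyymmdd before ddmmyyyy is tried).

```python
from datetime import datetime

# Hypothetical list of accepted input formats, tried in this order.
CANDIDATE_FORMATS = ("%Y%m%d", "%d%m%Y", "%d-%b-%Y", "%Y-%m-%d")


def normalize_date(text):
    """Return the value as a YYYYMMDD string, or None if no format matches."""
    if text is None:
        return None
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y%m%d")
        except ValueError:
            continue  # not this format (or not a real calendar date); try the next
    return None


print(normalize_date("13-Feb-2019"))  # 20190213
print(normalize_date("31022019"))     # None (there is no 31 February)
```

Rows that come back as None are the ones to reject or route to an error table.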

EXCEL datetime row changed to "42507" after import into SQL Server

I have a problem. I have imported this Excel sheet into SQL Server several times, and it worked fine before.
But suddenly there are 2 rows (datetime) with invalid data. In Excel, the datetime column has all been changed to 2016/12/12.
But when the data is imported into SQL Server, some values change to something like 42507, and I can't use them in DATEDIFF calculations.
I am quite confused by this. Can anyone help? Any ideas are greatly appreciated.
Thanks in advance.
Excel stores dates as numbers: the count of days since 1899-12-30, with any time of day as the fractional part. You can use =TEXT(A1,"yyyy-mm-dd hh:mm:ss") in Excel to store the text value for easy import, but if you already have the integers in SQL Server you can use DATEADD(day,yourDate,'1899-12-30') to convert them to proper dates.
.xlsx files (and .docx, .pptx, etc.) are just archives, and the meat of your documents is stored in XML files. You can change the extension to .zip and open the archive to explore how the data is actually stored. In most if not all cases, cell formatting doesn't affect the underlying values.
Make sure that field in Excel is set to date and then import the sheet.
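You can also convert the 5-digit values in place. Note that casting the number directly to datetime counts from 1900-01-01 and lands two days later than the date Excel shows, so add the days to Excel's base date instead:
UPDATE <yourTable>
SET <dateColumn> = DATEADD(day, CAST(<dateColumn> AS int), '1899-12-30')
WHERE LEN(<dateColumn>) = 5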
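The choice of base date matters here. Excel counts days from 1899-12-30, while SQL Server's datetime treats 1900-01-01 as day zero, so the same number maps to two different dates. A quick arithmetic check in Python, using the 42507 value from the question:

```python
from datetime import date, timedelta

serial = 42507  # value from the question

excel_date = date(1899, 12, 30) + timedelta(days=serial)  # Excel's base date
naive_cast = date(1900, 1, 1) + timedelta(days=serial)    # SQL Server datetime day zero

print(excel_date)                      # 2016-05-17
print(naive_cast)                      # 2016-05-19
print((naive_cast - excel_date).days)  # 2
```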

SQL Server 2014 SSIS Excel Source Task import to table date and text to numeric conversion issues

Using SQL Server 2014 SSIS to import an vendor supplied Excel file through the Excel Source Data Flow. Two issues I'm having related to data conversion to the SQL table.
In the file is a text column that has prices (numeric values) in it. I cannot get it to transform into a numeric field (decimal(8,2)) in SQL. I have used the Data Conversion data flow task converting it to DT_NUMERIC and it fails to process the field. I have also tried to let it go through the Data Conversion task and converted it through a Derived Column casting the field to Numeric. Both fail. I'm at a loss as to how to get this into the database in a Decimal/Numeric format.
In the same file are three date fields with dates that look like 07/18/2015 in Excel. I have tried similarly with the Data Conversion and Derived Column to get the dates into the database as SQL date formats. I have cast the dates as DT_DBDATE and DT_DBTIMESTAMP and neither has worked. I have also tried taking the month, day, and year and rearranging them into the SQL date format with Substring/Left/Right functions to split the string. Also to no avail.
Here is what I tried:
Excel Source ---> Data Conversion ----> Derived Column -----> OLE DB Destination
In the excel source it recognized the date as text, I leave that be in the data conversion to deal with it in the Derived Column where I have tried.
a. (DT_DBDATE)("20" + RIGHT(TRIM(sale_start),2) + "-" + LEFT(TRIM(sale_start),2) + "-" + SUBSTRING(TRIM(sale_start),4,2)) - I have done this with and without the trim with same results. I have also used Right(sale_start,4).
b. (DT_DBDATE) sale_start
The SQL table is data type DATE. I have also changed it to DATETIME and used DT_DBTIMESTAMP in place of DT_DBDATE above.
I can't change the file I'm receiving it needs to process into the database the way it comes from the vendor. Looking at the data in excel there seems to be no reason it wouldn't be ok.
Any direction on bringing this data in would be much appreciated.
I was able to figure this out, although I don't completely understand what the connection setting is doing. With an XML file this connection setting wasn't necessary, although some version of the derived column was; I used the above, I believe, in my XML import.
For the EXCEL Solution:
1) In the Excel File connection I added IMEX=1 to the end of the connection under properties. So the connection string looked like this:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\SSIS\Test.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
2) Used the following script in the derived column:
ISNULL([Copy of expected_date]) ? NULL(DT_DATE) : (LEN(TRIM([Copy of expected_date])) == 0 ? NULL(DT_DATE) : (DT_DATE)((DT_DBDATE)TRIM([Copy of expected_date])))
Thanks for taking the time to respond.

Import into SQL Server 2003 from a CSV file Error

I've spent so many hours just trying to import CSV, Excel data into SQL Server 2003 using SSMS (2012). I've tried importing as an excel, as a CSV, and a text file all options have presented problems of their own.
The biggest frustration now is when importing a CSV under the Flatfile option. In the Advanced tab I've set my source date column to have [DT_DBTIMESTAMP] this matches my destination's date column's type [DT_DBTIMESTAMP] YET despite all this when I run my import SQL Server errors out and says
• Error 0xc02020a1: Data Flow Task 1: Data conversion failed. The data conversion for column returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
(SQL Server Import and Export Wizard).
How can this fail if BOTH columns are exactly the same type?
Thanks in advance for the help.
SQL Server may not be complaining about losing data when writing to the destination but when reading the source. For instance, the DT_DBTIMESTAMP type expects the format yyyy-mm-dd hh:mm:ss[.fff]. If your data does not match that format in any position (say your mm > 12 or dd > 31), that may be the problem. Like @Pondlife suggested, I'd bring everything into a varchar(max) field and then run T-SQL to see whether all rows can be converted to the DBTIMESTAMP data type with a simple CONVERT statement.
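The "stage as text, then find the rows that do not convert" idea can be tried on a sample of the file before touching the database. A minimal Python sketch, assuming the values have already been read into a list of strings (find_bad_rows is a hypothetical helper name):

```python
from datetime import datetime


def find_bad_rows(values, fmt="%Y-%m-%d %H:%M:%S"):
    """Return (row_number, value) pairs that do not parse under fmt."""
    bad = []
    for row_number, value in enumerate(values, start=1):
        try:
            datetime.strptime(value.strip(), fmt)
        except (ValueError, AttributeError):  # AttributeError covers None values
            bad.append((row_number, value))
    return bad


sample = ["2013-12-25 10:00:00", "25/12/2013", "2013-13-01 00:00:00", None]
print(find_bad_rows(sample))
# [(2, '25/12/2013'), (3, '2013-13-01 00:00:00'), (4, None)]
```

Seeing the offending values usually makes it obvious whether the problem is the format, an out-of-range month or day, or empty cells.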
I have seen many problems with the SQL Server Import Wizard (I'm on version 2008 R2) reading date fields from CSV files when the format is Latin (day comes before the month), like 25/12/2013 (Christmas). It looks like it always assumes MM/DD/YYYY, and there is no clear way to tell it to read DD/MM/YYYY (or I did not find one). If I solve it, I will post. Thanks.

Resources