I have an SSIS package; in the data flow, a script component acts as the data source, writing variable values to an Excel destination.
With ScriptOutputBuffer
.AddRow()
.scriptRowID = 2
.Filename = Variables.foundFile
.RunTime = Now
.Status = "found"
End With
Unfortunately the dates are coming out in the following format :-
7/2/2010 8:22:46 AM
The LocaleID on the script component and the Excel Destination Data Flow component are all set to English (United Kingdom)
Any help, or other things to check would be much appreciated.
EDIT: The script component was outputting the date in the DT_Date format. I changed it to DT_DBTIMESTAMP format and I am now getting the dates in the following format in Excel :-
2010-07-02 09:15:44.662000000
better, but still a little unfriendly when read by humans.
I think this needs to be solved in Excel - typically it stores datetimes in a locale-independent form (just like SQL Server) and displays them according to Excel preferences or the formatting set on the cell itself.
You could format it in the source and put it in a string data type - you would still require Excel to interpret your months and days the right way around...
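As a sketch of formatting the date in the source instead, so the value arrives as a string (the table and column names below are placeholders, not from the question):

```sql
-- Return the date pre-formatted as a dd/mm/yyyy string,
-- so Excel receives text rather than a datetime value.
SELECT CONVERT(varchar(20), f.FoundDate, 103) AS FoundDate  -- style 103 = dd/mm/yyyy
FROM dbo.FoundFiles AS f;
```

You would still be trusting whoever opens the file to read day and month in the intended order.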
Every time that I try to import an Excel file into SQL Server I'm getting a particular error. When I try to edit the mappings the default value for all numerical fields is float. None of the fields in my table have decimals in them and they aren't a money data type. They're only 8 digit numbers. However, since I don't want my primary key stored as a float when it's an int, how can I fix this? It gives me a truncation error of some sort, I'll post a screen cap if needed. Is this a common problem?
It should be noted that I cannot import Excel 2007 files (I think I've found the remedy to this), but even when I try to import .xls files every value that contains numerals is automatically imported as a float and when I try to change it I get an error.
http://imgur.com/4204g
SSIS doesn't implicitly convert data types, so you need to do it explicitly. The Excel connection manager can only handle a few data types and it tries to make a best guess based on the first few rows of the file. This is fully documented in the SSIS documentation.
You have several options:
Change your destination data type to float
Load to a 'staging' table with data type float using the Import Wizard and then INSERT into the real destination table using CAST or CONVERT to convert the data
Create an SSIS package and use the Data Conversion transformation to convert the data
You might also want to note the comments in the Import Wizard documentation about data type mappings.
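The staging-table option above might look like this sketch (the staging and destination table names are assumptions for illustration):

```sql
-- Let the Import Wizard load the Excel data into a float-typed staging
-- table, then convert explicitly while moving it to the real table.
INSERT INTO dbo.Customers (CustomerId, AccountCode)
SELECT CAST(CustomerId AS int),
       CAST(AccountCode AS int)
FROM staging.CustomersRaw;
```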
Going off of what Derloopkat said, which still can fail on conversion (no offense Derloopkat) because Excel is terrible at this:
Paste from Excel into Notepad and save it as a normal .txt file.
From within Excel, open that .txt file; the Text Import Wizard starts.
Click Next, since the file is tab-delimited.
Select "none" for the text qualifier, then Next again.
Select the first column, hold Shift, select the last column, and choose the Text radio button. Click Finish.
The file will open; check it to make sure it's accurate, and then save it as an Excel file.
There is a workaround.
Import the Excel sheet with the numbers as float (the default).
After importing, go to the table's design view.
Change the column's data type from Float to Int or Bigint.
Save the changes.
Change the column's data type from Bigint to any text type (Varchar, Nvarchar, Text, Ntext, etc.).
Save the changes.
That's it.
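The same two-step type change could be sketched in T-SQL (table and column names are placeholders):

```sql
-- Step 1: float -> bigint drops the floating-point representation
ALTER TABLE dbo.ImportedSheet ALTER COLUMN AccountNumber bigint;
-- Step 2: bigint -> varchar, if a text type is the final goal
ALTER TABLE dbo.ImportedSheet ALTER COLUMN AccountNumber varchar(20);
```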
When Excel finds mixed data types in the same column, it guesses the right format for the column (the majority of the values determines the column's type) and discards all the other values by inserting NULLs. And it does this badly: if a column is considered text and Excel finds a number, it decides the number is a mistake and inserts a NULL instead; and if some cells containing numbers are formatted as text, you may end up with NULL values in an integer column of the database.
Solution:
Create a new excel sheet with the name of the columns in the first row
Format the columns as text
Paste the rows without formatting (use CSV format, or copy/paste via Notepad to get plain text only)
Note that formatting the columns on an existing Excel sheet is not enough.
There seems to be a really easy solution when dealing with data type issues.
Basically, at the end of the Excel connection string's Extended Properties, add IMEX=1:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\YOURSERVER\shared\Client Projects\FOLDER\Data\FILE.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
This will resolve data type issues such as columns where values are mixed with text and numbers.
To get to the connection property, right-click the Excel connection manager below the control flow and hit Properties; it will appear to the right, under Solution Explorer. Hope that helps.
To avoid float type field in a simple way:
Open your Excel sheet.
Insert a blank row after the header row and type any text in all of its cells.
Right-click the heading of each column that causes a float issue, select Format Cells, choose the Text category, and press OK.
Then export the Excel sheet to your SQL Server.
This simple approach worked for me.
A workaround to consider in a pinch:
save a copy of the excel file, modify the column to format type 'text'
copy the column values and paste to a text editor, save the file (call it tmp.txt).
modify the data in the text file so each value starts and ends with a character that the SQL Server import mechanism will recognize as text. If you have a fancy editor, use its built-in tools; I use awk in Cygwin on my Windows laptop. For example, to wrap each column value in single quotes: "$ awk '{print "\x27"$1"\x27"}' ./tmp.txt > ./tmp2.txt"
copy and paste the data from tmp2.txt over top of the necessary column in the excel file, and save the excel file
run the sql server import for your modified excel file... be sure to double check the data type chosen by the importer is not numeric... if it is, repeat the above steps with a different set of characters
The data in the database will have the quotes once the import is done... you can update the data later on to remove the quotes, or use the "replace" function in your read query, such as "replace([dbo].[MyTable].[MyColumn], '''', '')"
I have an annoying issue working with SQL Server DATETIME objects in Excel 2013. The problem has been stated several times here in SO, and I know that the work around is to just reformat the DATETIME objects in Excel by doing this:
Right click the cell
Choose Format Cells
Choose Custom
In the Type: input field enter yyyy-mm-dd hh:mm:ss.000
This works fine BUT I loathe having to do this every time. Is there a permanent work around to this aside from creating macros? I need to maintain the granularity of the DATETIME object so I cannot use a SMALLDATETIME. I am currently using Microsoft SQL Server Management Studio 2008 r2 on a win7 machine.
Thanks in advance.
-Stelio K.
Without any code it's hard to guess how the data gets from SQL Server to Excel. I assume it's not through a data connection, because Excel wouldn't have any issues displaying the data as dates directly.
What about data connections?
Excel doesn't support any kind of formatting (or any useful designer, for that matter) when working with data connections alone. That functionality is provided by Power Query or the PivotTable designer. Power Query is integrated into Excel 2016 and available as a download for Excel 2010 and later.
Why you need to format dates
Excel doesn't preserve type information. Everything is a string or number and its display is governed by the cell's format.
Dates are stored as decimals using the OLE Automation format - the integral part is the number of days since December 30, 1899, and the fractional part is the time of day. This is why System.DateTime has the FromOADate and ToOADate methods.
To create an Excel sheet with dates, you should set the cell format at the same time you generate the cell.
How to format cells
Doing this is relatively easy if you use the Open XML SDK or a library like EPPlus. The following example creates an Excel sheet from a list of customers:
using System;
using System.IO;
using OfficeOpenXml; // EPPlus

// Minimal Customer type assumed for the example
record Customer(string Name, DateTime Joined);

static void Main(string[] args)
{
    var customers = new[]
    {
        new Customer("A", DateTime.Now),
        new Customer("B", DateTime.Today.AddDays(-1))
    };
    File.Delete("customers.xlsx");
    var newFile = new FileInfo(@"customers.xlsx");
    using (ExcelPackage pck = new ExcelPackage(newFile))
    {
        var ws = pck.Workbook.Worksheets.Add("Content");
        // This format string *is* affected by the user locale!
        // and so is "mm-dd-yy"!
        ws.Column(2).Style.Numberformat.Format = "m/d/yy h:mm";
        // That's all it takes to load the data
        ws.Cells.LoadFromCollection(customers, true);
        pck.Save();
    }
}
The code uses the LoadFromCollection method to load the list of customers directly, without dealing with individual cells. The true argument means a header row is generated.
There are equivalent methods to load data from other sources: LoadFromDataTable, LoadFromDataReader, LoadFromText for CSV data, and even LoadFromArrays for jagged object arrays.
The weird thing is that specifying the m/d/yy h:mm or mm-dd-yy format uses the user's locale for formatting, not the US format! That's because these formats are built into Excel and treated as locale-dependent formats. In Excel's list of date formats they are shown with an asterisk, meaning they are affected by the user's locale.
The reason for this weirdness is that when Excel moved to the XML-based XLSX format 10 years ago, it preserved the quirks of the older XLS format for backward-compatibility reasons.
When EPPlus saves the xlsx file it detects them and stores a reference to the built-in format ID (22 and 14 respectively) instead of storing the entire format string.
Finding Format IDs
The list of standard format IDs is shown in the NumberingFormat element documentation page of the Open XML standard. Excel originally defined IDs 0 (General) through 49.
EPPlus doesn't allow setting the ID directly. It checks the format string and maps only formats 0-49, as shown in the GetBfromBuildIdFromFormat method of ExcelNumberFormat. In order to get ID 22, we need to set the Format property to "m/d/yy h:mm".
Another trick is to check the stylesheets of an existing sheet. xlsx is a zipped package of XML files that can be opened with any decompression utility. The styles are stored in the xl\styles.xml file.
I am calling a stored procedure from a data flow task in SSIS in which I am selecting the HOUR datepart of a datetime field. (code below from the stored procedure)
SELECT
DATEPART (HOUR, R.received_date) AS [Hour] ,
CONVERT (VARCHAR(10), R.received_date, 101) AS [Date] ,
COUNT (R.id) AS [NumberofFilings]
And in my data flow task, I have a OLE DB Source task in which I call the stored procedure:
And when I preview the data with the OLE DB source task, the data looks like I would expect - with the Hour column displaying an integer between 0 and 23:
The issue occurs after I export the results to a CSV file and the hour becomes a datetime field where the values become '1/11/1900 0:00' which is not what I'm expecting.
In my flat file destination connection manager, I set the Hour properties to be four-byte signed integer but the hour will not display as an integer but as a datetime.
I've tried other data types for the Hour column but nothing will convert it to a plain integer / character value. Any suggestions?
If you are opening the .csv file in Excel, I suspect that Excel is looking at a column named "Hour" and thinking, "Must be a datetime field. I'll just help my user out and make it so."
Try opening the .csv file in notepad and see what the actual contents look like.
EDIT:
I am unable to reproduce your results. When I follow your steps I get a CSV file that looks like this in notepad:
"Col1","Col2"
"0","04/05/2016"
"0","04/02/2016"
"0","04/01/2016"
...
You must be doing something that you are not including in your description of the issue.
Or maybe your package has gotten corrupted. You could try re-building it from scratch to eliminate that possibility.
But I have tested and proved that what you are trying to do should work.
I am exporting a file that is going to be picked up by another system. To avoid rework in the other system, I am trying to match an existing Excel CSV output exactly. I have a date column in the DB which I want to export as dd/mm/yyyy. In the data flow task I have the following SQL as the source, where I do the appropriate conversions. If I run this query in SSMS I get the right output.
SELECT [Code]
,[Agency_Name]
,[Region_Group]
,CONVERT( varchar(20), [GrossAmtYrly] , 1) GrossAmtYrly
,CONVERT ( varchar(20), [SaleDate] , 103) SaleDate
,[MemberNo]
,[Surname]
,[Scale]
FROM [Land].[Sales]
I then link this to a flat file destination; the column it is mapped to is set to DT_STR, width 20, not text qualified.
But the output file is spitting out a date in format yyyy-mm-dd.
Similarly for GrossAmtYrly, the old Excel-generated CSV had the amount with commas after every 3 digits, wrapped in quotes. The output column it is mapped to is DT_STR, width 20, with text qualified set to true.
The output file for that column is missing the commas for grossamtyrly.
So it seems like my conversions in the SQL are being ignored completely, but I can't work out why.
Thanks in advance for any suggestions!
Using SSIS 2012 - Visual Basic 2010, DB is SQL Server 2012
I'd use a Derived Column in the data flow to convert it to the format you want. If it's coming in as a text field in the format yyyy-mm-dd, you can convert it to dd/mm/yyyy with the following expression:
SUBSTRING(dt,9,2) + "/" + SUBSTRING(dt,6,2) + "/" + SUBSTRING(dt,1,4)
Thanks Custodian, I figured out how to get it to work.
I double-clicked the flow arrow between the tasks, and the metadata tab shows the data type of each column. When I first set this up I used the table-or-view access mode, so the date and gross amount columns were set to DT_DATE and DT_CY, and I think SSIS was implicitly converting the columns back to their original types.
I couldn't work out how to change them, so I deleted the DB source and recreated it starting with the SQL command option, and everything works as expected.
I'm using the Import/Export Wizard to import some data to a table. Total of 2 rows, so I've just been working around this, but I would like to know the answer.
The issue with the Import/Export is the dates. No matter what I do, they fail. The date looks pretty straightforward to me: 2009-12-05 11:40:00. I also tried: 2010-03-01 12:00 PM. Tried DT_DATE and DT_DBTIMESTAMP as a source data type. The target column type is datetime.
The message that I get is:
The data conversion for column "Start_Date" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
How do I fix this? Why's the Import/Export Wizard so bad at parsing dates (or is that in my imagination)?
The truly obnoxious thing here is that when you select a date column from a table and save it as a CSV you get a date like '2009-12-05 11:40 AM'. So the import wizard isn't even capable of parsing dates that come from SQL Server. Really? Really?
Added details (realized my description wasn't correct after revisiting the package I had issues with):
The import thing IS pretty bad.
In my case I had incoming data in a form matching SQL Server's convert style 126 / ISO 8601. That is, in T-SQL, this form:
select convert ( varchar(100), getdate(), 126 )
--> 2009-12-22T16:29:22.123
I was able to import with SSIS using two steps:
Replace the "T" with a space " ", using SSIS Derived Column with expression:
REPLACE(DateColumn,"T"," ")
Cast the result to database timestamp [DT_DBTIMESTAMP] using the data conversion transform
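If you control the source query instead, the same parsing can be sketched directly in T-SQL; style 126 tells CONVERT to expect the ISO 8601 form with the "T" separator:

```sql
-- Parse an ISO 8601 string (with the 'T' separator) straight to a datetime
SELECT CONVERT(datetime, '2009-12-22T16:29:22.123', 126) AS ParsedDate;
```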
Apologies if I caused any confusion.