Visual Studio 2015; Source: OLE DB (SQL Server 2012); Destination: Flat File (CSV)
I need to pull data from stored procedures that I don't control and that output float data types. In SSMS the data looks something like 3.45985, but in the SSIS output CSV file it comes out like 3.45989999999 (the numbers are made up, but you get the point).
I do understand how float works, and I understand that float is not a precise data type, but since I'm not in control of the stored procedures, I'm struggling to find a workaround. I've read that you can set the output to string in the flat file destination, but that doesn't seem to be working. I've also tried inserting the stored procedure results into a table variable, and that let me format the data the way I need. However, some of the stored procedures already insert into a temp table, and when I try to store their results in a table variable I get an error, so that solution really isn't working for me.
My business users are used to seeing values like 3.45985 in a legacy application. I'm rewriting these extracts to run via SSIS so they are automated and we no longer have to support the legacy line-of-business application.
You can use a Derived Column Transformation to round the number and cast it to a decimal, with an expression such as:
(DT_DECIMAL, 5) ROUND([ColumnName], 5)
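For illustration, here is the same round-and-cast logic sketched in T-SQL, using the made-up value from the question:

-- Round the float to 5 decimal places, then cast to decimal so the
-- textual output is fixed at 5 digits after the decimal point.
SELECT CAST(ROUND(3.45989999999e0, 5) AS decimal(18, 5)) AS RoundedValue;
-- Returns 3.45990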
I have some Excel data that contains numbers in scientific notation, like 5e+00.
When I view the value in Excel by clicking the edit button I can see the full value, but when I import the data into a table it is loaded as Null. I need to import the data without making any changes in Excel. Please suggest how to do this in SSIS.
I tried importing after changing the format on the Excel side, but I want this handled at the SSIS level without any changes in Excel.
The data in my column looks like this:
Amounts
15880
5e+19
57892
I expect the output to be as follows:
1588007
500000000019
57892
But I am getting a Null value for the second item.
Please suggest.
There are two problems in the question above:
Numbers are shown in scientific format
Data is replaced by Null values while importing
Scientific Format issue
You mentioned that:
I tried importing after changing the format on the Excel side, but I want this handled at the SSIS level without any changes in Excel.
Unfortunately, this cannot be done without changing the Excel file, since the only way to solve the issue is to change the Number Format property of the cells. Instead of doing that manually in Excel, you can automate the step with a Script Task that uses the Microsoft.Office.Interop.Excel.dll assembly.
You can refer to the following post as an example:
Format excel destination column in ssis script task
But make sure to use:
m_XlWrkSheet.Columns(1).NumberFormat = "0"
to force a numeric format.
Null Values issue
This issue is caused by the OLE DB provider used to read Excel files. It occurs when an Excel column contains mixed data types: the provider reads the values using the dominant data type and replaces all the other values with Nulls.
You can refer to the following links for more information/workarounds:
Importing Excel Data Seems to Randomly Give Null Values
SQL JOIN on varchar with special characters and leading zeros
Dynamically Creating Excel table through SSIS
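As a further hedged sketch (my addition, not taken from the links above): the workaround these posts usually land on is setting IMEX=1 in the provider's extended properties, so mixed-type columns are read as text. You can test it from T-SQL with OPENROWSET (file path and sheet name are hypothetical, and 'Ad Hoc Distributed Queries' must be enabled on the server):

-- IMEX=1 makes the ACE provider treat mixed-type columns as text
-- instead of replacing minority-type values with Nulls.
SELECT *
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                'Excel 12.0;Database=C:\Data\Amounts.xlsx;HDR=YES;IMEX=1',
                'SELECT * FROM [Sheet1$]');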
When I use T-SQL to convert a datetime into dd.mm.yyyy for a CSV output using SSIS, the file is produced with dd-mm-yyyy hh:mm:ss, which is not what I need.
I am using:
convert(varchar,dbo.[RE-TENANCY].[TNCY-START],104)
which appears correct in SSMS.
What is the best way to handle the conversion for output from SSIS?
Not as simple as I thought it would be.
It works for me.
Using your query as a framework for driving the package:
SELECT
CONVERT(char(10),CURRENT_TIMESTAMP,104) AS DayMonthYearDate
I explicitly declared a length for our dd.mm.yyyy value, and since it's always going to be 10 characters, let's use a data type that reflects that.
Run the query and you can see it correctly produces 13.02.2019.
In SSIS, I added an OLE DB Source to the data flow and pasted in my query.
I wired up a flat file destination and ran the package. As expected, the string generated by the query entered the data flow and landed in the output file.
If you're experiencing otherwise, the first place I'd check is the metadata: double-click the path between your source and the next component and choose Metadata. Look at what is reported for the tenancy start column. If it doesn't indicate DT_STR/DT_WSTR, then SSIS thinks the data type is a date variant and is applying locale-specific rules to the format. You might also need to check how the column is defined in the flat file connection manager.
The most precise control over the output format of a date can be achieved with T-SQL's FORMAT(), available since SQL Server 2012.
It is slightly slower than CONVERT() but gives the desired flexibility.
An example:
SELECT TOP 4
name,
FORMAT(create_date, 'dd.MM.yyyy') AS create_date
FROM sys.databases;
name     create_date
-------  -----------
master   08.04.2003
tempdb   12.02.2019
model    08.04.2003
msdb     30.04.2016
P.S. Take into account that FORMAT() produces NVARCHAR output, which differs from your initial CONVERT logic, so an extra cast to VARCHAR(10) may be necessary.
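For example, a minimal sketch of adding that cast to the query above:

SELECT TOP 4
    name,
    CONVERT(varchar(10), FORMAT(create_date, 'dd.MM.yyyy')) AS create_date
FROM sys.databases;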
This may be a stupid question but I must ask since I see it a lot... I have inherited quite a few packages in which developers use the Data Conversion transformation when dumping flat files into their respective SQL Server tables. This is pretty straightforward; however, I always wonder why the developer wouldn't just specify the correct data types within the flat file connection and then do a straight load into the table.
For example:
Typically I will see flat file connections with columns that are DT_STR and are then converted into the correct type within the package, e.g. DT_STR of length 50 to DT_I4. However, if the staging table and the flat file are based on the same schema, why wouldn't you just specify the correct types (DT_I4) in the flat file connection? Is there any added benefit (performance, error handling) to using the Data Conversion transformation that I am not aware of?
This is a good question with no single right answer. Here is the strategy that I use:
If the data source is unreliable
That is, sometimes int or date values arrive as strings, like when you have the literal word 'null' instead of the value being blank. In that case I would let the data source be treated as strings and deal with converting the data downstream.
This could mean just staging the data in a table, using the database to do the conversions, and loading from there. This pattern avoids the source component throwing errors, which are always tricky to troubleshoot. It also avoids having to add error handling to Data Conversion components.
Instead, if the database throws a conversion error, you can easily look at the data in your staging table to examine the problem. Lastly, SQL is much more forgiving with date conversions than SSIS.
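A minimal sketch of that staging pattern (table and column names are hypothetical); TRY_CONVERT, available since SQL Server 2012, returns NULL instead of raising an error for unconvertible values:

-- Everything lands in the staging table as strings; conversion happens
-- here, where bad values like the literal word 'null' become NULL.
SELECT TRY_CONVERT(int,  AmountText)    AS Amount,
       TRY_CONVERT(date, OrderDateText) AS OrderDate
FROM dbo.StagingTable;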
If the data source is reliable
If the dates and numbers are always dates and numbers, I would define the datatypes in the connection manager. This makes it clear what you are expecting from the file and makes the package easier to maintain with fewer components.
Additionally, if you go to the advanced properties of the flat file source, integers and dates can be set to fast parse, which will speed up read time: https://msdn.microsoft.com/en-us/library/8893ea9d-634c-4309-b52c-6337222dcb39
When I use data conversion
I rarely use the Data Conversion component, but one case where I find it useful is converting to/from Unicode. This can be necessary when reading from an ADO.NET source, which always treats the input as Unicode, for example.
You can change the output data type in the flat file connection manager on the Advanced page, or right-click the source in the data flow and use the Advanced Editor to change the data type before loading it.
I think one benefit is that the conversion transformation lets you output an extra column, usually named Copy of ..., and in some cases you might use both of the two columns. Also, when you load data from an Excel source everything comes in as Unicode, so you need a Data Conversion to do the conversion, etc.
Also, just FYI, you can use a Derived Column transformation to convert the data type as well.
UPDATE [needs further confirmation]:
In the flat file connection manager, the maximum length of the string type is 255, while in the Data Conversion transformation it can be set above 255.
I am building a Data Flow Task in an SSIS package that pulls data in from an OLE DB Source (MS Access table), converts data types through a Data Conversion Transformation, then routes that data to an OLE DB Destination (SQL Server table).
I have a number of BIT columns for flag variables in the destination table and am having trouble with truncation when converting these 1/0 columns to (DT_BYTES,1). Converting from DT_WSTR and DT_I4 to (DT_BYTES,1) results in the same truncation, and I have verified that it is happening at that step through the Data Viewer.
It appears that I need to create a derived column similar to what is described in the answers to the question linked below, but instead of converting to DT_BOOL, I need to convert to (DT_BYTES,1), as casting from DT_BOOL to DT_BYTES is apparently illegal?
SSIS Converting a char to a boolean/bit
I have made several attempts at creating a derived column with variations of the logic below, but haven’t had any luck. I am guessing that I need to use Hex literals in the “1 : 0” portion of the expression, but I haven’t been able to find valid syntax for that:
(DT_BYTES,1)([Variable_Name] == (DT_I4)1 ? 1 : 0)
Am I approaching this incorrectly? I can’t be the first person to need to insert BIT data into a SQL Server table, and the process above just seems unnecessarily complex to me.
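One hedged observation that may simplify this: SQL Server's BIT type surfaces in SSIS as DT_BOOL, not DT_BYTES, so an alternative is to skip the byte cast entirely and let T-SQL produce the BIT value from a staged column. A minimal sketch, with hypothetical table and column names:

-- The flag arrives in staging as an int (or string); CASE plus CAST
-- produces a proper BIT value for the destination column.
INSERT INTO dbo.TargetTable (FlagColumn)
SELECT CAST(CASE WHEN SourceFlag = 1 THEN 1 ELSE 0 END AS bit)
FROM dbo.StagingTable;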
I am uploading some data from DB2 to SQL Server 2005. The table contains one text field that is 2000 characters long in DB2 but is set up as a varchar(2000) in SQL Server.
When I run my insert in the query browser it processes fine and the data is copied correctly; however, when the same statement runs in my stored procedure, the data for this field alone is copied incorrectly and shows up as a series of strange characters.
The field can occasionally contain control characters, but the problem occurs even when it doesn't.
Is there a setting I need to apply to my stored procedure to get the insert to work correctly?
Or do I need to use a cast or convert on the data to get it to appear correctly?
Thanks.
Update: Thanks for your suggestions. It now appears that the problem was caused by the software we used to create the linked server to the DB2 database. It could not handle a 2000-character field when run via a stored procedure, but could when run interactively.
I ultimately got around the problem by splitting the field into ten 200-character fields for the upload and then re-joining them once they were in the SQL database.
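A minimal sketch of that workaround (linked server, schema, table, and column names are all hypothetical); the splice is pushed down to DB2 via OPENQUERY so the provider never sees the full 2000-character value:

-- Pull the 2000-character field as ten 200-character slices.
SELECT *
INTO dbo.StagingTable
FROM OPENQUERY(DB2LINK, '
    SELECT SUBSTR(LONGFIELD, 1, 200)    AS PART01,
           SUBSTR(LONGFIELD, 201, 200)  AS PART02,
           SUBSTR(LONGFIELD, 401, 200)  AS PART03,
           SUBSTR(LONGFIELD, 601, 200)  AS PART04,
           SUBSTR(LONGFIELD, 801, 200)  AS PART05,
           SUBSTR(LONGFIELD, 1001, 200) AS PART06,
           SUBSTR(LONGFIELD, 1201, 200) AS PART07,
           SUBSTR(LONGFIELD, 1401, 200) AS PART08,
           SUBSTR(LONGFIELD, 1601, 200) AS PART09,
           SUBSTR(LONGFIELD, 1801, 200) AS PART10
    FROM SCHEMA1.SOURCETABLE');

-- Re-join the slices once the data is on the SQL Server side.
SELECT PART01 + PART02 + PART03 + PART04 + PART05
     + PART06 + PART07 + PART08 + PART09 + PART10 AS LongField
FROM dbo.StagingTable;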
It sounds like the data coming from DB2 is in a different character set. You should make sure it isn't EBCDIC or something like that.
Also, try calling your stored procedure from SQL Server Management Studio (query browser) to see if it works at all.
You might want to change your varchar(2000) to nvarchar(2000), in the stored procedure as well as the table (I assume it exists as a parameter). This would allow them to hold two-byte characters. It'll depend on the DB2 configuration, but it may be that it's exporting UTF-16 (or similar) rather than UTF-8.
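A minimal sketch of that change (table and column names are hypothetical); the procedure's parameter would need the same change:

-- Widen the column to hold two-byte characters.
ALTER TABLE dbo.UploadTable ALTER COLUMN LongField nvarchar(2000);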