SSIS Data Conversion transformation

In SSIS I read a CSV file with a column containing values such as 1.25 or 2.50.
In the Data Conversion transformation I convert the column to decimal (DT_DECIMAL, scale 2).
In the destination table the column is defined as decimal(18,2).
The data is stored as 125.00 or 25.00 instead of 1.25 and 2.50.
What do I have to adjust?

Possible causes
(1) Data type mismatch
I think an issue like this is caused by a data type mismatch between source and destination, or between the Data Conversion output and the destination.
(2) Numeric separators
Another cause may be that the numeric values contain a comma as formatting, such as a thousands separator (1,000,000), alongside a period as the decimal separator (1.02).
Possible solutions
(1) Specify data type in source
If your source data is well formatted, there is no need for a Data Conversion transformation, which rules it out as the cause. In the Flat File Connection Manager editor, go to the Advanced tab, select the column that contains the decimal values, and change its data type (try DT_NUMERIC and DT_DECIMAL) along with its precision and scale properties.
If the issue still happens, make sure that both input and output have the same metadata (precision and scale).
(2) Derived Column
Alternatively, use a Derived Column transformation with an expression similar to:
(DT_NUMERIC,18,2)[COLUMN]
(3) Replace Separator using derived column
You can replace the separator using a derived column:
(DT_NUMERIC,18,2)REPLACE([COLUMN], ",", ".")
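The symptom in the question (1.25 arriving as 125.00) is what you would expect if the period is treated as a grouping separator and dropped instead of being read as the decimal point. The Python sketch below only illustrates that effect outside SSIS. `parse_number` and its parameters are hypothetical names, and it is an assumption that this is the mechanism at work in the package.

```python
from decimal import Decimal

def parse_number(text, decimal_sep, group_sep):
    """Parse text with the given separators (hypothetical helper, not an SSIS API)."""
    cleaned = text.replace(group_sep, "")        # grouping separators are simply dropped
    cleaned = cleaned.replace(decimal_sep, ".")  # normalise the decimal separator
    return Decimal(cleaned)

# File written with '.' as the decimal separator, read with matching settings.
matching = parse_number("1.25", decimal_sep=".", group_sep=",")

# Same text read as if ',' were the decimal and '.' the grouping separator.
mismatched = parse_number("1.25", decimal_sep=",", group_sep=".")

print(matching, mismatched)
```

If the mismatched reading reproduces your stored values, check which separators the source and the conversion step each assume.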

Related

Data Conversion text to numeric in SSIS is removing characters

I am facing a strange issue while using SSIS "Data Conversion component" to convert string to decimal datatype. I use SSIS 2016.
The source data has values of mixed data types (string, integer, decimal) and is defined as varchar in the flat file source. The expected target data type is numeric. When the explicit conversion from string to decimal happens, we expect the alphanumeric values to be rejected to the error table and only the numeric values to pass through.
Instead, we are seeing some alphanumeric values shedding the characters in the value and passing through successfully with no error.
Examples: Value "3,5" converted to 35
Value "11+" converted to 11
We do not have control over source data and will not be able to replace char data before passing data into Data conversion component.
We have tried the steps below as a workaround, and they worked:
First Data Conversion from DT_STR to DT_NUMERIC
Capture error rows that fail the above conversion
Second Data Conversion from DT_NUMERIC to DT_DECIMAL
But as the source data is not reliable, we may have to apply this workaround wherever there are numeric fields (int types and decimals), which is not a friendly solution.
So I am checking with you all to find out whether anyone has tried an easier and better solution.
I did not expect this result, but I tried an expression task and it worked for DT_DECIMAL:
(DT_DECIMAL,1)"11+" -- evaluates to 11.0
But it does not work for DT_NUMERIC. An expression task won't allow a result of type DT_NUMERIC directly, but a DT_NUMERIC cast can be nested inside a cast to DT_DECIMAL. To demonstrate: in an expression task even this "numerically valid" cast is not permitted, because the output simply can't be of type DT_NUMERIC:
(DT_NUMERIC, 3, 0)123
But this is permitted:
(DT_DECIMAL,0)((DT_NUMERIC, 3, 0)123)
So as long as you are happy to specify a precision and scale big enough to hold your data during the "validity" check done by DT_NUMERIC, and then cast it from there to DT_DECIMAL, all in a derived column transform, then DT_NUMERIC seems to enforce the strict semantics you want.
SSIS allows this:
(DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"11")
But not either of these:
(DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"11+")
(DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"3,5")
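To make the "strict semantics" concrete, here is a minimal Python sketch of the accept/reject behaviour described above, written outside SSIS purely as an illustration. The pattern and the function name `strict_decimal` are assumptions chosen for this example; they are not how SSIS implements DT_NUMERIC.

```python
import re
from decimal import Decimal

# Optional sign, digits, optional '.' followed by digits. Nothing else is accepted.
STRICT_NUMBER = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")

def strict_decimal(text):
    """Return a Decimal for a strictly numeric string, or None to signal rejection."""
    candidate = text.strip()
    if STRICT_NUMBER.fullmatch(candidate):
        return Decimal(candidate)
    return None  # the caller would route this row to the error output

for value in ["11", "11+", "3,5", "3.5"]:
    print(repr(value), "->", strict_decimal(value))
```

Values such as "11+" and "3,5" come back as None instead of being silently cleaned up, which is the behaviour the question asks for.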
@billinkc Sorry for not responding to you earlier.
We are working under some restrictions:
(1) All we want to do is capture datatype issues in input data, so we wanted to harness the capability of SSIS Data Conversion Component in SSIS.
(2) DBA doesn't want us to use SQL for type conversions, so we are required to do these conversions between flat file source and flat file destination using SSIS.
(3) We are required to capture the type conversion errors at every step of conversion into an error output file with error column name and error description, to be used later. So we cannot remove char data in the field before passing it to Data Conversion component.
@allmhuran - We have used a Derived Column task before the Data Conversion component to replace unnecessary characters in one of the other fields, but using the same approach for type conversion makes (3) difficult to achieve, because the error outputs of the Derived Column task and the Data Conversion component cannot be redirected to the same error output file.
We can completely ignore Data Conversion component and use only Derived column task to do all type conversions, whether single or nested. I am trying this and the error descriptions do not always look good, but the cons of the former method can be overcome. I will try this out!

Import from XML drops leading 0

I have an SSIS package that imports an XML file into SQL. The data of one particular field could be '112' or '039', for example. It is always three characters and gets padded with a leading 0 if only two.
The destination field in SQL is varchar. For some reason SSIS is changing it to DT_UI2, and in the case of '039' only '39' comes through.
I have added a data conversion that converts it to DT_WSTR, but this does not help.
Use a derived column with the following expression:
RIGHT("000" + (DT_WSTR,50)[Source Column],3)
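The padding expression works by prepending zeros and then keeping only the last three characters. The Python sketch below mirrors that logic for illustration; `pad_left` is a hypothetical helper name, not part of SSIS.

```python
def pad_left(value, width=3, fill="0"):
    """Mimic RIGHT("000" + value, 3): prepend fill characters, keep the last `width`."""
    text = str(value)
    return (fill * width + text)[-width:]

for raw in [39, 112, 7]:
    print(raw, "->", pad_left(raw))
```

Note that, like RIGHT, this keeps only the last `width` characters, so a value longer than three characters would be cut from the left.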
The XSD that was originally generated defined this field as unsigned short. Changing it to string and redoing the flow solved the problem.

Need Help Performing Derived Column Split in SSIS

So in my Source Table from Excel I have a Column called real/min/max that counts population and I want to split this into 3 columns called ActualPop, MinPop, MaxPop.
So an example would be
real/min/max
33/1/50
And I would need this to populate in the new Columns as
ActualPop
33
MinPop
1
MaxPop
50
I tried the following Expressions:
ActualPop: TOKEN([real/min/max],"/",1)
MinPop: TOKEN([real/min/max],"/",2)
MaxPop: TOKEN([real/min/max],"/",3)
The issue is that when I try to do my mapping to the SQL destination, I get an error about the data types. The destination has INT data types, while in the Derived Column Editor I see that the data types are Unicode string. I have tried to use the Data Conversion transformation, but that still does not work.
You can change the data type of the three derived columns using the Advanced Editor on the Derived Column, then select four-byte signed integer [DT_I4] for each of your INTs.
See: Changing Datatype in SSIS Derived column
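The overall transformation is a split on "/" followed by an integer conversion of each part. The Python sketch below shows that logic outside SSIS; `split_population` is a hypothetical name. Within a Derived Column, wrapping each TOKEN call in a cast such as (DT_I4)TOKEN([real/min/max],"/",1) may achieve the same result, though that is an assumption to verify in your own package.

```python
def split_population(raw):
    """Split 'real/min/max' text such as '33/1/50' into three integers."""
    actual, minimum, maximum = (int(part) for part in raw.split("/"))
    return {"ActualPop": actual, "MinPop": minimum, "MaxPop": maximum}

print(split_population("33/1/50"))
```

A value with a missing or non-numeric part raises ValueError here, which corresponds to a row you would want to send to an error output.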

SSIS - Converting DT_TEXT(Length 11,000 Characters) to DT_STR and trim to 1,000 characters

I want to read data from a text file (.csv), truncate one of the columns to 1,000 characters, and push it into a SQL table using an SSIS package.
The input (DT_TEXT) is 11,000 characters long, but my challenge is:
SSIS can convert to DT_STR only if the maximum length is 8,000 characters.
String operations cannot be performed on a stream (DT_TEXT data type).
I have a workaround now:
I truncate the text in the Flat File Source and select the option to ignore the error.
Please share if you find a better solution!
FYI:
To help anyone else that finds this, I applied a similar concept more generally in a data flow when consuming a text stream [DT_TEXT] in a Derived Column Transformation task to transform it to [DT_WSTR] type to my defined length. This more easily calls out the conversion taking place.
Expression: (DT_WSTR,1000)(DT_STR,1000,1252)myLargeTextColumn
Data Type: Unicode string [DT_WSTR]
Length: 1000
*I used code page 1252 for my DT_TEXT data. Note that 1252 is Windows-1252 (ANSI Latin I), while UTF-8 is code page 65001, so pick the code page that matches your source.
For this Derived Column, I also set TruncationRowDisposition to RD_IgnoreFailure in the Advanced Editor (this can also be done in Configure Error Output by setting Truncation to "Ignore failure").
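Conceptually, the cast above decodes the text stream with a given code page and keeps the first 1,000 characters, ignoring the truncation. A minimal Python sketch of that idea follows; `truncate_text` and its defaults are hypothetical, and the choice of `cp1252` simply mirrors the code page used in the expression.

```python
def truncate_text(raw_bytes, max_chars=1000, encoding="cp1252"):
    """Decode a byte stream and keep at most max_chars characters."""
    # errors="replace" avoids failing on bytes that the code page does not define.
    text = raw_bytes.decode(encoding, errors="replace")
    return text[:max_chars]

long_value = b"a" * 11000
print(len(truncate_text(long_value)))
```

Truncating after decoding keeps whole characters, which matters if the source uses a multi-byte encoding.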

A 99.99 numeric from a flat file doesn't want to go in a NUMERIC(4,2) in SQL Server

I have a csv file :
1|1.25
2|23.56
3|58.99
I want to put this value in a SQL Server table with SSIS.
I have created my table :
CREATE TABLE myTable( ID int, Value numeric(4,2));
My problem is that I have to create a Derived Column Transformation to specify my cast :
(DT_NUMERIC,4,2)(REPLACE(Value,".",","))
Otherwise, SSIS doesn't seem to be able to put my Value in my column, and fills the column with NULL values.
And I think it is too ugly to do it this way. I want my Derived Column Transformation to be there for real new derived columns, not for a simple cast that I think SSIS should detect by itself.
So, what is the standard way to use SSIS to resolve this problem?
BULK INSERT myTable
FROM 'c:\csvtest1.txt'
WITH
(
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n'
)
csvtest1.txt
1|1.25
2|23.56
3|58.99
You're loading this up in international format (56,99 in lieu of 56.99). You need to load this as 56.99 for SQL Server to recognize it as such. Take out the REPLACE(Value, ".", ",") and just have the code be:
(DT_NUMERIC,4,2)(Value)
Handle the formatting on the application side, not on the data side. The comma is a reserved operator in SQL Server and you can't change that fact.
Haven't used SSIS a whole lot, but can't you set the regional settings on the File Source or at least set the decimal separator?
Can you change your SSIS source column to be in the correct datatype?
If you have control over the production of your file, I'd suggest formatting values without ANY decimal or thousands separator. In this case I'd have a file with these values:
1|125
2|2356
3|5899
and then apply a division by 100 when importing the data. While it has the advantage of being culture-independent, of course it has some drawbacks:
1) First of all, it may not be possible to impose this format of the file.
2) It presumes that all numeric values are formatted accordingly, in this case every value is multiplied by 100; this can be an issue if you have to mix values from countries with different decimal positions (many have two decimals, but some have zero decimals).
3) It may severely affect other routines, possibly ones outside your control.
Therefore, this is really only an option if you have total control over the CSV file.
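The "divide by 100 on import" idea can be sketched as follows. This is only an illustration in Python of the arithmetic, assuming every value in the file uses the same number of implied decimals. `from_scaled_int` is a hypothetical name.

```python
from decimal import Decimal

def from_scaled_int(text, decimals=2):
    """Turn separator-free text such as '125' into Decimal('1.25')."""
    # scaleb shifts the decimal point exactly, with no binary floating-point rounding.
    return Decimal(int(text)).scaleb(-decimals)

for raw in ["125", "2356", "5899"]:
    print(raw, "->", from_scaled_int(raw))
```

The `decimals` parameter is where drawback 2) shows up: values with a different number of implied decimals need a different setting.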

Resources