I am relatively new to SSIS and its data types. I have successfully created a Data Flow task that imports data from a comma-delimited .txt flat file to SQL Server. An error occurs when running the task, at the point where a date field in the .txt file has 0.
For a Derived Column expression to convert the date fields with 0 to Null, I have come up with the following so far...
[Latest Bill Due Date]==0 ? NULL(DT_DATE) : (DT_DATE)[Latest Bill Due Date]
...but the logic isn't accepted and the error message appears:
The data types "DT_WSTR" and "DT_I4" are incompatible for binary operator "==". The operand types could not be implicitly cast into compatible types for the operation. To perform this operation, one or both operands need to be explicitly cast with a cast operator.
Thanks in advance for any direction.
I had a similar problem: a text file contained a 00000000 value that had to be converted to null in a datetime column. What ended up working for me was setting up the table with null as the default value for the column and adding a Script Component as a Transformation. Add an output column named something like 'VerifyNullDateVar', and inside the script do something like:
if (Row.DATEVAR == 0)
{
//the input value is 0, i.e. not an actual date; flag the row so no date gets saved
Row.VerifyNullDateVar = 2;
}
else
Row.VerifyNullDateVar = 1;
DATEVAR is the input column you get from the textfile. After that, use a derived column to read the value from VerifyNullDateVar
VerifyNullDateVar == 1
VerifyNullDateVar == 2
Finally, you need to set up two OLE DB Destinations: one for rows where you can save a date value in the table, and the other for rows where you don't save anything in that column, so that it gets the default null value.
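The rule behind both the question and this answer is "treat a zero placeholder as missing, otherwise parse the date". Because the flat-file column arrives as text (the error message reports DT_WSTR), the check has to be a text comparison rather than a numeric one. Here is a minimal Python sketch of that rule for checking sample values outside SSIS. The function name parse_bill_date and the yyyyMMdd input format are assumptions for illustration, not something stated in the posts.

```python
from datetime import date, datetime
from typing import Optional


def parse_bill_date(raw: str, fmt: str = "%Y%m%d") -> Optional[date]:
    """Return None for blank or all-zero input, otherwise the parsed date.

    The yyyyMMdd default format is an assumption; adjust it to the real file.
    """
    text = raw.strip()
    # "0" and "00000000" are both empty once the zeros are stripped,
    # so the comparison is done on text, never on an integer.
    if text.strip("0") == "":
        return None
    return datetime.strptime(text, fmt).date()
```

A value that is neither a zero placeholder nor a valid date raises ValueError, which corresponds to a row you would want to redirect rather than load.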
I am facing a strange issue while using SSIS "Data Conversion component" to convert string to decimal datatype. I use SSIS 2016.
The source data input has values of mixed data types- string, integer, decimal and is defined as varchar in the flat file source. The target data type expected is numeric. When explicit type conversion happens from string to decimal, we expect the alphanumeric values to get rejected to error table and only the numeric values to pass through.
Instead, we are seeing some alphanumeric values shedding the characters in the value and passing through successfully with no error.
Examples: Value "3,5" converted to 35
Value "11+" converted to 11
We do not have control over source data and will not be able to replace char data before passing data into Data conversion component.
We have tried the steps below as a workaround, and they worked:
First Data Conversion from DT_STR to DT_NUMERIC
Capture error rows that fail the above conversion
Second Data Conversion from DT_NUMERIC to DT_DECIMAL
But as the source data is not reliable, we may have to apply this workaround wherever there are numeric fields (integer types and decimals), which is not a friendly solution.
So checking with you all to understand if there is an easier and better solution tried out by anyone.
I did not expect this result, but I tried an expression task and it worked for DT_DECIMAL:
(DT_DECIMAL,1)"11+" -- evaluates to 11.0
But it does not work for DT_NUMERIC. SSIS won't allow a direct numeric result, but it can be nested inside a cast to DT_DECIMAL. Just to demonstrate that, in an expression task even this "numerically valid" cast would not be permitted, because the output simply can't be of type DT_NUMERIC:
(DT_NUMERIC, 3, 0)123
But this is permitted:
(DT_DECIMAL,0)((DT_NUMERIC, 3, 0)123)
So as long as you are happy to specify a precision and scale big enough to hold your data during the "validity" check done by DT_NUMERIC, and then cast it from there to DT_DECIMAL, all in a derived column transform, then DT_NUMERIC seems to enforce the strict semantics you want.
SSIS allows this:
(DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"11")
But not either of these:
(DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"11+")
(DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"3,5")
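For comparing against what the package lets through, the "strict semantics" described above can be modelled outside SSIS. This is only a Python sketch of the intended rule (accept plain numbers, reject anything with stray characters), not a description of how SSIS implements DT_NUMERIC. The function name strict_decimal and the choice to trim surrounding whitespace are my assumptions.

```python
import re
from decimal import Decimal
from typing import Optional

# Optional sign, ASCII digits, optional fractional part; nothing else is allowed.
_STRICT_NUMBER = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def strict_decimal(raw: str) -> Optional[Decimal]:
    """Return a Decimal for a plain number, or None so the row can be rejected."""
    text = raw.strip()  # trimming is an assumption, not taken from the thread
    if _STRICT_NUMBER.fullmatch(text) is None:
        return None
    return Decimal(text)
```

With this rule, "11" is accepted while "11+" and "3,5" are rejected, which is the behaviour the question expected from the Data Conversion component.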
@billinkc Sorry for not responding to you earlier.
We are working under some restrictions:
(1) All we want to do is capture datatype issues in input data, so we wanted to harness the capability of SSIS Data Conversion Component in SSIS.
(2) DBA doesn't want us to use SQL for type conversions, so we are required to do these conversions between flat file source and flat file destination using SSIS.
(3) We are required to capture the type conversion errors at every step of conversion into an error output file with error column name and error description, to be used later. So we cannot remove char data in the field before passing it to Data Conversion component.
@allmhuran - We have used a Derived Column task before the Data Conversion component to replace unnecessary characters in one of the other fields, but using the same approach for type conversion makes achieving (3) difficult, because the error output from the Derived Column task and the Data Conversion component cannot be redirected to the same error output file.
We can completely ignore Data Conversion component and use only Derived column task to do all type conversions, whether single or nested. I am trying this and the error descriptions do not always look good, but the cons of the former method can be overcome. I will try this out!
I am creating a package in SSIS, and want to convert a file with one large column into multiple columns.
I have a table containing several rows with a single column of raw data. The data was copied from a notepad file, and each row contains pipe delimiters to separate each column, but because it is a notepad file, each row is copied as one large column. I want to convert each column per row to multiple columns based on their start/end positions.
I tried using SSIS Derived Column Transformation with the SUBSTRING function, but the Data Type is automatically populated as text stream[DT_TEXT], and I get the following error:
Error at [Derived Column[113]]; The function “SUBSTRING”
does not support the data type “DT_TEXT” for parameter number 1. The
type of the parameter could not be implicitly cast into a compatible
type for the function. To perform this operation, the operand needs
to be explicitly cast with a cast operator.
Error at [Derived Column[113]]; Evaluating function
'SUBSTRING' failed with error code 0xC0047089.
Error at [Derived Column[113]]; Computing the expression
"SUBSTRING([RawData],1,5)" failed with error code 0xC00470C5. The
expression may have errors, such as divide by zero, that cannot be
detected at parse time, or there may be an out-of-memory error.
Error at [Derived Column[113]]; The expression
"SUBSTRING([RawData],1,5)" on "Derived Column.Outputs[Derived Column
Output].Columns[Derived Column 1]" is not valid
Error at [Derived Column[113]]; Failed to set property
"Expression" on "Derived Column.Outputs[Derived Column
Output].Columns[Derived Column 1]. "
When I review other Derived Column Transformation illustrations utilizing SUBSTRING with a file containing individual columns, I notice the Data Type is shown as DT_WSTR.
Do I need to convert to this Data Type? If so, how do I explicitly cast DT_TEXT data types to DT_WSTR with a cast operator in SSIS Derived Column Transformation?
Otherwise, how else could I handle this conversion?
Derived Column Name: EmployerNo
Derived Column: Replace 'RawData'
Expression: SUBSTRING( [RawData], 1, 5 )
Data Type: text stream[DT_TEXT]
I expect the RawData column to be split up (converted) into 8 different columns based on their start and end positions.
Referring to the SUBSTRING (SSIS Expression) documentation:
Returns the part of a character expression that starts at the specified position and has the specified length.
You have to convert the DT_TEXT column to DT_STR/DT_WSTR before using the SUBSTRING() function. You can do this in a Script Component with a function similar to this one:
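// Requires "using System.Text;" at the top of the script.
string BlobColumnToString(BlobColumn blobColumn)
{
    if (blobColumn.IsNull)
        return string.Empty;
    var blobLength = Convert.ToInt32(blobColumn.Length);
    var blobData = blobColumn.GetBlobData(0, blobLength);
    // DT_TEXT holds code-page (ANSI) bytes, so decode with the column's code page
    // (1252 is assumed here); Encoding.Unicode is the right choice for DT_NTEXT.
    var stringData = Encoding.GetEncoding(1252).GetString(blobData);
    return stringData;
}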
Or, if the DT_TEXT length doesn't exceed the DT_STR length limit, try the following SSIS expression (note that the cast takes the length first and the code page second):
SUBSTRING( (DT_STR,4000,1252)[RawData], 1, 5 )
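Once the column is a plain string, splitting it into eight columns is just one SUBSTRING per field. To sanity-check the start positions and lengths before typing them into Derived Column expressions, a small Python model of the 1-based SUBSTRING can help. Everything in the layout below except EmployerNo at position 1, length 5 is hypothetical, as are the helper names.

```python
from typing import Dict, Tuple


def substring(text: str, start: int, length: int) -> str:
    """Mimic a 1-based SUBSTRING(text, start, length); start must be >= 1."""
    return text[start - 1:start - 1 + length]


def split_fixed(raw: str, layout: Dict[str, Tuple[int, int]]) -> Dict[str, str]:
    """Cut one raw line into named fields given (start, length) pairs."""
    return {name: substring(raw, start, length)
            for name, (start, length) in layout.items()}


# Hypothetical layout: only EmployerNo (1, 5) comes from the question.
LAYOUT = {"EmployerNo": (1, 5), "Rest": (6, 10)}
row = split_fixed("12345ABCDEFGHIJ", LAYOUT)
```

Each entry in the layout corresponds to one derived column whose expression is SUBSTRING applied to the cast RawData with that start and length.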
I'm importing thousands of csv files into an SQL DB. They each have two columns: Date and Value. In some of the files, the value column contains simply a period (ex: "."). I've tried to create a derived column that will handle any cell that contains a period with the following code:
FINDSTRING((DT_WSTR,1)[VALUE],".",1) != 0 ? NULL(DT_R8) : [VALUE]
But, when the package runs it gets the following error when it reaches the cell with the period in it:
The data conversion for column "VALUE" returned status value 2 and status text
"The value could not be converted because of a potential loss of data".
I'm guessing there might be an escape character that I'm missing in my FINDSTRING function but I can't seem to find what that may be. Does anyone have any thoughts on how I can get around this issue?
Trying to debug things like this is why I always advocate adding many Derived Columns to the Data Flow. It's impossible to debug that entire expression. Instead, first find the position of the period and add that as a new column. Then you can feed that into the ternary operation and bit by bit you can add data viewers to ensure you are seeing what you expect to see.
Personally, I'd take a different approach. It seems that you'd like to make any columns that are . into a null of type DT_R8.
Add a derived column, TrimmedValue, and use this expression to remove any leading/trailing whitespace:
RTRIM(LTRIM(Value))
Add a second derived column component, this time we'll add column MenopausalValue as it will remove the period. Use this expression
(TrimmedValue == ".") ? NULL(DT_WSTR, 50) : TrimmedValue
Now, you can add your final Derived Column wherein we convert the string representation of Value to the floating point representation.
IsNull(MenopausalValue) ? NULL(DT_R8) : (DT_R8) MenopausalValue
If the above shows an error, then you need to apply the following version as I can never remember the evaluation sequence for ternary operations that change type.
(DT_R8) (IsNull(MenopausalValue) ? NULL(DT_R8) : (DT_R8) MenopausalValue)
Examples of breaking these operations into many steps for debugging purposes
https://stackoverflow.com/a/15176398/181965
https://stackoverflow.com/a/31123797/181965
https://stackoverflow.com/a/33023858/181965
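The same three-step chain can be written out in Python to check the intended result of each intermediate column against sample rows. This is only a sketch of the logic, not SSIS code. TrimmedValue and MenopausalValue are the column names used above, while derive_steps and FinalValue are names I made up for the illustration.

```python
from typing import Dict, Optional, Union


def derive_steps(raw: str) -> Dict[str, Union[str, float, None]]:
    """Return every intermediate value, one per derived column in the chain."""
    # Step 1: strip leading/trailing whitespace.
    trimmed = raw.strip()
    # Step 2: a lone period becomes a null string, anything else passes through.
    without_period: Optional[str] = None if trimmed == "." else trimmed
    # Step 3: convert the surviving text to a floating point number.
    as_float: Optional[float] = (
        None if without_period is None else float(without_period)
    )
    return {
        "TrimmedValue": trimmed,
        "MenopausalValue": without_period,
        "FinalValue": as_float,  # hypothetical name for the last column
    }
```

Keeping each step as its own value mirrors the point about data viewers: when a row fails, you can see which step produced the unexpected result.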
You can do it like this:
TRIM(Value) == "." ? NULL(DT_R8) : (DT_R8)Value
I'm using SSIS to load a fixed length Flat File into SQL.
I have a weight field that has been giving me trouble all day.
It has a length of 8 with 6 DECIMAL POSITIONS IMPLIED (99V990099).
The problem I'm having is when it isn't populated and has 8 spaces.
Everything I try gets an error:
"Invalid character value for cast specification."
OR
"Conversion failed because the data value overflowed the specified type.".
OR
Data conversion failed.
The data conversion for column "REL_WEIGHT" returned status value 2 and status text
"The value could not be converted because of a potential loss of data.".
I've tried declaring it as DT_String & DT_Numeric.
I've tried many variations of:
TRIM([REL_WEIGHT])=="" ? (DT_STR,8,1252)NULL(DT_STR,8,1252) : REL_WEIGHT
ISNULL([REL_WEIGHT]) || TRIM([REL_WEIGHT]) == "" ? (DT_NUMERIC,8,6)0 : (DT_NUMERIC,8,6)[REL_WEIGHT]
TRIM(REL_WEIGHT) == "" ? (DT_NUMERIC,8,6)0 : (DT_NUMERIC,8,6)REL_WEIGHT
But nothing seems to work.
Please someone out there have the fix for this!
I think you may be running afoul of the following point, explained nicely at http://vsteamsystemcentral.com/cs21/blogs/applied_business_intelligence/archive/2009/02/01/ssis-expression-language-and-the-derived-column-transformation.aspx:
You can add a DT_STR Cast statement to the expression for the MiddleName, but it doesn't change the Data Type. Why can't we change the data type for existing columns in the Derived Column transformation? We're replacing the values, not changing the data type. Is it impossible to change the data type of an existing column in the Derived Column? Let's put it this way: It is not possible to convert the data type of a column when you are merely replacing the value. You can, however, accomplish the same goal by creating a new column in the Data Flow.
I've solved this on past occasions by loading the data from the flat file as strings, and then deriving a new column in a Derived Column transformation which is of numeric type. You can then perform the appropriate trimming, validation, casting, etc. in the SSIS expression for that new column in the transformation.
Here, I found an example SSIS expression I used at one point to derive a time value from a 4-digit string:
(ISNULL(Last_Update_Time__orig) || TRIM(Last_Update_Time__orig) == "") ? NULL(DT_DBTIME2,0) : (DT_DBTIME2,0)(SUBSTRING(TRIM(Last_Update_Time__orig),1,2)+":"+SUBSTRING(TRIM(Last_Update_Time__orig),3,2)+":00")
There has to be a better way to do it, but I found a way that works.
Create a Derived Column Expression:
TRIM(REL_WEIGHT) == "" ? (DT_STR,9,1252)"0.0000000" : (DT_STR,9,1252)(LEFT(REL_WEIGHT,2) + "." + RIGHT(REL_WEIGHT,6))
THEN Create a Data Conversion Task to change it to Numeric and set scale to 6.
And then Map the [Copy of NewField] to my SQL table field set up as Decimal(8,6).
I don't know how the performance will be of that when loading a million records, probably not the best. If someone knows how to do this in a better way performance wise please let me know.
Thanks,
Jeff
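For checking a few sample values by hand, the rule this workaround implements (8 digits with 6 implied decimal places, blank treated as zero) can be sketched in Python. The function name implied_decimal and the decision to raise on anything that is not exactly 8 digits are assumptions for illustration, not part of the package described above.

```python
from decimal import Decimal


def implied_decimal(raw: str, width: int = 8, scale: int = 6) -> Decimal:
    """Convert a fixed-width digit string with an implied decimal point."""
    text = raw.strip()
    if text == "":
        # An unpopulated field (all spaces) is loaded as zero, as in the workaround.
        return Decimal(0)
    if len(text) != width or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected {width} digits, got {raw!r}")
    # Shift the decimal point left by `scale` places: 12345678 -> 12.345678
    return Decimal(int(text)).scaleb(-scale)
```

The ValueError path stands in for rows you would send to an error output instead of loading.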
I have an SSIS package to load data; as you may recall there are flags that are in data files as Y/N char(1) when I am trying to load them as bit flags into SQL Server. I am specifying the columns in the data file as String [DT_STR] and I have a data conversion task to convert them to booleans based on the following expression (I received the same conversion error just specifying them as DT_BOOL to begin with, despite SSIS asking me to say what values it should consider as boolean):
[ColumnName] == "Y" ? (DT_BOOL)1 : (DT_BOOL)0
Running the package gives an error and tells me Invalid character value for cast specification and The value could not be converted because of a potential loss of data on the actual import to SQL Server (via an OLE DB Destination).
What am I missing here to get it to properly convert?
Try this:
(DT_BOOL)([ColumnName] == "Y" ? 1 : 0)
This also has the advantage of automatically setting the data type of the derived column correctly.
I was able to solve it by using a derived column and, instead of replacing the char columns, creating new columns set to type of DT_BOOL like so:
[Recycled] == "Y" ? True : False
I had the same problem with
(DT_BOOL)([ColumnName] == "Y" ? 1 : 0)
and I could only get it to work by taking the "(DT_BOOL)" portion out of the expression and leaving the conversion to the "Data Type" setting, selecting "Boolean [DT_BOOL]". No problems after that.