I am building a Data Flow Task in an SSIS package that pulls data in from an OLE DB Source (MS Access table), converts data types through a Data Conversion Transformation, then routes that data to an OLE DB Destination (SQL Server table).
I have a number of BIT columns for flag variables in the destination table and am having trouble with truncation when converting these 1/0 columns to (DT_BYTES,1). Converting from DT_WSTR and DT_I4 to (DT_BYTES,1) results in the same truncation, and I have verified that it is happening at that step through the Data Viewer.
It appears that I need to create a derived column similar to what is described in the answers to the question linked below, but instead of converting to DT_BOOL, I need to convert to (DT_BYTES,1), since casting from DT_BOOL to DT_BYTES is apparently illegal.
SSIS Converting a char to a boolean/bit
I have made several attempts at creating a derived column with variations of the logic below, but haven't had any luck. I am guessing that I need to use hex literals in the "1 : 0" portion of the expression, but I haven't been able to find valid syntax for that:
(DT_BYTES,1)([Variable_Name] == (DT_I4)1 ? 1 : 0)
Am I approaching this incorrectly? I can’t be the first person to need to insert BIT data into a SQL Server table, and the process above just seems unnecessarily complex to me.
Related
I have a large number of TSV files that need to be imported periodically (via an SSIS package) into existing MSSQL DB tables. I am getting many data type issues from the OLE DB Destination tasks, e.g.:
[Flat File Source [2]] Error: Data conversion failed. The data conversion for column "PRC_ID" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
The type suggestions that the connection managers for the Flat File Source tasks make for each table are not accurate enough to prevent errors when running the import package, and the DB types are the correct ones, so I don't want to just make them all (wrongly) strings for the sake of loading the TSVs.
Is there a way to load the type data for the columns from a single file, rather than setting them one by one in the connection manager window for the Flat File Source tasks? Doing it by hand would be hugely inconvenient, as each table may have many fields.
I have the creation statements that were used to create each of the tables the TSVs correspond to; could those be used in any way? Can a Flat File Source inherit data types for its columns from its OLE DB Destination? Are there any other ways to avoid having to set each type by hand?
There is no difference between changing the column data types in the Flat File Source and keeping all the data types as strings while mapping them to the OLE DB Destination's (different) data types. Both methods perform an implicit data conversion, since flat files are text files and store all data as text (the columns carry no metadata).
Change data types from Advanced Editor vs Data Conversion Transformation
If you are looking to set the data types automatically, I don't think there is a way to do that other than the solution you mentioned in the comments, or creating the package programmatically (and even then I don't find it useful to do it that way).
I am learning SSIS for a new requirement. I came across these two transformations: Data Conversion and Derived Column. But we can convert data types in a Derived Column itself, so why did Microsoft add this Data Conversion transformation? I searched on Google but did not find a proper answer.
Hope this helps:
The purpose of Data Conversion is to do just that: data conversion. The Derived Column transformation is meant for a broader range of transformations, and data conversion is included as one part of it. If you only intend to convert data types and apply no other transform, Data Conversion keeps the package simpler and more readable.
Data Conversion gives the end user a simple UI for fulfilling the requirement of changing the data type of incoming columns. Derived Column can also achieve data conversion, but there you have to explicitly write an expression to type-cast the column.
To give an analogy: you can read an Excel file in the data flow using either an Excel Source or an OLE DB Source. That doesn't mean the Excel Source shouldn't exist; it's simply easier to use.
Source: CodeProject
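To make the comparison concrete: in a Data Conversion component you would just pick DT_I4 from the data type dropdown for the column, while the Derived Column equivalent is an explicit cast expression (MyColumn here is a hypothetical input column):

(DT_I4)[MyColumn]

Both approaches add a new DT_I4 output column to the data flow; the difference is only whether the cast is configured through the UI or written by hand.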
I am trying to import financial data from Excel files into a SQL table. The problem I am facing is that my SSIS package is distorting decimal values; e.g. -175.20 from Excel is being loaded as -175.20000000000005 in SQL.
I am using nvarchar(20) in the destination SQL table. Images attached. What's the best data type for the destination table? I have done a lot of reading, and people seem to suggest a decimal type, but the package throws an error for the decimal data type. Need help please.
I ended up changing the data type to currency in my SQL destination, then added a Data Conversion task to change the DT_R8 data type from the Excel source to currency [DT_CY]. This resolved the issue. I could have used a decimal or numeric(16,2) data type in my destination as well, but I just went ahead with currency and it worked.
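For what it's worth, the Derived Column equivalent of that Data Conversion step would be a single cast expression (assuming a hypothetical source column named Amount):

(DT_CY)[Amount]

Either way, the DT_R8 value coming from Excel is converted to DT_CY before it reaches the destination.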
You could use a Derived Column Transformation in your Data Flow Task, with an expression like ROUND([GM],2) (you might need to replace GM with whatever your actual column name is).
You can then go to the Advanced Editor of the Derived Column Transformation and set the data type to decimal with a Scale of 2 on the 'Input and Output Properties' tab (look under 'Derived Column Output').
You'll then be able to use a decimal data type in your SQL Server table.
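As a sketch of an alternative that skips the Advanced Editor step, the DT_DECIMAL cast takes the scale as its parameter, so the rounding and the type change can be combined in the Derived Column expression itself (again assuming the column is named GM):

(DT_DECIMAL,2)ROUND([GM],2)

The output column is then already DT_DECIMAL with a scale of 2.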
I have found plenty online, but nothing specific to my problem. I have a CSV rendered in code page 65001 (Unicode). However, in the Advanced section of the Flat File Connection Manager, the column's data type is string [DT_STR].
My database table I am loading to can be in any format; I don't care. My question is what is the best way to handle this?
1) Change the Advanced properties of the flat file connection columns?
2) Change the data types of the SQL table to NVARCHAR?
3) Change the OLE DB properties to AlwaysUseDefaultCodePage = TRUE?
4) Use Data Conversion task to convert the column data types?
If your source's code page doesn't change, my suggestion is to use a simple data conversion; try to avoid manipulating the source and destination whenever possible, and always go for ETL solutions first.
I usually start by setting up the connection string for the flat file, then convert the data using a Data Conversion component (input/output data types) based on the flat file data types and the destination data types, and finally set up the connection string for the DB destination. Below is an example of how my data flow looks.
This may be a stupid question, but I must ask since I see it a lot... I have inherited quite a few packages in which developers use the Data Conversion transformation when dumping flat files into their respective SQL Server tables. This is pretty straightforward; however, I always wonder why the developer wouldn't just specify the correct data types within the flat file connection and then do a straight load into the table.
For example:
Typically I will see flat file connections with columns that are DT_STR and then converted into the correct type within the package, e.g. DT_STR of length 50 to DT_I4. However, if the staging table and the flat file are based on the same schema, why wouldn't you just specify the correct types (DT_I4) in the flat file connection? Is there any added benefit (performance, error handling) to using the Data Conversion task that I am not aware of?
This is a good question with not one right answer. Here is the strategy that I use:
If the data source is unreliable
i.e. sometimes int or date values arrive as strings, like when you have the literal word 'null' instead of the value being blank. In that case I let the data source be treated as strings and deal with converting the data downstream.
That could mean just staging the data in a table and using the database to do the conversions before loading from there. This pattern avoids the source component throwing errors, which are always tricky to troubleshoot, and it avoids having to add error handling to Data Conversion components.
Instead, if the database throws a conversion error, you can easily look at the data in your staging table to examine the problem. Lastly, SQL is much more forgiving with date conversions than SSIS.
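If you do want to handle an unreliable column inside the data flow rather than staging it, a Derived Column expression along these lines is one option (SomeIntColumn is hypothetical; this is only a sketch):

LTRIM(RTRIM([SomeIntColumn])) == "null" ? NULL(DT_I4) : (DT_I4)[SomeIntColumn]

Note that SSIS string comparisons are case sensitive, so you may need LOWER() if the file mixes 'null' and 'NULL', and an empty string would still fail the (DT_I4) cast.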
If the data source is reliable
If the dates and numbers are always dates and numbers, I would define the datatypes in the connection manager. This makes it clear what you are expecting from the file and makes the package easier to maintain with fewer components.
Additionally, if you go to the advanced properties of the flat file source, integers and dates can be set to fast parse, which will speed up the read time: https://msdn.microsoft.com/en-us/library/8893ea9d-634c-4309-b52c-6337222dcb39?f=255&MSPPError=-2147217396
When I use data conversion
I rarely use the Data Conversion component, but one case where I find it useful is converting to or from Unicode. This can be necessary when reading from an ADO.NET source, for example, which always treats its input as Unicode.
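As a sketch, the Derived Column equivalents of those Unicode conversions look like this (Name is a hypothetical column, 50 a hypothetical length, and 1252 an assumed destination code page):

(DT_WSTR,50)[Name]
(DT_STR,50,1252)[Name]

The first casts a non-Unicode string to DT_WSTR; the second casts a Unicode string to DT_STR, which requires specifying the target code page.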
You could change the output data type in the Flat File connection manager on its Advanced page, or right-click the source in the Data Flow and use the Advanced Editor to change the data type before loading.
I think one of the benefits is that the conversion transformation allows you to output an extra column, usually named 'Copy of ...', and in some cases you might want to use both columns. Also, sometimes when you load data from an Excel source everything comes in as Unicode, so you need a Data Conversion to do the type transformation, etc.
Also, just FYI, you could use a Derived Column transformation to convert the data type as well.
UPDATE (needs further confirmation):
In the flat file source connection manager, the maximum length of the string type is 255, while in Data Conversion it can be set above 255.