This question is related to my previous question but is different in focus, so I'm creating a new question. I have a CSV file that has only column names in it, no data. I've created a Flat File Connection Manager in SSIS using Visual Studio 2012. I've checked Column Names in the first data row in the Connection Manager. Two column names must end in periods (Employee No., and Hrs.). Those columns are present with the periods in my CSV file. Comma, not period, is the delimiter in the Flat File Connection Manager. However, when I point the Connection Manager to the CSV file and click on Columns in the Connection Manager, the periods are replaced by spaces. Further, if I create a Flat File Destination, point it to my Flat File Connection Manager and run my SSIS package, the column names have spaces instead of periods. Is this a bug or am I doing something wrong?
Open the Flat File Connection Manager, go to the Advanced tab, and change the column names (add periods instead of spaces).
Similar question
Add Special Charachter in SSIS Flat File Column Header
Related
I have a Data Flow step in an SSIS package that simply reads data from a CSV with 180 or so columns and inserts it into a MS SQL Server table.
It works.
However, there's a CSV file with 110,000+ rows and it fails. In the Output window in Visual Studio there's a message that says:
The data conversion for column "Address_L2" returned status value 2 and status text "The value could not be converted because of a potential loss of data.
In the Flat File Connection Manager Editor, the data type for the column is string [DT_STR] 50. TextQualified is True.
The SQL column with the same name is a varchar(100).
Anyhow, in the Flat File Source Editor I set all Truncation errors to be ignored, so I don't think this has to do with truncation.
My problem is identifying the "offending" data.
In the same Output window it says:
... SSIS.Pipeline: "Staging Table" wrote 93217 rows.
I looked at row 93218 and a few before and after (Notepad++, Excel, SQL) and nothing caught my attention.
So I went ahead and removed rows from the CSV file up to what I thought was the offending row and when I tried the process again I got the same error, but when I look at the last entry that was actually inserted into the SQL table it doesn't match the last, or close to the last rows in the CSV file.
Is it because it doesn't necessarily insert them in the same order?
In any case, how do I know what the actual issue is, especially with a file this size that you can't go through it manually?
You can simply change the length of the column in the flat file connection manager to meet the destination table specifications. Just open the flat file connection manager, go to the Advanced tab and change the column length.
Note that you can select multiple columns and change the data type and length at once.
You could add an Error output to the SSIS component which is causing the error (not sure from your question whether it's the flat file source or the Staging Table destination).
Hook up the Error output to "nowhere" (I use the Konesans Trash Destination), activate a data viewer on it, and select just the problem column (along with any thing which helps you identify the row) into the data viewer. Run in Visual Studio, and you'll see which rows are failing.
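If you would rather scan the file outside SSIS first, here is a minimal Python sketch that lists the lines where one column looks suspicious. It assumes the file has a header row, that the limit to check is the 50 characters configured in the connection manager, and that the target code page is 1252; the function name and the sample rows are made up for illustration.

```python
import csv
import io

def find_suspect_values(lines, column, max_len, encoding="cp1252"):
    """Yield (line_number, value, reason) for values likely to fail conversion."""
    reader = csv.DictReader(lines)
    for row in reader:
        value = row.get(column) or ""
        if len(value) > max_len:
            yield reader.line_num, value, "longer than %d characters" % max_len
            continue
        try:
            # Assumption: the destination column uses code page 1252.
            value.encode(encoding)
        except UnicodeEncodeError:
            yield reader.line_num, value, "not representable in " + encoding

# Hypothetical sample data standing in for the real CSV.
sample = io.StringIO(
    "Id,Address_L2\n"
    "1,Suite 4\n"
    "2," + "x" * 60 + "\n"
    "3,Caf\u00e9 \u4e2d\n"
)
hits = list(find_suspect_values(sample, "Address_L2", 50))
for line_no, value, reason in hits:
    print(line_no, reason, repr(value[:20]))
```

For a real file, pass `open(path, newline="", encoding="utf-8")` instead of the `StringIO` object (the encoding is another assumption). The reported numbers are physical line numbers in the file, so they may not line up with the row count SSIS prints, which is a count of rows written by the destination.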
How can I load BAI2 file to SSIS?
.BAI2 is an industry standard format used by banks. Below is a truncated example:
01,021000021,CST_USER,110520,1610,1627,,,2/
02,CST_USER,089900137,1,110509,1610,,2/
03,000000370053368,USD,010,782711622,,,015,7620008 12,,,040,760753198,,/
88,043,760000052,,,045,760010026,,,050,760000040,, ,055,760000045,,/
Use a Flat file connection manager
I think you can import these files using a flat file connection manager, because they are similar to comma-separated text. Try changing the row delimiter and column delimiter properties to find the appropriate ones.
From the example you mentioned, I think you should use:
, as Column delimiter
/ as Row delimiter
To learn more about how to interpret a BAI2 file check the following link:
EBS – How to interpret a BAI2 file
Based on this link:
The BAI2 file is a plain text file (.TXT Format), which contains values / texts one after the other.
Because the number of columns is not fixed across rows, you must define only one column (DT_STR, 4000) in the flat file connection manager and split the columns using a Script Component:
SSIS ragged file not recognized CRLF
how to check column structure in ssis?
SSIS : Creating a flat file with different row formats
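To see what the split would have to produce before building the Script Component, here is a rough Python sketch run against the sample from the question. It assumes one record per line ending in `/`, and it simply appends the fields of a type 88 (continuation) record to the record before it, which is a simplification. The function name `parse_bai2` is made up for illustration.

```python
def parse_bai2(text):
    """Split BAI2-style text into records; each record is a list of fields."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("/"):
            line = line[:-1]  # drop the record terminator
        fields = line.split(",")
        if fields[0] == "88" and records:
            # Simplification: treat a continuation record as extra
            # fields of the previous record.
            records[-1].extend(fields[1:])
        else:
            records.append(fields)
    return records

sample = """01,021000021,CST_USER,110520,1610,1627,,,2/
02,CST_USER,089900137,1,110509,1610,,2/
03,000000370053368,USD,010,782711622,,,015,7620008 12,,,040,760753198,,/
88,043,760000052,,,045,760010026,,,050,760000040,, ,055,760000045,,/"""

records = parse_bai2(sample)
for rec in records:
    print(rec[0], len(rec))
```

The record code in the first field (01, 02, 03, ...) decides how the remaining fields should be read, which is why a single wide column plus a script is easier than fixed column definitions.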
Helpful links
SQL SERVER – Import CSV File into Database Table Using SSIS
Importing Flat Files with Inconsistent Formatting Using SSIS
SSIS Lesson 2: First Package
I have a SSIS package in which I use a ForEach Container to loop through a folder destination and pull a single .csv file.
The Container takes the file it finds and uses the file name for the ConnectionString of a Flat File Connection Manager.
Within the Container, I have a Data Flow Task to move row data from the .csv file (using the Flat File Connection Manager) into an OLEDB destination (this has another OLEDB Connection Manager it uses).
When I try to execute this container, it can grab the file name, load it into the Flat File Connection Manager, and begin to transfer row data; however, it continually errors out before moving any data - namely over two issues:
Error: 0xC02020A1 at Move Settlement File Data Into Temp Table, SettlementData_YYYYMM [1143]: Data conversion failed. The data conversion for column ""MONTHS_REMAIN"" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
Error: 0xC02020A1 at Move Settlement File Data Into Temp Table, Flat File Source [665]: Data conversion failed. The data conversion for column ""CUST_NAME"" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
In my research so far, I know that you can set what conditions to force an error-out failure and choose to ignore failures from Truncation in the Connection Manager; however, because the Flat File Connection Manager's ConnectionString is re-made each time the Container executes, it does not seem to hold on to those option settings. It also, in my experience, should be picking the largest value from the dataset when the Connection Manager chooses the OutputColumnWidth for each column, so I don't quite understand how it is truncating names there (the DB is set up as VARCHAR(255) so there's plenty of room there).
As for the failed data conversions, I also do not understand how that can happen when the column referenced is using simple Int values, and both the Connection Manager AND the receiving DB are using floats, which should encompass the Int data (am I unaware that you cannot convert Int into Float?).
It's been my experience that some .csv files don't play well in SSIS when going directly into a DB destination; so, would it be better to transform the .csv into a .xlsx file, which plays much nicer going into a DB, or is there something else I am missing to easily move massive amounts of data from a .csv file into a DB - OR, am I just being stupid and turning a trivial matter into something bigger than it is?
Note: The reason I am dynamically setting the file in the Flat File Connection Manager is that the .csv file will have a set name appended with the month/year it was produced as part of a repeating process, and so I use the constant part of the name to grab it regardless of the date info
EDIT:
Here is a screen cap of my Flat File Connection Manager previewing some of the data that it will try to pipe through. I noticed some of these rows have quotes around them, and wanted to make sure that wouldn't affect anything adversely - the column having issues is the MONTHS_REMAIN one
Is it possible that one of the csv files in the suite you are processing is malformed? For instance, if one of the files had an extra column/comma, then that could force a varchar column into an integer column, producing errors similar to the ones you have described. Have you tried using error row redirection to confirm that all of your csv files are formed correctly?
To use error row redirection, update your Flat File Source and adjust the Error Output settings to redirect rows. Your Flat File Source component will now have an extra red arrow which you can connect to a destination. Drag the red arrow from your source component to a new conditional split. Next, right-click the red line and add a data viewer. Now, when error rows are processed, they will flow over the red line into the data viewer so you can examine them. Last, execute the package and wait for the data viewer to capture the errant rows for examination.
Do the data values captured by the data viewer look correct? Good luck!
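A quick check for the extra-comma case can also be done outside SSIS. The sketch below is only an illustration: it assumes a header row and standard double-quote text qualifiers, and the function name and sample rows are hypothetical. It reports every line whose field count differs from the header.

```python
import csv
import io

def rows_with_wrong_field_count(lines):
    """Return (expected_count, [(line_number, field_count), ...])."""
    reader = csv.reader(lines)
    header = next(reader)
    expected = len(header)
    bad = []
    for row in reader:
        # Skip completely empty lines; flag anything else that is off.
        if row and len(row) != expected:
            bad.append((reader.line_num, len(row)))
    return expected, bad

# Hypothetical sample: the third line has an unquoted comma in the name.
sample = io.StringIO(
    'CUST_NAME,MONTHS_REMAIN\n'
    '"Smith, John",12\n'
    'Jones, Mary,7\n'
    'Lee,3\n'
)
expected, bad = rows_with_wrong_field_count(sample)
print("expected fields:", expected)
print("suspect lines:", bad)
```

For a real file, pass `open(path, newline="")` in place of the `StringIO` object and run it over each file the ForEach loop would pick up.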
I am generating a flat file from an OLE DB Source using SSIS. I have specified column headers and mapped them to the source columns, but SSIS automatically removes special characters like '/' from the column headers. How can I force SSIS not to remove special characters from the header? Is there any way to generate a file with a special character in a column name, or does SSIS not allow it?
You can add special characters to the flat file header as follows:
Add a flat file connection manager
In the Flat File Connection Manager Editor, go to the Advanced tab and rename your column
Remark: these special characters will be ignored when using SSIS objects like Script Component
I am trying to import data from a utf-8 encoded flat file into SQL Server 2008 using SSIS. This is what the end of the row data looks like in Notepad++:
I have a couple more images showing what the file connection manager looks like:
You can see that the data shows correctly in the file connection manager preview. When I try to import this data, no rows import. I get an error message indicating that the row delimiter was not found. You can see in the file connection manager images that the header row delimiter and the row delimiter are both set to {LF}. This was sufficient to generate the correct preview, so I am lost to why it did not work to import. I have tried a number of things that have brought zero results:
Tried using the Wizard import in SSMS...same results
Tried using data conversion, no impact
Tried setting the row delimiter to (0a), same results
[Flat File Source [582]] Warning: The end of the data file was reached while reading header rows. Make sure the header row delimiter and the number of header rows to skip are correct.
Thanks for looking at this and I really appreciate any help you can offer.
Cause:
SSIS fails to read the file and displays the below warning due to the column delimiter Ç ("c" with cedilla) and not due to the line delimiter {LF} (Line Feed).
[Read flat file [1]] Warning: The end of the data file was reached while
reading header rows. Make sure the header row delimiter and the number of
header rows to skip are correct.
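Before changing the package, it can help to look at the raw bytes of the file. The Python sketch below is only an illustration (the function name and the sample content are made up): it counts the different row endings and shows how the Ç delimiter is stored in UTF-8 compared with code page 1252.

```python
def describe_line_endings(data):
    """Count CRLF, bare LF and bare CR sequences in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,
        "CR": data.count(b"\r") - crlf,
        "utf8_bom": data.startswith(b"\xef\xbb\xbf"),
    }

# Hypothetical two-line sample using the cedilla delimiter and LF endings.
sample = "ProductId\u00c7ListPrice\nP-1\u00c712.50\n".encode("utf-8")
info = describe_line_endings(sample)
print(info)

# The same character is two bytes in UTF-8 and one byte in code page 1252.
print("delimiter as UTF-8:", "\u00c7".encode("utf-8"))
print("delimiter as cp1252:", "\u00c7".encode("cp1252"))
```

For a real file, read the bytes with `open(path, "rb").read()` and pass them to `describe_line_endings`.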
Here is a sample SSIS package that shows how to resolve the issue using Script Component and at the end there is another example that simulates your issue.
Resolution:
Below sample package is written in SSIS 2008 R2. It reads a flat file with row delimiter {LF} as a single column value; then splits the data using Script Component to insert the information into a table in SQL Server 2008 R2 database.
Use Notepad++ to create a simple flat file with few rows. The below sample file has Product Id and List Price information on each row separated by Ç as column delimiter and each row ends with {LF} delimiter.
On the Notepad++, click Encoding and then click Encoding in UTF-8 to save the flat file in UTF-8 encoding.
The sample will use an SQL Server 2008 R2 database named Sora. Create a new table named dbo.ProductListPrice using the below given script. SSIS will insert the flat file data into this table.
USE Sora;
GO

CREATE TABLE dbo.ProductListPrice
(
        ProductId   nvarchar(30)    NOT NULL
    ,   ListPrice   numeric(12,2)   NOT NULL
);
GO
Create an SSIS package using Business Intelligence Development Studio (BIDS) 2008 R2. Name the package as SO_6268205.dtsx. Create a data source named Sora.ds to connect to the database Sora in SQL Server 2008 R2.
Right-click anywhere inside the package and then click Variables to view the variables pane. Create a new variable named ColumnDelimiter of data type String in the package scope SO_6268205 and set the variable with the value Ç
Right-click on the Connection Managers and click New Flat File Connection... to create a connection to read the flat file.
On the General page of the Flat File Connection Manager Editor, perform the following actions:
Set Connection manager name to ProductListPrice
Set Description to Flat file connection manager to read product list price information.
Select the flat file path. I have the file in the path C:\Siva\StackOverflow\Files\6268205\ProductListPrice.txt
Select {LF} from Header Row Delimiter
Check Column names in the first data row
Click Columns page
On the Columns page of the Flat File Connection Manager Editor, verify that the Column delimiter is blank and disabled. Click Advanced page.
On the Advanced page of the Flat File Connection Manager Editor, perform the following actions.
Set the Name to LineData
Verify that the Column delimiter is set to {LF}
Set the DataType to Unicode string [DT_WSTR]
Set the OutputColumnWidth to 255
Click the Preview page.
On the Preview page of the Flat File Connection Manager Editor, verify that the displayed data looks correct and click OK.
You will see the data source Sora and the flat file connection manager ProductListPrice on the Connection Managers tab at the bottom of the package.
Drag and drop Data Flow Task onto the Control Flow tab of the package and name it as File to database - Without Cedilla delimiter
Double-click the Data Flow Task to switch the view to the Data Flow tab on the package. Drag and drop a Flat File Source on the Data Flow tab. Double-click the Flat File Source to open Flat File Source Editor.
On the Connection Manager page of the Flat File Source Editor, select the Flat File Connection Manager ProductListPrice and click Columns page.
On the Columns page of the Flat File Source Editor, check the column LineData and click OK.
Drag and drop a Script Component onto the Data Flow tab below the Flat File Source, select Transformation and click OK. Connect the green arrow from Flat File Source to Script Component. Double-click Script Component to open Script Transformation Editor.
Click Input Columns on Script Transformation Editor and select LineData column. Click Inputs and Outputs page.
On the Inputs and Outputs page of the Script Transformation Editor, perform the following actions.
Change the inputs name to FlatFileInput
Change the outputs name to SplitDataOutput
Select Output Columns and click Add Column. Repeat this again to add another column.
Name the first column ProductId
Set the DataType of column ProductId to Unicode string [DT_WSTR]
Set the Length to 30
On the Inputs and Outputs page of the Script Transformation Editor, perform the following actions.
Name the second column ListPrice
Set the DataType of column ListPrice to numeric [DT_NUMERIC]
Set the Precision to 12
Set the Scale to 2
Click Script page to modify the script
On the Script page of the Script Transformation Editor, perform the following actions.
Click the ellipsis button against ReadOnlyVariables and select the variable User::ColumnDelimiter
Click Edit Script...
Paste the below C# in the Script Editor. The script performs the following tasks.
Using the column delimiter value Ç defined in the variable User::ColumnDelimiter, the method FlatFileInput_ProcessInputRow splits the incoming value and assigns it to the two output columns defined in the Script Component transformation.
Script component code in C#
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void PreExecute()
    {
        base.PreExecute();
    }

    public override void PostExecute()
    {
        base.PostExecute();
    }

    // Called once per input row; splits LineData into the two output columns.
    public override void FlatFileInput_ProcessInputRow(FlatFileInputBuffer Row)
    {
        const int COL_PRODUCT = 0;
        const int COL_PRICE = 1;

        // The delimiter comes from the read-only variable User::ColumnDelimiter.
        char delimiter = Convert.ToChar(this.Variables.ColumnDelimiter);

        // Assumes every line contains the delimiter, so lineData has two elements.
        string[] lineData = Row.LineData.ToString().Split(delimiter);

        Row.ProductId = String.IsNullOrEmpty(lineData[COL_PRODUCT])
                            ? String.Empty
                            : lineData[COL_PRODUCT];

        // An empty price is stored as 0.
        Row.ListPrice = String.IsNullOrEmpty(lineData[COL_PRICE])
                            ? 0
                            : Convert.ToDecimal(lineData[COL_PRICE]);
    }
}
Drag and drop OLE DB Destination onto the Data Flow tab. Connect the green arrow from Script Component to OLE DB Destination. Double-click OLE DB Destination to open OLE DB Destination Editor.
On the Connection Manager page of the OLE DB Destination Editor, perform the following actions.
Select Sora from OLE DB Connection Manager
Select Table or view - fast load from Data access mode
Select [dbo].[ProductListPrice] from Name of the table or the view
Click Mappings page
Clicking the Mappings page on the OLE DB Destination Editor automatically maps the columns if the input and output column names are the same. Click OK.
Data Flow tab should look something like this after configuring all the components.
Execute the query select * from dbo.ProductListPrice in the SQL Server Management Studio (SSMS) to find the number of rows in the table. It should be empty before executing the package.
Execute the package. You will notice that the package successfully processed 9 rows. The flat file contains 10 lines but the first row is header with column names.
Execute the query select * from dbo.ProductListPrice in the SQL Server Management Studio (SSMS) to find the 9 rows successfully inserted into the table. The data should match with flat file data.
The above example illustrates how to split the data manually using a Script Component, because the Flat File Connection Manager encounters an error when it is configured with the column delimiter Ç.
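If you want to dry-run the same split on a few lines outside SSIS, here is a small Python sketch of the logic. It is not part of the package; the function name and the sample lines are made up, and it treats a missing or empty price as zero.

```python
from decimal import Decimal

def split_line(line, delimiter="\u00c7"):
    """Split one LineData value into (product_id, list_price)."""
    parts = line.rstrip("\r\n").split(delimiter)
    product_id = parts[0] if parts[0] else ""
    # Assumption: a missing or empty price field becomes 0.
    price_text = parts[1] if len(parts) > 1 else ""
    list_price = Decimal(price_text) if price_text else Decimal("0")
    return product_id, list_price

# Hypothetical lines in the same shape as the sample flat file.
sample_lines = ["P-100\u00c712.50", "P-200\u00c7", "P-300\u00c77"]
parsed = [split_line(line) for line in sample_lines]
for product_id, list_price in parsed:
    print(product_id, list_price)
```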
Issue Simulation:
This example shows a separate Flat File Connection Manager configured with column delimiter Ç, which executes but encounters a warning and does not process any lines.
Right-click on the Connection Managers and click New Flat File Connection... to create a connection to read the flat file. On the General page of the Flat File Connection Manager Editor, perform the following actions:
Set Connection manager name to ProductListPrice_Cedilla
Set Description to Flat file connection manager with Cedilla column delimiter.
Select the flat file path. I have the file in the path C:\Siva\StackOverflow\Files\6268205\ProductListPrice.txt
Select {LF} from Header Row Delimiter
Check Column names in the first data row
Click Columns page
On the Columns page of the Flat File Connection Manager Editor, perform the following actions:
Set Row delimiter to {LF}
The column delimiter field may be disabled. Click Reset Columns
Set Column delimiter to Ç
Click Advanced page
On the Advanced page of the Flat File Connection Manager Editor, perform the following actions:
Set the Name to ProductId
Set the ColumnDelimiter to Ç
Set the DataType to Unicode string [DT_WSTR]
Set the Length to 30
Click column ListPrice
On the Advanced page of the Flat File Connection Manager Editor, perform the following actions:
Set the Name to ListPrice
Set the ColumnDelimiter to {LF}
Set the DataType to numeric [DT_NUMERIC]
Set the DataPrecision to 12
Set the DataScale to 2
Click OK
Drag and drop a Data Flow task onto the Control Flow tab and name it as File to database - With Cedilla delimiter. Disable the first data flow task.
Configure the second data flow task with Flat File Source and OLE DB Destination
Double-click the Flat File Source to open Flat File Source Editor. On the Connection Manager page of the Flat File Source Editor, select the Flat File Connection Manager ProductListPrice_Cedilla and click Columns page to configure the columns. Click OK.
Execute the package. All the components will turn green to indicate that the process succeeded, but no rows will be processed. You can see that no row count is displayed between the Flat File Source and the OLE DB Destination.
Click the Progress tab and you will notice the following warning message.
[Read flat file [1]] Warning: The end of the data file was reached while
reading header rows. Make sure the header row delimiter and the number of
header rows to skip are correct.
The answer above seems awfully complicated; just convert the line endings in the file:
Dim FileContents As String = My.Computer.FileSystem.ReadAllText("c:\Temp\UnixFile.csv")
Dim NewFileContents As String = FileContents.Replace(vbLf, vbCrLf)
My.Computer.FileSystem.WriteAllText("c:\temp\WindowsFile.csv", NewFileContents, False, New System.Text.UnicodeEncoding)
Rehashed from here
This issue also arises if you are trying to consume a flat file generated on a different platform, such as Unix or Mac, via SSIS on Windows.
In such a scenario, all you need to do is convert the file format from, say, UNIX to DOS with the unix2dos command:
unix2dos file-to-convert
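If unix2dos is not available on the machine, the same conversion can be sketched in a few lines of Python. This is only an illustration (the function name is made up). It works on raw bytes, so it assumes a single-byte or UTF-8 encoded file, not UTF-16, and it does not touch bare CR endings.

```python
def lf_to_crlf(data):
    """Normalize line endings in raw bytes to CRLF without doubling existing ones."""
    # First collapse any existing CRLF to LF, then expand every LF to CRLF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

# Hypothetical sample with mixed endings.
sample = b"a,b\n1,2\r\n3,4\n"
converted = lf_to_crlf(sample)
print(converted)
```

For a real file, read it with `open(src, "rb")`, convert, and write the result with `open(dst, "wb")` so that Python does not translate the line endings a second time.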