I've got an SSIS package (SQL 2014) that loads data from a table into a flat file. The file has 5 columns; however, one row in my dataset is used by the system for duplicate checking, and it's required to have 3 columns instead of 5.
How my file looks now:
ID|Desc|UDF1|UDF2|UDF3
DUPECHECK|SaysSomethingIrellevant|||
ID1|Desc1|||
ID2|Desc2|||
How I want my file to look:
ID|Desc|UDF1|UDF2|UDF3
DUPECHECK|SaysSomethingIrellevant|
ID1|Desc1|||
ID2|Desc2|||
You can see that the second row of the file should have a different number of columns than the rest of the rows. How can I do this?
You cannot do it directly. When I had to write a file with a header row and a footer row that had a different number of columns, the only ways I found were either to write everything (all the columns) to a row with a single column, or to write three different txt files and then combine the three files using a bat file.
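The single-column workaround described above can be sketched outside SSIS as follows. This is only an illustration in Python, not the SSIS implementation: the function names `build_line` and `build_file_lines` are hypothetical, and in a real package the concatenation would more likely happen in the source query or a Script Component.

```python
def build_line(fields, delimiter="|"):
    """Join one row's fields into a single string (the one 'column' written out)."""
    return delimiter.join("" if f is None else str(f) for f in fields)


def build_file_lines(header, dupecheck_row, data_rows):
    """Return the output lines; every row may carry its own number of fields."""
    lines = [build_line(header), build_line(dupecheck_row)]
    lines.extend(build_line(row) for row in data_rows)
    return lines


if __name__ == "__main__":
    header = ["ID", "Desc", "UDF1", "UDF2", "UDF3"]
    # The duplicate-check row has 3 columns instead of 5.
    dupecheck = ["DUPECHECK", "SaysSomethingIrellevant", ""]
    data = [["ID1", "Desc1", "", "", ""], ["ID2", "Desc2", "", "", ""]]
    for line in build_file_lines(header, dupecheck, data):
        print(line)
```

Because each row is already a finished string by the time it reaches the destination, the destination no longer needs to know how many delimiters the row contains.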
Related
Currently I receive a daily file of around 750k rows and each row has a 3 character identifier at the start.
The number of columns varies by identifier but is fixed for each identifier (e.g. SRH will always have 6 columns, AAA will always have 10 and so on).
I would like to be able to automate this file into an SQL table through SSIS.
This solution is currently built in MS Access using VBA: it loops through recordsets with a CASE statement and writes each record to the relevant table.
I have been reading up on BULK INSERT, BCP (with a format file) and Conditional Split in SSIS; however, I always get stuck at the first hurdle of even loading the file, because SSIS errors on the variable column layouts.
The data file is pipe delimited and looks similar to the below.
AAA|20180910|POOL|OPER|X|C
SRH|TRANS|TAB|BARKING|FORM|C|1.026
BHP|1
*BPI|10|16|18|Z
BHP|2
*BPI|18|21|24|A
(* I have added the * to show that these are child records of the parent record, in this case BHP can have multiple BPI records underneath it)
I would like to be able to load the TXT file into a staging table, and then write the T-SQL to loop through the records and parse them into their relevant tables (AAA - tblAAA, SRH - tblSRH...).
I think you should read each row as one column of type DT_WSTR with length = 4000, and then implement the same logic you wrote in VBA within a Script Component (VB.NET / C#). There are similar posts that can give you some insights:
SSIS ragged file not recognized CRLF
SSIS reading LF as terminator when its set as CRLF
How to load mixed record type fixed width file? And also file contain two header
SSIS Flat File - CSV formatting not working for multi-line fileds
how to skip a bad row in ssis flat file source
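As a rough illustration of the one-column approach suggested in that answer, the sketch below reads each line as a single string and only splits it once the record type is known. It is written in Python rather than as an SSIS Script Component, the function name `route_records` is hypothetical, and it assumes (from the question's sample) that a BPI row belongs to the closest preceding BHP row and that the first BHP field is its key.

```python
from collections import defaultdict


def route_records(lines, delimiter="|"):
    """Group raw lines by their leading identifier.

    Each line is handled as one string, like a one-column staging row,
    so the varying number of fields never breaks the read.
    """
    tables = defaultdict(list)
    current_bhp = None  # key of the most recent BHP parent row
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        fields = line.split(delimiter)
        record_type, values = fields[0], fields[1:]
        if record_type == "BHP":
            current_bhp = values[0]
        elif record_type == "BPI":
            # Assumption: prefix the child row with its parent's key.
            values = [current_bhp] + values
        tables[record_type].append(values)
    return dict(tables)


if __name__ == "__main__":
    sample = [
        "AAA|20180910|POOL|OPER|X|C",
        "SRH|TRANS|TAB|BARKING|FORM|C|1.026",
        "BHP|1",
        "BPI|10|16|18|Z",
        "BHP|2",
        "BPI|18|21|24|A",
    ]
    for record_type, rows in route_records(sample).items():
        print(record_type, rows)
```

Each key of the returned dictionary corresponds to one target table (AAA to tblAAA, SRH to tblSRH and so on), which is the same dispatch the VBA CASE statement performs.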
I want to create a flat file output where the format of the rows differs.
The file has a header row, middle data rows, and a footer row.
The file will look like this:
H|deptcode123|deptNameXYZ|totalemp300
E|Sam|Johnson|address1|empCode1|........many other columns
E|Sam2|Johnson2|address2|empCode2|........many other columns
E|Sam4|Johnson3|address3|empCode3|........many other columns
E|Sam5|Johnson4|address4|empCode4|........many other columns
J|300|250000
How can I generate this file in SSIS? The input will come from different tables, and I am planning to write 3 separate queries/SPs to get the header, middle rows and footer row.
To do this you need a data flow and connection manager for each different type of rowset. For example to have different header, body, and footer you would need 3 dataflows and 3 flat file connection managers. Each flat file connection manager points to the same file. The trick is to make sure the setting Overwrite data in the file in the Flat File destination is unchecked. This way each data flow executes and appends to the file and each data flow can have its discrete columns and data types.
If you want to create a flat file where rows have different metadata, you have to use a one-column flat file connection manager, with the DT_WSTR data type and length = 4000.
Use 3 consecutive Data Flow tasks that share the same Flat File destination.
The first one writes the header, the second one the middle rows, and the third one the footer.
You can concatenate the values in the SELECT statement or by using a Script Component.
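The append pattern both answers rely on (overwrite once, then append) can be sketched like this. It is a Python illustration only, not SSIS code; `write_rows` and `export_department` are hypothetical names, and the employee rows are shortened because the question elides their remaining columns.

```python
import os
import tempfile


def write_rows(path, rows, overwrite=False, delimiter="|"):
    """Write rows as delimited lines; append unless overwrite is requested."""
    mode = "w" if overwrite else "a"
    with open(path, mode, encoding="utf-8", newline="\n") as f:
        for row in rows:
            # Each row becomes one concatenated string, like a single wide column.
            f.write(delimiter.join(str(v) for v in row) + "\n")


def export_department(path, header, employees, footer):
    """Three consecutive steps against the same file: header, body, footer."""
    write_rows(path, [header], overwrite=True)  # step 1 creates or resets the file
    write_rows(path, employees)                 # step 2 appends the middle rows
    write_rows(path, [footer])                  # step 3 appends the footer


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "dept.txt")
    export_department(
        path,
        ["H", "deptcode123", "deptNameXYZ", "totalemp300"],
        [["E", "Sam", "Johnson", "address1", "empCode1"]],  # remaining columns omitted
        ["J", 300, 250000],
    )
    with open(path, encoding="utf-8") as f:
        print(f.read(), end="")
```

Only the first step is allowed to overwrite; if a later step also overwrote the file, the header would be lost, which is why the Overwrite data in the file setting matters in the SSIS version.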
I have multiple flat files containing 126 columns, but none of them has column names. How should I add column names to these files using SSIS? The files need to be imported using SSIS so that I can perform transformations on them.
Do you want to create a new file which has column names or just assign field names to the columns for use in the rest of the package?
Whichever way, if the input file does not contain column names then set them up as follows…
Create a dataflow task and in the dataflow task create a flat file source.
Configure the flat file source and create a new Flat File Connection Manager
Browse to the input file you want and un-tick the Column Names In First Row
Select Advanced and change all of the default names (Column 0, Column 1 etc) into the field names (and types) you want.
Click OK
If you need to create a new file that has the column names in it, just create a flat file destination and this time have the Column Names In First Row turned on, wire it up to the input you created and save it to a new file
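For comparison, writing a copy of a headerless file with column names added can be sketched outside SSIS as below. This is a Python illustration under assumptions: `add_header` is a hypothetical helper, the files are assumed to be comma-delimited, and the three column names in the demo are placeholders for the real 126.

```python
import csv
import io


def add_header(src, dst, column_names, delimiter=","):
    """Copy rows from src to dst, writing column_names as the first row."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    writer.writerow(column_names)
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(column_names):
            raise ValueError(
                f"row {reader.line_num} has {len(row)} fields, "
                f"expected {len(column_names)}"
            )
        writer.writerow(row)


if __name__ == "__main__":
    source = io.StringIO("1,Widget,9.99\n2,Gadget,19.50\n")
    target = io.StringIO()
    # Placeholder names; a real file would need all 126.
    add_header(source, target, ["ProductId", "ProductName", "Price"])
    print(target.getvalue(), end="")
```

The field-count check plays the role of the fixed column list in the Flat File Connection Manager: a row that does not match the declared layout is rejected instead of being silently shifted.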
One way to do this (maybe not the quickest way) is by using the Advanced Editor.
Right click on the Excel Source component and select Show advanced editor.
In the new window, you need to go to the Input and Output Properties. You should have this by now:
Click on a column under Output Columns (F1,F2,...)
In Common properties, edit the Name to what you want.
I added a Derived Column component as my next step, and this is what I see under the available columns:
As you can see, F1 (which I edited in step 2) has a new column name now.
Edit: I somehow assumed you needed this for Excel. Anyways, I hope it helps.
I have a problem reading a CSV file with a Flat File Source in an SSIS package.
I set up 10 columns for the file, but I am trying to detect the lines that have 9 or fewer columns, or more than 10.
If I set the error output to Ignore failure, the entire line is omitted. If I set Redirect row, the line does not continue down the red arrow. If I set Fail component, the component fails to detect a line that contains 10 columns.
Any suggestions?
You will have to handle this with a Script task.
You can either use the Script to pre-handle the rows that don't have 10 columns and then send a .csv file with exactly 10 columns in every row to the Data Flow, or you can do the entire import in the Script task, handling every row according to its contents one at a time.
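The pre-handling option can be sketched as follows. This is a Python illustration of the idea rather than an actual SSIS Script task; `split_by_column_count` is a hypothetical name, and a comma delimiter with standard CSV quoting is assumed.

```python
import csv


def split_by_column_count(lines, expected=10, delimiter=","):
    """Separate rows with exactly `expected` fields from all other rows.

    Returns (good, bad); each entry is (row_number, fields) so that
    rejected rows can be reported back to whoever supplied the file.
    """
    good, bad = [], []
    reader = csv.reader(lines, delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        target = good if len(row) == expected else bad
        target.append((row_number, row))
    return good, bad


if __name__ == "__main__":
    sample = [
        "1,2,3,4,5,6,7,8,9,10",
        "1,2,3,4,5,6,7,8,9",        # too few columns
        "1,2,3,4,5,6,7,8,9,10,11",  # too many columns
    ]
    good, bad = split_by_column_count(sample)
    print("good rows:", [n for n, _ in good])
    print("bad rows:", [n for n, _ in bad])
```

Only the good rows would then be written to the clean file that the Data Flow reads, while the bad rows can be logged or written to a reject file.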
I have a CSV file where there is a header row and data rows in the same file.
I want to get information from both rows during the same load.
What is the easiest way to do this?
File example (Import.CSV):
2,11-Jul-2011
Mr,Bob,Smith,1-Jan-1984
Ms,Jane,Doe,23-Apr-1981
The first row contains a count of the number of rows and the date of transmission.
The second and subsequent rows contain the actual data, in this case Title, FirstName, LastName, Birthdate.
SQL Server Integration Services Conditional Split Transformation should do it.
I wonder what You would do with that info in the pipeline. However, there is only one solution that reads it in one pass (take a look at the notes/limitations at the end):
Create a data flow
Add a Flat File Source component and set it up the way You want
Add a Script Component that counts the rows
Add a Conditional Split transformation where the condition is mycounter=0
One path from condition split will be the first row of file (mycounter=0) and the other path will be the rest of the rows (2 in your example).
Note #1: file source can set only one metadata for each column in the source. This means that if your first column of data is string (Mr, Ms, ...) then You have to set it as string data type in the source. Otherwise, if You set it as integer (DT_Ix) it will fail as soon as it encounters row with string data (Mr, Ms, ...) in the first column of file. This applies to all columns, not just the first one.
Note #2: SSIS will see only the number of columns You told it to. This means that You have to have the same number of columns in EACH row. Otherwise, You have ragged csv file and You need to take another approach - search the Internet. But those solutions also require different layout of csv.
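The row-counter idea from the steps above, together with Note #1 (read everything as strings and convert afterwards), can be sketched like this. It is a Python illustration, not an SSIS package; `load_with_header` is a hypothetical name, the dictionary keys are made up for the example, and the `%d-%b-%Y` date format assumes English month abbreviations.

```python
import csv
from datetime import datetime

DATE_FORMAT = "%d-%b-%Y"  # e.g. 11-Jul-2011; assumes English month names


def load_with_header(lines):
    """Read the header row and the data rows from the same file in one pass."""
    reader = csv.reader(lines)
    header_row = next(reader)  # first row: row count and transmission date
    header = {
        "row_count": int(header_row[0]),
        "sent_on": datetime.strptime(header_row[1], DATE_FORMAT).date(),
    }
    people = []
    for title, first_name, last_name, birthdate in reader:
        people.append({
            "title": title,
            "first_name": first_name,
            "last_name": last_name,
            "birthdate": datetime.strptime(birthdate, DATE_FORMAT).date(),
        })
    if len(people) != header["row_count"]:
        raise ValueError(
            f"header announces {header['row_count']} rows, found {len(people)}"
        )
    return header, people


if __name__ == "__main__":
    sample = [
        "2,11-Jul-2011",
        "Mr,Bob,Smith,1-Jan-1984",
        "Ms,Jane,Doe,23-Apr-1981",
    ]
    header, people = load_with_header(sample)
    print(header)
    for person in people:
        print(person)
```

One use for the header information is shown in the final check: the announced row count can be compared with the number of data rows actually read.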
Answers in the following links explain how to load parent-child data from a flat file into an SQL Server database when both parent and child rows exist in the same file next to each other.
How do I split flat file data and load into parent-child tables in database?
How to load a flat file with header and detail data into a database using SSIS package?