I have three CTEs produced by a T-SQL query, which are unioned into a single list of demographics. We have a requirement to submit this data as a CSV file in a particular format, with a header row and a footer row in each file. Can someone suggest the best way to produce the format below from SSRS? Ultimately the header and footer will be generated from a SQL table that holds the available submission IDs. I am wondering whether this format can be automated from SSRS, or whether there is a better approach. As you can see, the header appears to be tab-separated, without the additional columns; it is followed by the comma-separated data block and then another tab-separated footer.
Any suggestions would be appreciated. Thanks very much.
Format required below (dummy data):
0001011RNL DBS 473520180116
10,6374635,19540714,,,7253647987,p1,,john,,2,3 MADE UP ST,,local,CUMBRIA,,ca12 0T1,,,,,,,,A82635,,
10,9283746,19650325,,,7536482965,p2,,peter,,2,38 MADE UP ST,,WIGTON,,,ca12 0T3,,,,,,,,A82045,,
9901011RNL DBS 473520180116
UPDATED based on user suggestions: SSIS appears to be the best approach. I have found the following resource, which appears to be what is required. I have taken this from https://www.sqlservercentral.com/Forums/Topic815666-148-1.aspx
(provided by John Dempsey):
Place 2 Data Flow Tasks on the Control Flow tab.
Use the first Data Flow Task to generate your Header and Detail records (rows)
a. Obtain the data from your data source
b. Transform your data as needed to satisfy the results for your Detail Rows
c. Set up your Flat File Destination in the layout for your Detail Rows. (Don't worry about Header yet)
d. Upon opening your Flat File Destination you will notice an item for the "Header:". Anything you enter in this box is hard-coded as the header of your file. Instead of hard-coding it, you can make it dynamic: create an SSIS variable to store your dynamically built header information, then assign it to the Header property of the Flat File Destination in the first Data Flow Task.
Set dynamic Header SSIS variable to Header property of Flat File Destination in 1st Data Flow Task.
a. On the Control Flow Tab select the first Data Flow Task then go to the properties window for task
b. Expand [+] Expressions property and click the [...] button
c. In the Property Expressions Editor window, click in the left column to bring up a list of the properties available within the Data Flow Task for your Header and Detail Rows.
d. Choose the property [Flat File Destination].[Header] (if you haven't renamed the destination yet) and set the expression to the name of the SSIS variable you created for your dynamic header.
Your dynamic header should now show up when you run the first data flow task with your detail rows.
Open up the second Data Flow Task and create a source for your footer. I used a Script Component as the input data source, mapping the source row to an SSIS variable that contained my Footer Row.
Create the Flat File Destination for your footer, making sure that you set the Overwrite property = False. This Flat File Destination uses the same file name as the header and detail file from your first Data Flow Task, but the layout will be different in order to line up with the footer.
When you run the SSIS package, it executes the first Data Flow Task, which writes the data from the Header property and then each detail row. It then runs the second Data Flow Task, which writes the footer row to the end of the same file used by the Header and Detail Data Flow Task.
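Outside SSIS, the same two-pass idea can be sketched in a few lines of Python. This is only an illustration of the file layout, not part of the SSIS package: `write_submission`, the file name, and the header and footer strings are all made-up placeholders. The first pass overwrites the file with the header and detail rows, and the second pass reopens it in append mode, which plays the role of Overwrite = False.

```python
import csv
import os
import tempfile


def write_submission(path, header_line, detail_rows, footer_line):
    """Write header and detail rows, then append the footer to the same file."""
    # First pass: create (or overwrite) the file with the header and the
    # comma-separated detail rows, like the first Data Flow Task.
    with open(path, "w", newline="") as f:
        f.write(header_line + "\r\n")
        csv.writer(f).writerows(detail_rows)  # csv.writer ends rows with \r\n
    # Second pass: reopen in append mode, like a second Flat File
    # Destination with Overwrite = False.
    with open(path, "a", newline="") as f:
        f.write(footer_line + "\r\n")


# Placeholder values for illustration only (not the real submission layout).
demo_path = os.path.join(tempfile.mkdtemp(), "submission.csv")
write_submission(
    demo_path,
    "HEADER\tDEMO",
    [["10", "6374635", "19540714", "", "john"],
     ["10", "9283746", "19650325", "", "peter"]],
    "FOOTER\tDEMO",
)
with open(demo_path, newline="") as f:
    demo_lines = f.read().split("\r\n")
```

In a real package the header and footer strings would come from the submission ID table rather than from literals.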
Related
I have an SSIS package with a Foreach Loop Container (using the File Enumerator) that reads multiple CSV files from a folder and uploads the data into a flat table.
This is working fine, but my problem is that I also want to capture the name of each file and populate the last column of the flat table with it after inserting a row.
I have added an Execute SQL Task after the Data Flow Task (within the Foreach Loop Container), hoping that it would execute straight away before the loop moves on to the next file, but unfortunately this is not the behavior.
The Execute SQL Task only executes after all the data in all the files has been read. Is there a way to do this file name update row by row: read a row from the CSV file, insert the row into the table, update the file name column of that row in the same table, and then read the next row? This would continue until the CSV file has been read completely, then move on to the next CSV file and do the same.
I have a programming background and suspect that nested for loops could be a way to do this, but I am not sure how to achieve that in SSIS. The setup of my ForEach loop container is shown below:
Why use an Execute SQL Task to add the file name?
You can simply add the file name into the data pipeline using one of the following methods:
(1) Using the FileNameColumnName property
In the Data Flow Task, you can simply right-click on the Flat File Source, and click on the Show Advanced Editor option.
In the Flat File Source Advanced Editor, there is a property called FileNameColumnName. This property is used to add a column to the flat file source where the File Name is added.
You only need to enter a column name as the value of this property, and the column will be added to the flat file source output.
Flat File Custom Properties
Extract the File Name in SSIS Data Flows using the FileNameColumnName Property
(2) Using a Derived Column Transformation
Your issue can be solved by adding a Derived Column Transformation within the Data Flow Task. Then add a column to the data pipeline using the variable that contains the file name (the variable mapped on the Foreach Loop Container's Variable Mappings tab).
You can learn more about Derived Column Transformation in the following article:
SSIS Derived Columns with Multiple Expressions vs Multiple Transformations
Similar questions:
How to find which flat file contains data errors while loading multiple flat files using Foreach File enumerator in SSIS
I need to import salary data from multiple Excel files, where the file name of each file is a date.
I used SSIS and successfully followed typical tutorials for importing multiple Excel files. The thing is, none of them show a simple method for adding one extra column (with the name of the file) to the result. There are some tutorials with huge code scripts that are too complicated for me.
What I did was add a Derived Column transformation between the typical Excel Source and OLE DB Destination, where I added a new column 'date' with the expression @[User::FileName], a variable that is used by the Foreach Loop Container. As a result I received correctly combined data from all the files, but the extra column contains the same value in every row: the file name of the first imported file.
I wonder if there is any simple way to make the variable I used change with every loop iteration, so that I receive the combined data plus one extra column containing the corresponding date, which is the name of each file. Many thanks.
If you are using a Foreach File Enumerator, select "Name only" in the Collection pane (I'm assuming that "Name only" will give you the date that you are looking for).
This allows you to map this into a variable on each iteration. To do this, navigate to the Variable Mappings pane, and select the variable you want to use in your Data Flow Task, with 0 as the Index.
You can then add this variable as a Derived Column, and it will give you the name of the file you are importing.
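As a rough illustration of what the loop and the Derived Column achieve together, here is a minimal Python sketch. It is not SSIS code: CSV files stand in for the Excel files, and `load_with_file_name`, the folder, and the sample rows are hypothetical. The point is that the file name variable is reassigned on every iteration and then appended to each row read from that file.

```python
import csv
import os
import tempfile


def load_with_file_name(folder):
    """Read every CSV in folder and tag each row with its source file name."""
    combined = []
    for entry in sorted(os.listdir(folder)):
        if not entry.lower().endswith(".csv"):
            continue
        # "Name only": the file name without folder or extension,
        # reassigned on every iteration of the loop.
        file_name = os.path.splitext(entry)[0]
        with open(os.path.join(folder, entry), newline="") as f:
            for row in csv.reader(f):
                # Equivalent of the Derived Column: append the loop variable.
                combined.append(row + [file_name])
    return combined


# Made-up sample files for illustration only.
demo_folder = tempfile.mkdtemp()
samples = {"2018-01": [["alice", "1000"]], "2018-02": [["bob", "1200"]]}
for name, rows in samples.items():
    with open(os.path.join(demo_folder, name + ".csv"), "w", newline="") as f:
        csv.writer(f).writerows(rows)

demo_rows = load_with_file_name(demo_folder)
```

If every row ends up with the first file's name, the usual cause is that the variable is not being reassigned per iteration, which is what the Variable Mappings step addresses.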
I've got a Data Flow Task with a Conditional Split that leads to two different Flat File Destinations. What is puzzling me is why one Flat File Destination editor's Mappings tab shows different 'available destination columns' from the other's.
I'm hoping this is a Newbie error, but it's had me stumped all morning.
At the bottom of the screen, is a box with the connection manager.
If you have two different Flat File Destinations, then you should have a connection manager for each. The columns for each destination are defined inside those connection managers. Double-click them and examine the column definitions for each. They are probably different, which is why you have a different set of columns available on the mapping page for each of your destinations.
I want to create a flat file output in which the rows have different formats.
The file has a header row, middle data rows, and a footer row.
The file will look like this:
H|deptcode123|deptNameXYZ|totalemp300
E|Sam|Johnson|address1|empCode1|........many other columns
E|Sam2|Johnson2|address2|empCode2|........many other columns
E|Sam4|Johnson3|address3|empCode3|........many other columns
E|Sam5|Johnson4|address4|empCode4|........many other columns
J|300|250000
How can I generate this file in SSIS? The input will come from different tables; I am planning to write 3 separate queries/SPs to get the header, middle row, and footer row records.
To do this you need a data flow and connection manager for each different type of rowset. For example to have different header, body, and footer you would need 3 dataflows and 3 flat file connection managers. Each flat file connection manager points to the same file. The trick is to make sure the setting Overwrite data in the file in the Flat File destination is unchecked. This way each data flow executes and appends to the file and each data flow can have its discrete columns and data types.
If you want to create a flat file whose rows have different metadata, you have to use a one-column flat file connection manager, with the DT_WSTR data type and length = 4000.
Use 3 consecutive Data Flow Tasks that write to the same flat file destination:
the first one writes the header, the second one the middle rows, and the third one the footer.
You can concatenate the values in the SELECT statement or by using a Script Component.
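The concatenation step can be sketched in Python as follows. This is only an illustration of building one string per row, not SSIS or T-SQL: `build_lines` is a hypothetical helper, and the values are taken from the sample file in the question.

```python
def build_lines(header, employees, footer, delimiter="|"):
    """Return each row as a single delimited string, ready for a one-column output."""
    # Header row: record type "H" followed by the header values.
    lines = [delimiter.join(["H"] + [str(v) for v in header])]
    # Middle rows: record type "E", one line per employee.
    lines += [delimiter.join(["E"] + [str(v) for v in emp]) for emp in employees]
    # Footer row: record type "J" followed by the totals.
    lines.append(delimiter.join(["J"] + [str(v) for v in footer]))
    return lines


demo_lines = build_lines(
    header=("deptcode123", "deptNameXYZ", "totalemp300"),
    employees=[
        ("Sam", "Johnson", "address1", "empCode1"),
        ("Sam2", "Johnson2", "address2", "empCode2"),
    ],
    footer=(300, 250000),
)
```

Because every row is already a single string, the destination needs only one wide text column, whatever the number of fields in each record type.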
I have a CSV file where there is a header row and data rows in the same file.
I want to get information from both rows during the same load.
What is the easiest way to do this?
i.e File Example - Import.CSV
2,11-Jul-2011
Mr,Bob,Smith,1-Jan-1984
Ms,Jane,Doe,23-Apr-1981
In the first row, there is a count of the number of rows and the date of transmission.
The second and subsequent rows contain the actual data, in this case Title, FirstName, LastName, Birthdate.
SQL Server Integration Services Conditional Split Transformation should do it.
I wonder what you would do with that info in the pipeline. However, there is only one solution that reads it in one pass (take a look at the notes/limitations at the end):
Create a data flow
Add a Flat File Source component and configure it the way you want
Add a Script Component that numbers the rows with a counter (mycounter), starting at 0
Add a Conditional Split transformation with the condition mycounter == 0
One output of the Conditional Split will be the first row of the file (mycounter == 0) and the other output will be the rest of the rows (2 in your example).
Note #1: the file source can set only one metadata definition for each column in the source. This means that if your first column of data is a string (Mr, Ms, ...) then you have to set it as a string data type in the source. Otherwise, if you set it as integer (DT_Ix), it will fail as soon as it encounters a row with string data (Mr, Ms, ...) in the first column of the file. This applies to all columns, not just the first one.
Note #2: SSIS will see only the number of columns you told it to. This means that you have to have the same number of columns in EACH row. Otherwise, you have a ragged CSV file and you need to take another approach (search the Internet). But those solutions also require a different layout of the CSV.
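The counter-and-split logic can be sketched in Python like this. It is an illustration only, not SSIS code: `split_header_and_rows` is a hypothetical name, the sample text is the Import.CSV example from the question, and the transmission date is kept as a plain string. Unlike the SSIS flat file source, Python's `csv` reader does not require every row to have the same number of columns, so the limitation in Note #2 does not show up here.

```python
import csv
import io


def split_header_and_rows(text):
    """Separate the first (header) row from the data rows in one pass."""
    header = None
    rows = []
    for counter, fields in enumerate(csv.reader(io.StringIO(text))):
        if counter == 0:
            # First row: declared row count and transmission date.
            header = {"row_count": int(fields[0]), "transmitted": fields[1]}
        else:
            # Every other row: Title, FirstName, LastName, Birthdate.
            rows.append(fields)
    return header, rows


sample = "2,11-Jul-2011\nMr,Bob,Smith,1-Jan-1984\nMs,Jane,Doe,23-Apr-1981\n"
demo_header, demo_rows = split_header_and_rows(sample)
# The declared count can be checked against the rows actually read.
counts_match = demo_header["row_count"] == len(demo_rows)
```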
Answers in the following links explain how to load parent-child data from a flat file into an SQL Server database when both parent and child rows exist in the same file next to each other.
How do I split flat file data and load into parent-child tables in database?
How to load a flat file with header and detail data into a database using SSIS package?