From SSIS, I need to send an Excel file in the following format:
656265 | 56280212 ||
654581 | 56246894 ||
656111 | 56281475 ||
I currently have a package that generates an Excel file that is displayed as:
656265 | 56280212
654581 | 56246894
656111 | 56281475
as I have set the column delimiter as a Vertical bar (pipe) on the flat file connection manager.
How can I append the two pipes to delimit the rows, given that SSIS does not allow the same row and column delimiter?
You should open the flat file connection manager, go to the "Advanced" Tab and add two columns manually at the end of the file by pressing the "New" button:
Add a new column by clicking New. By default, the New button adds a new column at the end of the list.
Reference:
Flat File Connection Manager Editor (Advanced Page)
One solution is to use a Script Task: read the file contents, then edit them as in the example below:
using System;
using System.IO;

public class HelloWorld
{
    public static void Main(string[] args)
    {
        // In the real package, read the file instead:
        // string str = File.ReadAllText("textFilePath");
        string str = @"656265 | 56280212
654581 | 56246894
656111 | 56281475";
        // Append " ||" to every line, including the last one.
        str = str.Replace(Environment.NewLine, " ||" + Environment.NewLine) + " ||";
        Console.WriteLine(str);
    }
}
Output
656265 | 56280212 ||
654581 | 56246894 ||
656111 | 56281475 ||
I'm doing a simple concatenation in SSIS.
For example I have a table like this:
+--------+-----------+------------+--------+
| ID | COL_A | COL_B | COL_C |
+--------+-----------+------------+--------+
| 110-99 | | APPLE | Orange |
+--------+-----------+------------+--------+
| 111-01 | Mango | Palm | |
+--------+-----------+------------+--------+
| 111-02 | | Strawberry | |
+--------+-----------+------------+--------+
| 111-05 | Pineapple | Guava | Lemom |
+--------+-----------+------------+--------+
I'm doing this in SSIS Derived column. Concatenation of 3 columns with Pipe |
COL_A +"|"+COL_B+"|"+COL_C
Actual Result:
|APPLE|Orange
MANGO|Palm|
|Strawberry|
Pineapple|Guava|Lemom
Expected Result:
APPLE|Orange
MANGO|Palm
Strawberry
Pineapple|Guava|Lemom
I'm not sure how to remove those extra | when the value is empty. I have tried using CASE but it is not working. Actually I don't know how to use CASE in Derived column expression.
You express conditional logic in SSIS expressions with the ? : operator; see "? : (Conditional) (SSIS Expression)" in the documentation. It works much like an inline IIF.
Along these lines:
boolean_expression ? returnIfTrue : returnIfFalse
In order to get your desired results, I think I'd use two derived column transformations. In the first one, I'd create the pipe delimited string, then in the second one, I'd trim off the trailing pipe if there was one after building the string. Otherwise, the conditional logic would get pretty hairy in order to avoid leaving a trailing delimiter.
Step one - If each column is NULL or an empty string, return an empty string. Otherwise, return the contents of the column with a pipe concatenated to it:
(ISNULL(COL_A) || COL_A == "") ? "" : COL_A + "|"
Repeat that logic for all three columns, putting this expression into your derived column (line breaks added for readability; the last column gets no pipe, since nothing follows it):
((ISNULL(COL_A) || COL_A == "") ? "" : COL_A + "|") +
((ISNULL(COL_B) || COL_B == "") ? "" : COL_B + "|") +
((ISNULL(COL_C) || COL_C == "") ? "" : COL_C)
Then, in the second transformation, trim the trailing pipes from the instances where the last column or two were empty:
(RIGHT(NewColumnFromAbove,1)=="|") ? LEFT(NewColumnFromAbove,LEN(NewColumnFromAbove)-1) : NewColumnFromAbove
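To make the two-step logic easier to check outside SSIS, here is a minimal Python sketch of the same idea: build the string with a pipe after every non-empty leading column, then trim one trailing pipe. The function names are made up for illustration, and Python's None stands in for NULL.

```python
def build_delimited(values):
    """Step one: append a pipe to each non-empty column except the last."""
    *leading, last = values
    parts = [v + "|" for v in leading if v]  # None and "" are both skipped
    return "".join(parts) + (last or "")


def trim_trailing_pipe(text):
    """Step two: drop a single trailing pipe, if there is one."""
    return text[:-1] if text.endswith("|") else text


rows = [
    (None, "APPLE", "Orange"),
    ("Mango", "Palm", ""),
    ("", "Strawberry", None),
    ("Pineapple", "Guava", "Lemom"),
]
results = [trim_trailing_pipe(build_delimited(r)) for r in rows]
print(results)
```

This only models the expression logic; in the package itself the two steps remain two Derived Column transformations.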
On the other hand, if there are lots of columns, or if performance gets bogged down, I would strongly consider writing the concatenation into a stored procedure, using CONCAT_WS, and then invoke that from an Execute SQL task instead.
In SQL Server, one option is concat_ws(), which ignores null values by design. If you have empty strings, you can turn them into null values with nullif().
concat_ws(
    '|',
    nullif(col_a, ''),
    nullif(col_b, ''),
    nullif(col_c, '')
)
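For readers who want to see why this removes the extra delimiters, here is a rough Python model of the behaviour. The two helper functions are stand-ins written for this sketch, not the SQL Server functions themselves, and the bare "|" separator is an assumption chosen to match the expected result in the question.

```python
def nullif(value, match):
    """Return None when value equals match, otherwise value (like NULLIF)."""
    return None if value == match else value


def concat_ws(separator, *values):
    """Join values with separator, skipping None (as CONCAT_WS skips NULL)."""
    return separator.join(v for v in values if v is not None)


row = ("", "Strawberry", "")
combined = concat_ws("|", *(nullif(v, "") for v in row))
print(combined)
```

Because skipped values never contribute a separator, there is nothing to trim afterwards.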
I have one CSV file where the information is spread on two lines
Line 1 contains Name and age
Line 2 contains detail like address, city, salary, occupation
I want to combine 2 rows to insert it in a database.
CSV file :
Raju, 42
12345 west andheri,Mumbai, 100000, service
In SQL Server I can do by using cursor. But I have to do in SSIS.
For a similar case, I would read each line as a single column and use a script component to fix the structure. You can follow my answer to the following question, which contains a step-by-step guide:
SSIS reading LF as terminator when its set as CRLF
I like using a script component in order to be able to store data from a different row in this case.
Read the file as a single column CSV into Column1.
Add script component and add a new Output called CorrectedOutput and define all columns from both rows. Also, mark Column1 as read.
Create 2 variables outside of row processing to 'hold' first row
string name = string.Empty;
string age = string.Empty;
Use a split to determine line 1 or line 2
string[] str = Row.Column1.Split(',');
Use an if to determine row 1 or 2
if (str.Length == 2)
{
    // Line 1: hold the values until the detail line arrives
    name = str[0].Trim();
    age = str[1].Trim();
}
else
{
    CorrectedOutputBuffer.AddRow();
    CorrectedOutputBuffer.Name = name; // This uses the stored value from the prior row
    CorrectedOutputBuffer.Age = age;   // This uses the stored value from the prior row
    CorrectedOutputBuffer.Address = str[0].Trim();
    CorrectedOutputBuffer.City = str[1].Trim();
    CorrectedOutputBuffer.Salary = str[2].Trim();
    CorrectedOutputBuffer.Occupation = str[3].Trim();
}
The overall effect is this...
On Row 1, you just hold the data in variables
On Row 2, you write out the data to 1 new row.
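The same hold-then-emit pattern can be tried outside SSIS. The following Python sketch is only an illustration: combine_rows and the dictionary keys are names invented here, and it assumes every two-field line is followed by a four-field detail line.

```python
def combine_rows(lines):
    """Yield one merged record per (name/age line, detail line) pair."""
    name = age = ""
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        if len(fields) == 2:
            # Line 1: just remember the values
            name, age = fields
        else:
            # Line 2: emit one row using the remembered values
            address, city, salary, occupation = fields
            yield {
                "Name": name,
                "Age": age,
                "Address": address,
                "City": city,
                "Salary": salary,
                "Occupation": occupation,
            }


sample = ["Raju, 42", "12345 west andheri,Mumbai, 100000, service"]
records = list(combine_rows(sample))
print(records)
```

Note that a plain split on commas, here as in the script component, would break if an address itself contained a comma.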
Using SSIS, I am importing a .txt file, which for the most part is straightforward.
The file being imported has a set amount of columns up to a point, but there is a free text/comments field, which can repeat to unknown length, similar to below.
"000001","J Smith","Red","Free text here"
"000002","A Ball","Blue","Free text here","but can","continue"
"000003","W White","Green","Free text here","but can","continue","indefinitely"
"000004","J Roley","Red","Free text here"
What I would ideally like to do (within SSIS) is to keep the first three columns as singular columns, but to merge any free-text ones into a single column. i.e. Merge/concatenate anything which appears after the 'colour' column.
So when I load this into an SSMS table, it appears like:
000001 | J Smith | Red | Free text here |
000002 | A Ball | Blue | Free text here but can continue |
000003 | W White | Green | Free text here but can continue indefinitely |
000004 | J Roley | Red | Free text here |
I do not see any easy solution. You can try something like below:
1. Load the complete raw data into a temp table (without any delimiter):
Steps:
Create the temp table in an Execute SQL Task
Create a data flow task with a flat file source (Ragged Right format) and an OLE DB destination (using the #temp table created in the previous task)
Set DelayValidation=True for the connection manager and the DFT
Set RetainSameConnection=True for the connection manager
Refer to this for creating and using a temp table.
2. Create T-SQL to separate the 3 columns (something like below)
with col1 as (
    select
        [Val],
        substring([Val], 1, charindex(',', [Val]) - 1) col1,
        len(substring([Val], 1, charindex(',', [Val]))) + 1 col1Len
    from #temp
), col2 as (
    select
        [Val],
        col1,
        substring([Val], col1Len, charindex(',', [Val], col1Len) - col1Len) as col2,
        charindex(',', [Val], col1Len) + 1 col2Len
    from col1
)
select col1, col2, substring([Val], col2Len, 200) as col3
from col2
T-SQL Output:
col1 col2 col3
"000001" "J Smith" "Red","Free text here"
"000002" "A Ball" "Blue","Free text here","but can","continue"
"000003" "W White" "Green","Free text here","but can","continue","indefinitely"
3. Use the above query in an OLE DB source in a separate data flow task
Replace the double quotes (") as your requirements dictate.
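The T-SQL above amounts to splitting each raw line at its first two commas and keeping the remainder intact. A small Python sketch of that idea follows; split_first_two is a hypothetical helper, and it assumes the first two fields never contain a comma themselves.

```python
def split_first_two(raw_line):
    """Split at the first two commas only; the remainder stays in one piece."""
    col1, col2, col3 = raw_line.split(",", 2)
    return col1, col2, col3


line = '"000002","A Ball","Blue","Free text here","but can","continue"'
parts = split_first_two(line)
print(parts)
```

As in the T-SQL output, the colour still travels with the free text in the third column, so a further split and quote clean-up is needed before loading.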
This was a fun exercise:
Add a data flow
Add a Script Component (select Source)
Add 4 columns to Outputs: ID, Name, Color, FreeText (all of type string)
edit script:
Paste the following namespaces up top:
using System.Text.RegularExpressions;
using System.Linq;
Paste the following code into CreateNewOutputRows:
string strPath = @"a:\test.txt"; // put your file path in here
var lines = System.IO.File.ReadAllLines(strPath);
foreach (string line in lines)
{
    // Code I stole to read CSV
    string delimeter = ",";
    Regex rgx = new Regex(String.Format("(\"[^\"]*\"|[^{0}])+", delimeter));
    var cols = rgx.Matches(line)
        .Cast<Match>()
        .Select(m => m.Value.Trim().Trim('"'))
        .Where(v => !string.IsNullOrWhiteSpace(v));

    // Create a column counter
    int ctr = 0;
    Output0Buffer.AddRow();

    // Preset FreeText to empty string
    string FreeTextBuilder = String.Empty;
    foreach (string col in cols)
    {
        switch (ctr)
        {
            case 0:
                Output0Buffer.ID = col;
                break;
            case 1:
                Output0Buffer.Name = col;
                break;
            case 2:
                Output0Buffer.Color = col;
                break;
            default:
                // Everything after the third column is free text
                FreeTextBuilder += col + " ";
                break;
        }
        ctr++;
    }
    Output0Buffer.FreeText = FreeTextBuilder.Trim();
}
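If you want to prototype the parsing before wiring it into a Script Component, the same logic can be sketched in Python with the standard csv module instead of a regex. The function name and tuple layout are made up for this illustration, and it assumes every line has at least the three fixed columns.

```python
import csv
import io


def merge_free_text(text):
    """Keep the first three fields; join any remaining ones with spaces."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        record_id, name, color, *free_text = fields
        merged = " ".join(part.strip() for part in free_text if part.strip())
        rows.append((record_id, name, color, merged))
    return rows


sample = (
    '"000001","J Smith","Red","Free text here"\n'
    '"000003","W White","Green","Free text here","but can","continue","indefinitely"\n'
)
parsed = merge_free_text(sample)
print(parsed)
```

The csv module takes care of the double quotes, so no separate trimming of the quote characters is needed in this version.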
I have a SQL table as the source and I want to export its contents to a flat file using SSIS.
Simplified example:
Source: Notes table (CreatedBy, Description, CreatedOn)
The Description field is nText.
Destination: Flat file - Fixed length
CreatedBy(0-50)
Description (51-250)
CreatedOn (251-270)
The problem is that the description can be really long and we don't want it to be truncated after 200 chars. It should wrap to the next line.
I cannot find a way to do this using SSIS.
Really appreciate your help.
Update
I am looking to achieve a layout like below:
CreatedBy | Description | CreatedOn|
John Really long description.............. 2/2/2017
more text..
John2 This is the second line 2/3/2017
Hadi's answer allows breaking a long string into parts but still doesn't solve the layout problem.
You have to follow these steps:
In the Data Flow Task, add a Script Component between the OLE DB Source and the Flat File Destination
In the Script Component, mark the Description column as an input column, and add OutDescription as an output column of type DT_WSTR and length 200
In the script window, write the following code (inside the Input0_ProcessInputRow method):
If Not Row.Description_IsNull AndAlso
   Not String.IsNullOrEmpty(Row.Description.Trim) Then
    If Row.Description.Trim.Length > 200 Then
        Dim LongString As String = Row.Description.Trim
        Dim longlist As New System.Collections.Generic.List(Of String)
        Dim idx As Integer = 0
        ' Use < rather than <= so a length that is an exact multiple of 200
        ' does not add an empty trailing chunk
        While idx < LongString.Length
            If LongString.Length < idx + 200 Then
                longlist.Add(LongString.Substring(idx))
            Else
                longlist.Add(LongString.Substring(idx, 200))
            End If
            idx += 200
        End While
        ' Continuation lines are indented by 50 spaces (the CreatedBy width)
        Row.OutDescription = String.Join(vbNewLine & "".PadLeft(50, CChar(" ")), longlist)
    Else
        Row.OutDescription = Row.Description
    End If
Else
    Row.OutDescription_IsNull = True
End If
In the Flat File Destination, map the OutDescription column instead of the Description column
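The chunking itself is easy to verify in isolation. Below is a small Python sketch of the same wrap: cut the text into 200-character pieces and indent each continuation line by 50 spaces. The function and parameter names are assumptions for this example, and it joins with "\n" where the VB code uses vbNewLine.

```python
def wrap_description(text, width=200, indent=50):
    """Split text into width-sized chunks; indent every continuation line."""
    text = text.strip()
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    return ("\n" + " " * indent).join(chunks)


wrapped = wrap_description("a" * 450)
lines = wrapped.split("\n")
print(len(lines))
```

Stepping through the string with range means a length that is an exact multiple of the width produces no empty final chunk.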
I have a flat file that has 6 columns: NoteID, Sequence, FileNumber, EntryDte, NoteType, and NoteText. The NoteText column has 200 characters and if a note is longer than 200 characters then a second row in the file contains the continuation of the note. It looks something like this:
|NoteID | Sequence | NoteText |
---------------------------------------------
|1234 | 1 | start of note text... |
|1234 | 2 | continue of note.... |
|1234 | 3 | more continuation of first note... |
|1235 | 1 | start of new note.... |
How can I combine the multiple rows of NoteText into one row in SSIS, so that the result looks like this:
| NoteID | Sequence | NoteText |
---------------------------------------------------
|1234 | 1 | start of note text... continue of note... more continuation of first note... |
|1235 | 1 | start of new note.... |
I would greatly appreciate any help.
Update: Changing the SynchronousInputID to None exposed the Output0Buffer and I was able to use it. Below is what I have in place now.
Dim NoteID As String = "-1"
Dim NoteString As String = ""
Dim IsFirstRow As Boolean = True
Dim NoteBlob As Byte()
Dim enc As New System.Text.ASCIIEncoding()

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    If Row.NoteID.ToString() = NoteID Then
        NoteString += Row.NoteHTML
        IsFirstRow = True
    Else
        If IsFirstRow Then
            Output0Buffer.AddRow()
            IsFirstRow = False
        End If
        NoteID = Row.NoteID.ToString()
        NoteString = Row.NoteHTML.ToString()
    End If
    NoteBlob = enc.GetBytes(NoteString)
    Output0Buffer.SingleNoteHTML.AddBlobData(NoteBlob)
    Output0Buffer.ClaimID = Row.ClaimID
    Output0Buffer.UserID = Row.UserID
    Output0Buffer.NoteTypeLookupID = Row.NoteTypeLookupID
    Output0Buffer.DateCreatedUTC = Row.DateCreated
    Output0Buffer.ActivityDateUTC = Row.ActivityDate
    Output0Buffer.IsPublic = Row.IsPublic
End Sub
My problem now is that I had to convert the output column from WSTR(4000) to NTEXT because some of the notes are so long. When it imports into my SQL table, it is just gibberish characters and not the actual notes.
In SQL Server Management Studio (using SQL), you could combine the NoteText values of each note into a single column using the STUFF function with FOR XML PATH, like this:
select distinct
    noteid,
    min(sequence) over (partition by n.noteid order by n.sequence) as sequence,
    stuff((select ' ' + NoteText
           from notes n1
           where n.noteid = n1.noteid
           order by n1.sequence  -- keep the note fragments in sequence order
           for xml path ('')
          ), 1, 1, '') as NoteText
from notes n;
You will probably want to do something along the same lines in SSIS. Check out this link on how to create a script component in SSIS that does something similar: SSIS Script Component - concat rows
SQL Fiddle Demo
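To illustrate the grouping logic independently of SQL or SSIS, here is a minimal Python sketch. It assumes rows arrive as (NoteID, Sequence, NoteText) tuples; the function name and that tuple layout are invented for the example.

```python
from itertools import groupby
from operator import itemgetter


def combine_notes(rows):
    """Concatenate NoteText per NoteID in Sequence order, keeping the first Sequence."""
    ordered = sorted(rows, key=itemgetter(0, 1))
    combined = []
    for note_id, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        text = " ".join(note_text for _, _, note_text in group)
        combined.append((note_id, group[0][1], text))
    return combined


rows = [
    (1234, 2, "continue of note...."),
    (1235, 1, "start of new note...."),
    (1234, 1, "start of note text..."),
    (1234, 3, "more continuation of first note..."),
]
notes = combine_notes(rows)
print(notes)
```

Sorting first is what guarantees the fragments are joined in sequence order even when the input arrives shuffled.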