I have a SQL table as source and I want to export its contents to a flat file using SSIS.
Simplified example:
Source: Notes table (CreatedBy, Description, CreatedOn)
The Description field is ntext.
Destination: Flat file - Fixed length
CreatedBy(0-50)
Description (51-250)
CreatedOn (251-270)
The problem is that the description can be really long and we don't want it to be truncated after 200 chars. It should wrap to the next line.
I cannot find a way to do this using SSIS.
Really appreciate your help.
Update
I am looking to achieve a layout like below:
CreatedBy  | Description                            | CreatedOn
John         Really long description..............    2/2/2017
             more text..
John2        This is the second line                  2/3/2017
Hadi's answer breaks a long string into parts but still doesn't solve the layout problem.
You have to follow these steps:
In the Data Flow Task, add a Script Component between the OLE DB Source and the Flat File Destination.
In the Script Component, mark the Description column as an input column and add an OutDescription column as an output column of type DT_WSTR, with a length large enough to hold the joined chunks plus padding (200 will be too short once the text wraps; for example 4000).
In the script editor, write the following code inside the Input0_ProcessInputRow method:
If Not Row.Description_IsNull AndAlso
   Not String.IsNullOrEmpty(Row.Description.Trim) Then

    If Row.Description.Trim.Length > 200 Then

        Dim LongString As String = Row.Description.Trim
        Dim longlist As New System.Collections.Generic.List(Of String)
        Dim idx As Integer = 0

        'Split the description into chunks of at most 200 characters
        While idx < LongString.Length
            If LongString.Length < idx + 200 Then
                longlist.Add(LongString.Substring(idx))
            Else
                longlist.Add(LongString.Substring(idx, 200))
            End If
            idx += 200
        End While

        'Join the chunks with a new line plus 50 spaces of padding so the
        'wrapped text lines up under the Description column in the flat file
        Row.OutDescription = String.Join(vbNewLine & "".PadLeft(50, CChar(" ")), longlist)

    Else
        Row.OutDescription = Row.Description
    End If

Else
    Row.OutDescription_IsNull = True
End If
In the Flat File Destination, map the OutDescription column instead of the Description column.
Currently my code has simple tables containing the data needed for each object, like this:
infantry = {class = "army", type = "human", power = 2}
cavalry = {class = "panzer", type = "motorized", power = 12}
battleship = {class = "navy", type = "motorized", power = 256}
I use the table names as identifiers in various functions, so their values can be processed one by one by whatever function needs access to them.
Now I want to have this data stored in a spreadsheet (csv file) instead that looks something like this:
Name class type power
Infantry army human 2
Cavalry panzer motorized 12
Battleship navy motorized 256
The spreadsheet will not have more than 50 lines, and I want to be able to add more columns in the future.
I tried a couple of approaches from similar questions I found here, but due to my lacking skills I failed to access any values from the nested table. I think this is because I don't fully understand what the table structure looks like after each line of the csv file has been read into it, and that is why I fail to print any values at all.
If there is a way to get the name, class, type, and power from the table and use each line just like my old simple tables, I would appreciate an educational example. Another approach could be to declare new tables from the csv, line by line, that behave exactly like my old simple tables. I don't know if this is doable.
Using Lua 5.1
You can read the csv file in as a string. I will use a multi-line string here to represent the csv.
gmatch with pattern [^\n]+ will return each row of the csv.
gmatch with pattern [^,]+ will return the value of each column from our given row.
If more rows or columns are added, or if the columns are moved around, we will still reliably convert the information as long as the first row has the header information.
The only column that cannot move is the first one, the Name column; if that is moved it will change the key used to store the row in the table.
Using gmatch and 2 patterns, [^,]+ and [^\n]+, you can separate the string into each row and column of the csv. Comments in the following code:
local csv = [[
Name,class,type,power
Infantry,army,human,2
Cavalry,panzer,motorized,12
Battleship,navy,motorized,256
]]
local items = {}   -- Store our values here
local headers = {} -- Store the header names here
local first = true

for line in csv:gmatch("[^\n]+") do
  if first then -- this is to handle the first line and capture our headers.
    local count = 1
    for header in line:gmatch("[^,]+") do
      headers[count] = header
      count = count + 1
    end
    first = false -- set first to false to switch off the header block
  else
    local name
    local i = 2 -- we start at 2 because we don't increment i for the name field
    for field in line:gmatch("[^,]+") do
      name = name or field -- the first field on the line is the name of our row
      if items[name] then -- if the name is already in the items table then this is a field
        items[name][headers[i]] = field -- assign our value at the header in the table with the given name.
        i = i + 1
      else -- if the name is not in the table we create a new index for it
        items[name] = {}
      end
    end
  end
end
Here is how you can load a csv using the I/O library:
-- Example of how to load the csv.
path = "some\\path\\to\\file.csv"
local f = assert(io.open(path))
local csv = f:read("*all")
f:close()
Alternatively, you can use io.lines(path), which would take the place of csv:gmatch("[^\n]+") in the for loop above.
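For example, the same parsing driven by io.lines might look like this (just a sketch, assuming path points at the csv file as in the loading snippet above and the file has the same layout):

-- Same parsing as before, but reading the file one line at a time
local items, headers = {}, {}
local first = true

for line in io.lines(path) do -- replaces csv:gmatch("[^\n]+")
  if first then
    local count = 1
    for header in line:gmatch("[^,]+") do
      headers[count] = header
      count = count + 1
    end
    first = false
  else
    local name
    local i = 2
    for field in line:gmatch("[^,]+") do
      name = name or field
      if items[name] then
        items[name][headers[i]] = field
        i = i + 1
      else
        items[name] = {}
      end
    end
  end
end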
Here is an example of using the resulting table:
-- print table out
print("items = {")
for name, item in pairs(items) do
  print(" " .. name .. " = { ")
  for field, value in pairs(item) do
    print(" " .. field .. " = " .. value .. ",")
  end
  print(" },")
end
print("}")
The output:
items = {
Infantry = {
type = human,
class = army,
power = 2,
},
Battleship = {
type = motorized,
class = navy,
power = 256,
},
Cavalry = {
type = motorized,
class = panzer,
power = 12,
},
}
I have one CSV file where the information is spread across two lines:
Line 1 contains Name and age
Line 2 contains detail like address, city, salary, occupation
I want to combine the 2 rows and insert them into a database.
CSV file :
Raju, 42
12345 west andheri,Mumbai, 100000, service
In SQL Server I can do this using a cursor, but I have to do it in SSIS.
For a similar case, I would read each line as one column and use a Script Component to fix the structure. You can follow my answer to the following question; it contains a step-by-step guide:
SSIS reading LF as terminator when its set as CRLF
I like using a script component here because it lets you hold on to data from a different row.
Read the file as a single column CSV into Column1.
Add a script component, add a new output called CorrectedOutput, and define all the columns from both rows on it. Also, mark Column1 as a read-only input column.
Create 2 variables outside of row processing to 'hold' the first row:
string name = string.Empty;
string age = string.Empty;
Use a split to determine line 1 or line 2
string[] str = Row.Column1.Split(',');
Use an if to determine row 1 or 2
if (str.Length == 2)
{
    name = str[0];
    age = str[1];
}
else
{
    CorrectedOutputBuffer.AddRow();
    CorrectedOutputBuffer.Name = name;   // This uses the stored value from the prior row
    CorrectedOutputBuffer.Age = age;     // This uses the stored value from the prior row
    CorrectedOutputBuffer.Address = str[0];
    CorrectedOutputBuffer.City = str[1];
    CorrectedOutputBuffer.Salary = str[2];
    CorrectedOutputBuffer.Occupation = str[3];
}
The overall effect is this...
On Row 1, you just hold the data in variables
On Row 2, you write out the data to 1 new row.
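Put together, the pieces above sit inside the script component roughly like this (a sketch only; the two variables are class-level fields, and it assumes the output columns on CorrectedOutput are named Name, Age, Address, City, Salary and Occupation, matching the buffer properties used above):

// Class-level fields hold the values from the first line of each pair
string name = string.Empty;
string age = string.Empty;

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    string[] str = Row.Column1.Split(',');

    if (str.Length == 2)
    {
        // First line: just remember the name and age
        name = str[0];
        age = str[1];
    }
    else
    {
        // Second line: emit one combined row to CorrectedOutput
        CorrectedOutputBuffer.AddRow();
        CorrectedOutputBuffer.Name = name;
        CorrectedOutputBuffer.Age = age;
        CorrectedOutputBuffer.Address = str[0];
        CorrectedOutputBuffer.City = str[1];
        CorrectedOutputBuffer.Salary = str[2];
        CorrectedOutputBuffer.Occupation = str[3];
    }
}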
I'm trying to load the entirety of each row in a csv file into a variant column.
My copy into statement fails with the error below:
Error parsing JSON:
which is really odd, as my data isn't JSON and I've never told it to validate it as JSON.
create or replace file format NeilTest
RECORD_DELIMITER = '0x0A'
field_delimiter = NONE
TYPE = CSV
VALIDATE_UTF8 = FALSE;
together with this table:
create table Stage_Neil_Test
(
Data VARIANT,
File_Name string
);
copy into Stage_Neil_Test (Data, File_Name)
from (
    select s.$1, METADATA$FILENAME
    from @Neil_Test_stage s
)
How do I stop Snowflake from thinking it is JSON?
You need to explicitly cast the text to the VARIANT type, since Snowflake cannot auto-interpret it the way it would if the data were JSON.
Simply:
copy into Stage_Neil_Test (Data, File_Name)
from (
    select s.$1::VARIANT, METADATA$FILENAME
    from @Neil_Test_stage s
)
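Once loaded, each row's Data column holds the raw line as a VARIANT containing a string, and you can cast it back to text when querying, for example:

-- Quick check after loading: the file name plus the raw line cast back to text
select File_Name, Data::string as raw_line
from Stage_Neil_Test;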
Using SSIS I am bringing in raw text files that contain this in the output:
I use this data later for reporting. The Key columns get pivoted. However, I don't want to show all those columns individually; I only want to show the total.
To accomplish this, my idea was to calculate the sum on insert using a trigger, and then insert the sum as a new row into the data.
The output would look something like:
Is what I'm trying to do possible? Is there a better way to do this dynamically on pivot? To be clear, I'm not just pivoting these rows for a report; there are other rows that don't need the sum calculated.
Using derived column and Script Component
You can achieve this by following these steps:
Add a derived column (name: intValue) with the following expression:
(DT_I4)(RIGHT([Value],2) == "GB" ? SUBSTRING([Value],1,FINDSTRING( [Value], " ", 1)) : "0")
So if the value ends with GB, the numeric part is taken; otherwise the result is 0.
After that, add a Script Component; in the Input and Output Properties, click on the output and set its SynchronousInputID property to None.
Add 2 output columns: outKey, outValue.
In the Script Editor write the following script (VB.NET)
Private SumValues As Integer = 0

Public Overrides Sub PostExecute()
    MyBase.PostExecute()

    'After all rows have been processed, add one extra row holding the total
    Output0Buffer.AddRow()
    Output0Buffer.outKey = ""
    Output0Buffer.outValue = SumValues.ToString & " GB"
End Sub

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    'Pass the original row through and accumulate the numeric part
    Output0Buffer.AddRow()
    Output0Buffer.outKey = Row.Key
    Output0Buffer.outValue = Row.Value
    SumValues += Row.intValue
End Sub
I am going to show you a way, but I don't recommend adding the total to the end of the detail data. If you are going to report on it, show it as a total.
After the source, add a script transformation:
C#
Add two output columns to your data flow: Size (int) and Type (string).
Select Value as a read-only input column.
Here is the code:
string[] splits = Row.Value.ToString().Split(' '); // Make sure to use single quotes for char
int goodValue;

if (Int32.TryParse(splits[0], out goodValue))
{
    Row.Size = goodValue;
    Row.Type = "GB";
}
else
{
    Row.Size = 0;
    Row.Type = "None";
}
Now you have the data with the proper data types to do arithmetic on in your table.
If you really want the data in your format, add a Multicast and an Aggregate with SUM(Size), and then merge back into your original flow.
I was able to solve my problem in another way using a trigger.
I used this code:
INSERT INTO [Table] (
[Filename]
, [Type]
, [DeviceSN]
, [Property]
, [Value]
)
SELECT ms.[Filename],
ms.[Type],
ms.[DeviceSN],
'Memory Device.Total' AS [Key],
CAST(SUM(CAST(left(ms.[Value], 2) as INT)) AS VARCHAR) + ' GB' as 'Value'
FROM [Table] ms
JOIN inserted i ON i.Row# = ms.Row#
WHERE ms.[Value] like '%GB'
GROUP BY ms.[filename],
ms.[type],
ms.[devicesn]
I have a working copy of an application that will open workbooks/sheets, copy data successfully between the two, and then save, but I need to parse some data as I copy it into another cell.
I was thinking..
~ create array
~ get all values in xlSourceFile.worksheets("sheet1") and store into an array
~ parse through the array, extracting the data I need (text-to-column programmatically)
~ write the array data to two specific columns in excel worksheet
The data I am trying to parse is Firstname / Lastname - Email, and I want this as a result:
Joe Shmoe goes into one column // Joe Shmoe's email goes into another column.
I am writing this in vb.net, using Microsoft.Office.Interop to manipulate Excel.
Excuse the formatting, I'm new to SO. This is VBA but I believe the general logic will work. It assumes that the email address has no space padding after it. It searches backward on the raw combined string for the first blank space and flags that as the start of the email address (end of the name).
It loops out when the next cell is empty.
The data is assumed to look like this:
"First Name Last Name myaddress#example.com"
Sub SplitNamesAndEmails()
    Dim cell As Range
    Dim i As Long
    Dim rawstring As String
    Dim emailStartPosition As Long
    Dim myname As String
    Dim myemail As String

    For Each cell In Worksheets("Sheet1").Range("A:A")
        i = i + 1
        If cell = "" Then GoTo loopout
        rawstring = cell.Value
        'rawstring = "First Name Last Name myaddress@example.com"
        emailStartPosition = InStrRev(rawstring, " ")
        myname = Left(rawstring, emailStartPosition)
        myemail = Right(rawstring, Len(rawstring) - emailStartPosition)
        Worksheets("Sheet1").Range("B" & i).Value = myname
        Worksheets("Sheet1").Range("C" & i).Value = myemail
    Next
loopout:
End Sub
Column B will have the name and Column C will have the email address.
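Since the question itself targets VB.NET with Microsoft.Office.Interop, here is a rough translation of the same logic (a sketch only, untested; it assumes Imports Excel = Microsoft.Office.Interop.Excel and that xlSourceFile is the already-opened workbook mentioned in the question, with the raw strings in column A of Sheet1):

' Assumes: Imports Excel = Microsoft.Office.Interop.Excel
' xlSourceFile is an already-opened Excel.Workbook, as described in the question.
Dim ws As Excel.Worksheet = CType(xlSourceFile.Worksheets("Sheet1"), Excel.Worksheet)
Dim i As Integer = 1

' Walk down column A until an empty cell is found
While Not String.IsNullOrEmpty(CStr(ws.Range("A" & i).Value))
    Dim rawstring As String = CStr(ws.Range("A" & i).Value)

    ' The last space separates the name from the email address
    Dim emailStartPosition As Integer = rawstring.LastIndexOf(" "c)

    Dim myname As String = rawstring.Substring(0, emailStartPosition).Trim()
    Dim myemail As String = rawstring.Substring(emailStartPosition + 1)

    ws.Range("B" & i).Value = myname
    ws.Range("C" & i).Value = myemail

    i += 1
End While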