Handling truncation error in derived column in data flow task - sql-server

I have a data flow task which contains a derived column. The derived column converts a CSV file column, let's say A, which holds the order number, to data type char with length 10.
This works perfectly fine when the text file column is 10 characters or fewer. Of course, it throws an error when the order number in column A is longer than 10 characters.
Sample values for column A (error prone):
12PR567890
254W895X98
ABC 56987K5239
485P971259 SPTGER
459745WERT
I would like to catch the error-prone records and extract only the order number.
I can already configure the error output of the derived column, but that just ignores the error records and processes the others.
The expected output would process the order numbers ABC 56987K5239 and 485P971259 SPTGER as 56987K5239 and 485P971259 respectively. How the unexpected characters get removed is not the important part; what matters is how to do it at run time in the derived column (stripping and processing the data in case of an error).

If the valid order number always starts with a digit and is always 10 characters long, you could use a Script Component (Transformation) together with a regular expression to transform the source data.
Drag and drop the Script Component and choose Transformation
Connect the source to the Script Component
In the Script Component Edit window, check Order under Input Columns and set its usage type to ReadWrite
In the script, add: using System.Text.RegularExpressions;
Add the following code to the method that processes each input row:
string pattern = "[0-9].{9}";
// The pattern has no capture groups, so Groups[1] would be empty; Value returns the whole match.
Row.Order = Regex.Match(Row.Order, pattern).Value;
The output going to the destination should be the matched 10 characters starting with the number.
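To check the pattern outside SSIS, here is a minimal Python sketch that runs the same extraction against the sample values from the question. The function name extract_order is hypothetical, and Python's re module stands in for .NET's Regex; only the pattern itself comes from the answer.

```python
import re

# Same pattern as in the answer: one digit followed by any 9 characters.
PATTERN = re.compile(r"[0-9].{9}")


def extract_order(raw):
    """Return the first 10-character run that starts with a digit, or None."""
    match = PATTERN.search(raw)
    return match.group(0) if match else None


# Sample values from the question.
samples = [
    "12PR567890",
    "254W895X98",
    "ABC 56987K5239",
    "485P971259 SPTGER",
    "459745WERT",
]
cleaned = [extract_order(s) for s in samples]
```

Note that the match starts at the first digit found, so a prefix that itself contains a digit would shift the result; that case would need a stricter pattern.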

Related

Converting data in column in SSIS

I'm writing an SSIS package to load data from a .csv into a db.
There's a column in the csv file that is supposed to have a count, but the records sometimes have text, so I can't just load the data in as an integer. It looks something like this:
I want the data to land in the db destination as an integer instead of a string. I want the transformation to change any text to a 1, any blank value to a 1, and leave all the other numbers as-is.
My attempts so far have included using the Derived Column transformation, for which I couldn't find the right expression(s), and creating a temp table to run a SQL query over the data, which kept breaking my data flow.
There are three approaches you can follow.
(1) Using a derived column
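You should add a derived column with the following expression, which tries to cast the value to an integer. The cast fails for non-numeric values, and both branches are integers so that the conditional returns a single data type:
(DT_I4)[count] == (DT_I4)[count] ? (DT_I4)[count] : 1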
Then, in the derived column editor, go to the error output configuration and set the error handling to Ignore failure. Rows whose cast fails then get NULL in the derived column instead of failing the component.
Now add another derived column to replace null values with 1:
REPLACENULL([count_derivedcolumn],1)
You can refer to the following article for a step-by-step guide:
Validate Numeric or Non-Numeric Data in SQL Server Integration Services without the Script Task
(2) Using a script component
If you know C# or Visual Basic .NET, you can add a script component that checks whether the value is numeric and replaces nulls and string values with 1.
(3) Update data in SQL
You can stage data in its initial form into the SQL database and use an update query to replace nulls and string values with 1 as follows:
UPDATE [staging_table]
SET [count] = 1
WHERE [count] IS NULL OR ISNUMERIC([count]) = 0
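The rule that all three approaches implement (text becomes 1, blank becomes 1, numbers stay as they are) can be sketched in a few lines of Python. This is only an illustration of the logic, not SSIS code; the function name normalize_count is hypothetical, and treating decimal values such as "3.5" as non-numeric is an assumption.

```python
def normalize_count(raw):
    """Map missing, blank, or non-numeric text to 1; keep integers unchanged."""
    if raw is None:
        return 1
    text = raw.strip()
    try:
        # int() raises ValueError for "", "abc", "3.5", and similar input.
        return int(text)
    except ValueError:
        return 1


# Hypothetical sample values for the count column.
raw_counts = ["4", "", "n/a", None, " 12 "]
normalized = [normalize_count(value) for value in raw_counts]
```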

Loop inside Pentaho Data Integration Transformation

I have a transformation as
where the text file is in the following format:
For each t_cmp in the text file (the number of t_cmp values is not known in advance), I want to execute Read Company
But it gives the following error:
Can anyone please tell me where am I going wrong?
You need to pass 3 rows, each with 1 field, instead of a single row with 3 fields.
The number of fields must match the number of parameters of your query.
So, in short, transpose your data. Either:
read each line as a single field, then use Split field to rows
or read the file as you do now and use Row normalizer
Both approaches should work.
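The transposition itself can be illustrated with a short Python sketch. It is not Pentaho code; the comma delimiter and the sample values are assumptions, because the original file format is not shown, and the function name split_field_to_rows is hypothetical.

```python
def split_field_to_rows(line, delimiter=","):
    """Turn one delimited line into one single-field row per value."""
    return [[value.strip()] for value in line.split(delimiter)]


# Hypothetical input: a single row carrying three t_cmp values.
single_row = "cmp_a, cmp_b, cmp_c"
rows = split_field_to_rows(single_row)
# Each row now has exactly one field, matching a query with one parameter.
```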

SSIS Script Component - get raw row data in data flow

I am processing a flat file in SSIS and one of the requirements is that if a given row contains an incorrect number of delimiters, fail the row but continue processing the file.
My plan is to load the rows into a single column in SQL server, but during the load, I’d like to test each row during the data flow to see if it has the right number of delimiters, and add a derived column value to store the result of that comparison.
I’m thinking I could do that with a script task component, but I’m wondering if anyone has done that before and what would be the best method? If a script task component would be the way to go, how do I access the raw row with its delimiters inside the script task?
SOLUTION:
I ended up going with a modified version of Holder's answer as I found that TOKENCOUNT() will not count null values per this SO answer. When two delimiters are not separated by a value, it will result in an incorrect count (at least for my purposes).
I used the following expression instead:
LEN(EntireRow) - LEN(REPLACE(EntireRow, "|", ""))
This results in the correct count of delimiters in the row, regardless of whether there's a value in a given field or not.
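The difference between the two counts can be reproduced with a small Python sketch. The token_count function only approximates the TOKENCOUNT behavior described above (empty fields are skipped); both function names and the sample rows are hypothetical, and the length-difference trick assumes a single-character delimiter.

```python
def token_count(row, delimiter="|"):
    """Count non-empty tokens, approximating how TOKENCOUNT skips empty fields."""
    return len([token for token in row.split(delimiter) if token])


def delimiter_count(row, delimiter="|"):
    """Count delimiters the way LEN(row) - LEN(REPLACE(row, "|", "")) does."""
    return len(row) - len(row.replace(delimiter, ""))


full_row = "a|b|c|d"    # every field has a value
sparse_row = "a||c|d"   # second field is empty

full_counts = (token_count(full_row), delimiter_count(full_row))
sparse_counts = (token_count(sparse_row), delimiter_count(sparse_row))
```

Both rows contain three delimiters, but the token count drops for the sparse row, which is why the length-based expression is the safer test here.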
My suggestion is to use a Derived Column to do your test
And then add a Conditional Split to decide if you want to insert the rows or not.
Something like this:
Use the TOKENCOUNT function in the Derived Column editor to get the number of columns, like this: TOKENCOUNT(EntireRow,"|")

Processing Interactive Grid manually through PL/SQL and keeps throwing out an error

I used this site https://community.oracle.com/thread/3937159?start=0&tstart=0 to learn how to manually process interactive grids. I got it to work on a small table with 3 columns, but when I tried to get it to work for a bigger table, it keeps throwing this error:
PL/SQL: numeric or value error: character string buffer too small for.
I tried only updating 1 column and converting the datatype to the correct one, and it is not going away.
This message usually means you're trying to store a value such as 'AAAA' in a column that only accepts up to 3 characters, like varchar2(3).
Make sure your columns have a proper limit size for the data you're processing.

Word wrap issues with SSIS Flat file destination

Background: I need to generate a text file with 5 records, each 1565 characters long. This text file is then used to feed data into another piece of software.
Hence, there are some required fields and some optional fields. I created a query that concatenates all the fields into one single field. I populated the optional fields with blanks.
For example:
Here is the sample input layout for each field:
Field        CharLength  Required
ID           7           Yes
Name         15          Yes
Address      15          No
DOB          10          Yes
Age          1           No
Information  200         No
IDNumber     13          Yes
Then I wrote a query that combines the above fields into a single row for each unique ID, which looks like the following:
SELECT CAST(1 AS CHAR(7)) + CAST('XYZ' AS CHAR(15)) + CAST('' AS CHAR(15)) + CAST('22/12/2014' AS CHAR(10)) + CAST('' AS CHAR(1)) + CAST('' AS CHAR(200)) + CAST('123456' AS CHAR(13))
UNION
SELECT CAST(2 AS CHAR(7)) + CAST('XYZ' AS CHAR(15)) + CAST('' AS CHAR(15)) + CAST('22/12/2014' AS CHAR(10)) + CAST('' AS CHAR(1)) + CAST('' AS CHAR(200)) + CAST('123456' AS CHAR(13))
Then I created an SSIS package to produce the output text file through a delimited Flat File destination.
Problem:
The flat file is generated with the desired record length (1565), but the text file looks different depending on whether word wrap is on or off. When word wrap is off, I get each record on a single line. When word wrap is on, the line is broken into several lines. The length of the record is the same in either case.
I even tried using VARCHAR + SPACE in the query instead of CHAR for each field, but without success. It still breaks the line at the blank fields.
For example: Cast('' as varchar(1)) + Space(200-len(Cast('' as varchar(1)))) for the Information field
Question: How do I keep each record on a single line even when word wrap is on?
Since this is my first post, please excuse the formatting of the question.
The purpose of word wrap is to put characters on the next line in instances of overflow rather than creating an extremely horizontal scrolling document.
Word wrap is the additional feature of most text editors, word processors, and web browsers, of breaking lines between words rather than within words, when possible.
Because this is what word wrap is, there's nothing you can do to change its behavior. What does it matter anyway? The document should still be parsed as you would expect. Just don't turn word wrap on.
As far as I'm aware, having word wrap on or off has no impact on the document itself, it's simply a presentation option.
Applications parsing a document parse it as if word wrap were off. Something that could throw off parsing is breaks for a new line, but that is a completely different thing from word wrap.
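One way to convince yourself of this is to build a fixed-width record and check it for real line breaks. The Python sketch below uses the field widths from the sample layout in the question; the function names build_record and parse_record are hypothetical, and padding with ljust is only an approximation of CAST(... AS CHAR(n)).

```python
# Field widths taken from the sample layout in the question.
FIELD_WIDTHS = [
    ("ID", 7), ("Name", 15), ("Address", 15), ("DOB", 10),
    ("Age", 1), ("Information", 200), ("IDNumber", 13),
]


def build_record(values):
    """Pad (or cut) each value to its fixed width and concatenate the fields."""
    return "".join(
        str(values.get(name, "")).ljust(width)[:width]
        for name, width in FIELD_WIDTHS
    )


def parse_record(record):
    """Slice the record back into fields by position, as a parser would."""
    fields, pos = {}, 0
    for name, width in FIELD_WIDTHS:
        fields[name] = record[pos:pos + width].rstrip()
        pos += width
    return fields


record = build_record(
    {"ID": 1, "Name": "XYZ", "DOB": "22/12/2014", "IDNumber": "123456"}
)
has_line_break = "\n" in record or "\r" in record
parsed = parse_record(record)
```

The record contains only padding spaces, no newline characters, so any line breaks seen with word wrap on come from the editor's display and not from the file.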

Resources