Converting data in column in SSIS - sql-server

I'm writing an SSIS package to load data from a .csv file into a database.
One column in the CSV file is supposed to hold a count, but some records contain text instead, so I can't load the column directly as an integer. It looks something like this:
I want the data to land in the database destination as an integer rather than a string. The transformation should change any text to 1, any blank value to 1, and leave all other numbers as-is.
So far I have tried the Derived Column transformation, for which I couldn't find the right expression(s), and a temp table that I could run a SQL query against, which kept breaking my data flow.

There are three approaches you can follow.
(1) Using a derived column
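Add a Derived Column with the following expression, which attempts the integer cast and so checks whether the value is numeric. Both branches of the conditional must return the same data type, so the true branch casts the value as well:
(DT_I4)[count] == (DT_I4)[count] ? (DT_I4)[count] : 1
Then, in the Derived Column editor, open the error output configuration and set the error handling to Ignore failure. When the cast fails, the new column is set to NULL instead of failing the row.
Now add a second Derived Column to replace those NULL values with 1:
REPLACENULL([count_derivedcolumn],1)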
You can refer to the following article for a step-by-step guide:
Validate Numeric or Non-Numeric Data in SQL Server Integration Services without the Script Task
(2) Using a script component
If you know C# or Visual Basic .NET, you can add a Script Component that checks whether the value is numeric and replaces nulls and non-numeric strings with 1.
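As a rough illustration of the rule such a script would apply, here is a minimal sketch in Python rather than C#. The function name clean_count is hypothetical, and treating only plain whole numbers (with an optional sign) as numeric is an assumption; the same logic would have to be ported to the Script Component's row-processing code.

```python
import re

# Assumption: a "number" is an optional sign followed by ASCII digits only.
_WHOLE_NUMBER = re.compile(r"[+-]?[0-9]+")


def clean_count(raw):
    """Return the count as an int; text, blanks and missing values become 1."""
    if raw is None:
        return 1
    text = raw.strip()
    if _WHOLE_NUMBER.fullmatch(text):
        return int(text)
    return 1


if __name__ == "__main__":
    for value in ["5", "abc", "", None, " 12 "]:
        print(repr(value), "->", clean_count(value))
```

Values such as "3.5" fall into the text branch here; whether decimals should be rounded or replaced is not stated in the question.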
(3) Update data in SQL
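You can stage the data in its original form in the SQL database and then use an update query to replace nulls and non-numeric strings with 1, as follows:
UPDATE [staging_table]
SET [count] = 1
WHERE [count] IS NULL OR ISNUMERIC([count]) = 0
Note that ISNUMERIC also returns 1 for some values that cannot be cast to an integer, such as '$' or '1e5', so check the result before converting the column.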

Related

Scientific Notation Issue while loading data from Excel (xlsx) file to SQL Tables via SSIS

I'm loading data from an Excel file (.xlsx) into a SQL table using an SSIS package. One column shows some values in scientific notation; they already appear that way in the Excel file. The actual value is not loaded into the SQL table. I tried several Derived Column expressions, but I couldn't get the proper value.
This column holds both numeric and nvarchar values. Below is an example of the column.
ApplicationNumber
1.43E+15
923576663
25388447
TXY020732087
18794588
TXAP0000140343
Actual values:
ApplicationNumber
1425600000000000
923576663
25388447
TXY020732087
18794588
TXAP0000140343
There is no issue with the data coming from the business into Excel. But how can we handle this scenario in SSIS?
I also tried (DT_I8)ApplicationNumber == (DT_I8)ApplicationNumber, but for the value above it gives:
1.43E+15 -> 1.430000000000000 and not 1425600000000000
One thing you can do is set the output column type in the Advanced Editor of the Excel Source to decimal with a large scale, 20 digits for example:
UPDATE
To also handle the strings in the same column, you may need to redirect the error output, because those values will throw a conversion error:
In the Advanced Editor:
Default output:
Error output:
Then you can update your database from both the default and the error output.
I faced this problem recently using SSIS too.
1- Change the column type in Excel to "Number"
2- Remove the decimal positions.
3- Upload the file using SSIS
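To illustrate why the displayed text cannot simply be cast back to the original number, here is a small Python sketch (not SSIS code). It assumes the cell really stores 1425600000000000, as stated in the question, and that it is displayed with three significant digits; the variable names are made up for the example.

```python
from decimal import Decimal

actual = 1425600000000000  # value stored in the cell, taken from the question

# Formatting in scientific notation with two decimals rounds the mantissa.
displayed = format(actual, ".2E")

# Parsing the displayed text back cannot restore the digits that were rounded away.
round_tripped = int(Decimal(displayed))

if __name__ == "__main__":
    print(displayed)
    print(round_tripped)
    print(round_tripped == actual)
```

Once the package reads the formatted text instead of the stored number, no later expression can recover the lost digits, which is why both answers change how the column is typed before it is loaded.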

Pentaho: Cannot import Boolean Value to table with PostgreSQL Bulk Loader?

I am currently trying to import some data from a CSV file into a PostgreSQL database. I use the CSV file input step to read the file into Kettle. Next, I use the Modified Java Script Value step to alter some values and to add a new column named VALID, which should always be true. I added the VALID column to the fields in the lower half of the step window. My step looks as follows:
To load the data from Kettle into the PostgreSQL table, I use the PostgreSQL Bulk Loader step (there are millions of rows to import). This step looks as follows:
As you can see in this image, the table column name is valid and the stream field is VALID (coming from the Modified Java Script Value step). Both are boolean, so it should work. Instead, I get the following error message when I run the transformation:
2018/02/12 14:52:50 - PostgreSQL Bulk Loader.0 - Caused by: org.postgresql.util.PSQLException: ERROR: invalid input syntax for type boolean: "1.0"
Wobei: COPY adac_test, line 1, column valid: "1.0"
Any suggestions on how to fix this?
Cast "valid" to a string. PostgreSQL booleans are not really strings, but boolean values can be supplied as string literals such as 't':
https://www.postgresql.org/docs/9.6/static/datatype-boolean.html
Based on this documentation, this should work:
var VALID = 't';
Then select type String instead of Boolean for the field below the code window.

How to pass result of Conditional Split to variable?

I have a flat file and use a Conditional Split to filter the records down to a single row. For example, RecordType == "2" retrieves a single row with multiple columns, say A, B, C, D and E. I want to pass the value of column C to a variable and then use it to update the table like this:
Update tablename
Set A = that variable
Where A is null
Could you please help me find a solution?
I would not use a variable; I would use an OLE DB Command instead.
Set the connection.
Then add your SQL from above, with a parameter marker in place of the variable:
Update tablename Set A = ? Where A is null
Then map the parameter to column C.
However, my guess is that what you are really trying to do is add a column to your other record set, the one that has the detail but no key.
I would use a script component to do this:
Similar to this example:
Importing Grouped Report Data to Database

Redirect NULL or blank values from Flat File

I am importing records from a flat file source to a SQL table which has 4 columns which do not accept NULL values. And what I would like to do is redirect the records which contain a NULL or blank value for the particular 4 fields to a flat file destination.
Below you can see the table configuration:
And here is a sample from my flat file source where I have blanked out the county_code in the first record, the UCN in the second record, and the action_id in the third.
If I run my package as it is currently configured, it errors out due to the constraints:
The column status returned was: "The value violated the integrity constraints for the column.".
So my question is: how do I redirect these rows? I think I should use a Conditional Split, but I am not certain, and I don't know how I would configure it. My attempts have been futile so far.
Any suggestions?
Add a Derived Column Transformation after your Flat File Source. There you'll test whether the non-nullable columns are null.
For ease of debugging, I would add a flag column for each of the columns in question, for example a column named null_timestamp with the expression:
(ISNULL(timestamp) || LEN(RTRIM(timestamp)) == 0) ? true : false
An expression like this determines whether the column from the flat file is null or its trimmed length is zero.
Once you have your flags, add a Conditional Split. The Conditional Split routes rows based on a boolean expression. I would add a Bad Data output to it and use an expression like
null_timestamp || null_county_code || null_etc
Because the flags are boolean, OR-ing them together makes the whole expression true if any one of them is true, and those rows are routed to the Bad Data path.
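The flag-then-split logic can be sketched outside SSIS as follows. This is Python, not an SSIS expression; the column list is an assumption pieced together from the question and this answer (the table definition itself is not shown), and is_blank and split_rows are hypothetical names.

```python
# Assumption: these are the four NOT NULL columns; adjust to the real table.
REQUIRED_COLUMNS = ("timestamp", "county_code", "UCN", "action_id")


def is_blank(value):
    """Mirror ISNULL(x) || LEN(RTRIM(x)) == 0: missing, empty, or only spaces."""
    return value is None or len(value.rstrip(" ")) == 0


def split_rows(rows):
    """Route each row (a dict) to the good or bad list, like a Conditional Split."""
    good, bad = [], []
    for row in rows:
        # OR the per-column flags together: one blank required column is enough.
        if any(is_blank(row.get(col)) for col in REQUIRED_COLUMNS):
            bad.append(row)
        else:
            good.append(row)
    return good, bad
```

In the package, the good list corresponds to the default output that feeds the SQL destination, and the bad list to the Bad Data output that feeds the flat file destination.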
I would simply add a conditional split and name the Output accordingly:
Could you load the data into a temp table first, then use two separate queries against the temp table to either insert into your table or write out to a flat file?

Possible to replace digits in T-SQL

SQL Server 2008 (but have access to higher versions too)
I'm getting a string from another database on the same server. Using the code below, I get some data and replace part of the content:
INSERT INTO [DestinationDatabase].[DBO].[Table](ID, XML)
SELECT ID, REPLACE(XML,'ReferenceID="1234"','PropertyID="2468"')
FROM [SourceDatabase].[DBO].[Customers]
This works as expected, but every record has a different ReferenceID. Is there a way to remove the current ReferenceID value, i.e. the 4 digits (there are around 1000 records with different values), and replace it with another 4-digit value?
I will get the replacement value from another procedure, but at this stage I need to know whether it is possible to find and strip the 4 digits and replace them.
If you want to use the REPLACE function, you can do it like this, building the search string from a column that holds each record's current value:
REPLACE(XML,'ReferenceID="'+cast(table.field as nvarchar)+'"','ReferenceID="2468"')
REPLACE(XML,'ReferenceID="'+cast(table.field as nvarchar)+'"','ReferenceID="'+cast(table.another_field as nvarchar)+'"')
You could also use the XML functions, but it seems your XML column is not of the xml data type. Is that correct?
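If the current value is not available in a column, the search has to match any four digits, which REPLACE alone cannot express because it only matches literal text. The idea is sketched below in Python with a regular expression; it illustrates the pattern and is not T-SQL, and replace_reference_id is a hypothetical helper name.

```python
import re

# Matches the attribute with exactly four ASCII digits as its value.
_REFERENCE_ID = re.compile(r'ReferenceID="[0-9]{4}"')


def replace_reference_id(xml_text, new_value):
    """Swap whatever 4-digit ReferenceID is present for new_value."""
    replacement = 'ReferenceID="{}"'.format(new_value)
    # A function is used as the replacement so new_value is inserted literally.
    return _REFERENCE_ID.sub(lambda match: replacement, xml_text)


if __name__ == "__main__":
    sample = '<Customer ReferenceID="1234" Name="A"/>'
    print(replace_reference_id(sample, "2468"))
```

Values that are not exactly four digits are left untouched, which makes it easy to spot records that do not follow the expected format.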

Resources