How to compare two tables in SSIS? (SQL Server) - sql-server

I am creating an SSIS package that will compare two tables and then insert data in another table.
Which tool shall I use for that? I tried to use "Conditional Split" but it looks like it only takes one table as input and not two.
These are my tables:
TABLE1
ID
Status
TABLE2
ID
Status
TABLE3
ID
STatus
I want to compare STATUS field in both tables. If Status in TABLE1 is "Pending" and in TABLE2 is "Open" then insert this record in TABLE3.

If your tables are not large you can use a Lookup transformation with Full Cache, but I wouldn't recommend it because if your tables grow you will run into problems. I know I did.
I would recommend Merge Join transformation. Your setup will include following:
two data sources, one table each
two Sort transformations, as Merge Join transformation needs sorted input; I guess you need to match records using ID, so this would be a sort criteria
one Merge Join transformation to connect both (left and right) data flows
one Conditional Split transformation to detect if there are correct statuses in your tables
any additionally needed transformation (e.g. Derived Column to introduce data you have to insert to your destination table)
one data destination to insert into destination table
This should help, as the article explains the almost exact problem/solution.

I managed to do it by using Execute SQL Task tool and writing the following query in it.
INSERT INTO TABLE3 (ID, Status)
SELECT * FROM TABLE1 t1, TABLE2 t2
WHERE t1.ID = t2.ID and t1.status = 'Pending' and t2.status = 'Open'

i think so this is what you are looking for.?
In your case if both the tables are Sql tables then follow the steps below
Drag dataflow task
Edit dataflow task add Oledb source and in sql command paste the below sql
code
add oledb destination and map the columns with table3
sql code
select b.id,b.status
from table1 a
join table2 b on a.id = b.id
where a.status = 'Pending' and b.status = 'open'
I think this will work for you.

Related

Need help on a sql query to get data from multiple tables

I have a couple of tables that have data in them that I am looking to get information from. Here is the rundown....In table 1 I have bunch of columns that I am pulling data from, one of the columns is a user ID (which is a number)that was the last userID to modify a record. In table 2 I want to pull in the name of that user based on the ID that is pulled from the other table (this table has both the userID and the username).
so my final query would have the columns in table 1 as well as the username from table 2 to show that was the user to last edit the record. I assume this has to be done in a nested select statement but for the life of me I cannot come up with the correct syntax.
Can anyone help me out?
Thanks
Jeff
Yes, you need a very basic join that link both tables together.
Select t1.UserID,
t2.UserName
FROM table1 t1 INNER JOIN
table2 t2 ON t1.userid=t2.userid
select t1.*, t2.{username} from table1 as t1
join table2 as t2 on t1.{userId}=t2.{userid};
change {username} with the actual column name of user
similarly {userId} with appropriate column name in tables.
Hope it helps you.
this is standard inner join query, to learn more consider reading: http://www.w3schools.com/sql/

update and join tables without a where

I'm trying to get the relationships of two databases straighten out and I have an update clause like this
update dbo1.table1
set Relationship_column = table2.id
from dbo1.table2 as table2
inner join dbo2.table3 as Table3
on table2.number = Table3.number
Basically if I join two tables and update a 3rd table based on the results. This works on SSIS because you can just remap the entire table, I dont want to do that.
Of course this does not work in SQL because I need a WHERE clause else each record is repeated on all columns.Is there a work around to this??

Proper way to filter a table using values in another table in MS Access?

I have a table of transactions with some transaction IDs and Employee Numbers. I have two other tables which are basically just a column full of transactions or employees that need to be filtered out from the first.
I have been running my query like this:
SELECT * FROM TransactionMaster
Where TransactionMaster.TransID
NOT IN (SELECT TransID from BadTransactions)
AND etc...(repeat for employee numbers)
I have noticed slow performance when running these types of queries. I am wondering if there is a better way to build this query?
If you want all TransactionMaster rows which don't include a TransID match in BadTransactions, use a LEFT JOIN and ask for only those rows where BadTransactions.TransID Is Null (unmatched).
SELECT tm.*
FROM
TransactionMaster AS tm
LEFT JOIN
BadTransactions AS bt
ON tm.TransID = bt.TransID
WHERE bt.TransID Is Null;
That query should be relatively fast with TransID indexed.
If you have Access available, create a new query using the "unmatched query wizard". It will guide you through the steps to create a similar query.

UPDATE query from OUTER JOINed tables or derived tables

Is there any way in MS-Access to update a table where the data is coming from an outer joined dataset or a derived table? I know how to do it in MSSQL, but in Access I always receive an "Operation must use updateable query" error. The table being updated is updateable, the source data is not. After reading up on the error, Microsoft tells me that the error is caused when the query would violate referential integrity. I can assure this dataset will not. This limitation is crippling when trying to update large datasets. I also read that this can supposedly be remedied by enabling cascading updates. If this relationship between my tables is defined in the query only, is this a possibility? So far writing the dataset to a temp table and then inner joining that to the update table is my only solution; that is incredibly clunky. I would like to do something along the lines of this:
UPDATE Table1
LEFT JOIN Table2 ON Table1.Field1=Table2.Field1
WHERE Table2.Field1 IS Null
SET Table1.Field1= Table2.Field2
or
UPDATE Table1 INNER JOIN
(
SELECT Field1, Field2
FROM Table2, Table3
WHERE Field3=’Whatever’
) AS T2 ON Table1.Field1=T2.Field1
SET Table1.Field1= T2.Field2
Update Queries are very problematic in Access as you've been finding out.
The temp table idea is sometimes your only option.
Sometimes using the DISTINCTROW declaration solves the problem (Query Properties -> Unique Records to 'Yes'), and is worth trying.
Another thing to try would be to use Aliases on your tables, this seems to help out the JET engine as well.
UPDATE Table3
INNER JOIN
(Table1 INNER JOIN Table2 ON Table1.uid = Table2.uid)
ON
(Table3.uid = Table2.uid)
AND
(Table3.uid = Table1.uid)
SET
Table2.field=NULL;
What I did is:
1. Created 3 tables
2. Establish relationships between them
3. And used the query builder to update a field in Table2.
There seems to be a problem in the query logic. In your first example, you LEFT JOIN to Table2 on Field1, but then have
Table2.Field1 IS NULL
in the WHERE clause. So, this limits you to records where no JOIN could be made. But then you try and update Table 1 with data from Table2, despite there being no JOIN.
Perhaps you could explain what it is you are trying to do with this query?

UPSERT in SSIS

I am writing an SSIS package to run on SQL Server 2008. How do you do an UPSERT in SSIS?
IF KEY NOT EXISTS
INSERT
ELSE
IF DATA CHANGED
UPDATE
ENDIF
ENDIF
See SQL Server 2008 - Using Merge From SSIS. I've implemented something like this, and it was very easy. Just using the BOL page Inserting, Updating, and Deleting Data using MERGE was enough to get me going.
Apart from T-SQL based solutions (and this is not even tagged as sql/tsql), you can use an SSIS Data Flow Task with a Merge Join as described here (and elsewhere).
The crucial part is the Full Outer Join in the Merger Join (if you only want to insert/update and not delete a Left Outer Join works as well) of your sorted sources.
followed by a Conditional Split to know what to do next: Insert into the destination (which is also my source here), update it (via SQL Command), or delete from it (again via SQL Command).
INSERT: If the gid is found only on the source (left)
UPDATE If the gid exists on both the source and destination
DELETE: If the gid is not found in the source but exists in the destination (right)
I would suggest you to have a look at Mat Stephen's weblog on SQL Server's upsert.
SQL 2005 - UPSERT: In nature but not by name; but at last!
Another way to create an upsert in sql (if you have pre-stage or stage tables):
--Insert Portion
INSERT INTO FinalTable
( Colums )
SELECT T.TempColumns
FROM TempTable T
WHERE
(
SELECT 'Bam'
FROM FinalTable F
WHERE F.Key(s) = T.Key(s)
) IS NULL
--Update Portion
UPDATE FinalTable
SET NonKeyColumn(s) = T.TempNonKeyColumn(s)
FROM TempTable T
WHERE FinalTable.Key(s) = T.Key(s)
AND CHECKSUM(FinalTable.NonKeyColumn(s)) <> CHECKSUM(T.NonKeyColumn(s))
The basic Data Manipulation Language (DML) commands that have been in use over the years are Update, Insert and Delete. They do exactly what you expect: Insert adds new records, Update modifies existing records and Delete removes records.
UPSERT statement modifies existing records, if a records is not present it INSERTS new records.
The functionality of UPSERT statment can be acheived by two new set of TSQL operators. These are the two new ones
EXCEPT
INTERSECT
Except:-
Returns any distinct values from the query to the left of the EXCEPT operand that are not also returned from the right query
Intersect:-
Returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.
Example:- Lets say we have two tables Table 1 and Table 2
Table_1 column name(Number, datatype int)
----------
1
2
3
4
5
Table_2 column name(Number, datatype int)
----------
1
2
5
SELECT * FROM TABLE_1 EXCEPT SELECT * FROM TABLE_2
will return 3,4 as it is present in Table_1 not in Table_2
SELECT * FROM TABLE_1 INTERSECT SELECT * FROM TABLE_2
will return 1,2,5 as they are present in both tables Table_1 and Table_2.
All the pains of Complex joins are now eliminated :-)
To use this functionality in SSIS, all you need to do add an "Execute SQL" task and put the code in there.
I usually prefer to let SSIS engine to manage delta merge. Only new items are inserted and changed are updated.
If your destination Server does not have enough resources to manage heavy query, this method allow to use resources of your SSIS server.
We can use slowly changing dimension component in SSIS to upsert.
https://learn.microsoft.com/en-us/sql/integration-services/data-flow/transformations/configure-outputs-using-the-slowly-changing-dimension-wizard?view=sql-server-ver15
I would use the 'slow changing dimension' task

Resources