I am new to SSIS and am having a hard time finding information on this. My source is a SQL database and my target is a SQL database.
In my source I have a table with an FK reference to a different table. I want to extract the parent with its child from the source and load them into a single flat table in my target.
SELECT * FROM Table1 LEFT JOIN Table2 ON Table1.Table2Id = Table2.Id
Within my Data Flow I have an OLE DB Source where I am using the Data Access Mode of SQL Command with the query above.
However, I am looking to use Change Data Capture so I can track incremental changes rather than doing wipe-and-loads. From my understanding, Change Data Capture tracks changes per table, so if I set it up on Table1, it would only track the changes for Table1. How would I be able to import Table1 and Table2 into the flat table in my target with Change Data Capture?
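For context, CDC is indeed enabled per table, so capturing changes on both sides of the join means enabling it on each source table separately. A minimal sketch, assuming the dbo schema and the table names from the question (the role parameter is illustrative):

```sql
-- Enable CDC at the database level, then on each table involved in the join.
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Table1',
    @role_name     = NULL;   -- NULL: no gating role required to read changes

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Table2',
    @role_name     = NULL;
```

Each call creates a separate change table, so the package still has to reconcile changes from both feeds into the flat target.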
Related
I have 40 tables with different structures in one DB on one server that is being updated by a data provider.
I want to create an SSIS package that pulls data from that data provider DB and inserts, updates, or deletes (merges) data into the Development, Test, UAT, and Prod DBs.
The tables have 1M–3M rows and 20–30 columns each, and all the DBs are on the SQL Server platform, on different servers.
The business requirement is to load data every day at a particular time, and I have to use SSIS for this. I am new to SSIS and would like your suggestions for a better design.
I don't know about SSIS. There are packaged solutions to sync databases. In general, with just T-SQL, the pattern is delete, then update, then insert:
DELETE a
FROM tableA a
WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.PK = a.PK)

UPDATE a
SET ...
FROM tableA a
JOIN tableB b
  ON a.PK = b.PK

INSERT INTO tableA (columns)
SELECT columns
FROM tableB b
WHERE NOT EXISTS (SELECT 1 FROM tableA a WHERE b.PK = a.PK)
It's a very broad question, but I can help you with pointers. Follow them and ask questions when you get stuck. I'll describe the process for one table; you will have to do the same in parallel for the others:
Create a source OLE DB connection and a destination OLE DB connection. These will be used to copy from the source to the staging tables where the actual data warehouse sits.
Create a Data Flow task that simply copies the source DB to the staging tables. You'll have to implement incremental-loading logic; for instance, store the last source Id and load data from that Id onwards.
Once you have data in staging, create another Data Flow task where you apply a Lookup transformation to insert and update data while loading the destination table.
Deletion won't work here, so you'll have to apply deletions in a separate step (preferably via an Execute SQL task).
The steps above are guidelines. You'll have multiple sequence containers working in parallel, each containing the Data Flow tasks above and working on separate tables.
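The incremental-load and deletion ideas above can be sketched in T-SQL. This is only an illustration; the Staging/Source/Destination schema and column names are placeholders, and it assumes the source Id is monotonically increasing:

```sql
-- Step 2: load only rows newer than the highest Id already staged.
DECLARE @LastId INT;

SELECT @LastId = ISNULL(MAX(SourceId), 0)
FROM Staging.Table1;

INSERT INTO Staging.Table1 (SourceId, Col1, Col2)
SELECT Id, Col1, Col2
FROM Source.Table1
WHERE Id > @LastId;

-- Step 4: set-based deletion, run from an Execute SQL task,
-- removing destination rows that no longer exist in staging.
DELETE d
FROM Destination.Table1 d
WHERE NOT EXISTS (SELECT 1 FROM Staging.Table1 s WHERE s.SourceId = d.SourceId);
```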
I need to import sales data from an external source into an Access database. The system that generates the sales reports lets me export data within a specified date range, but that data may change due to updates, late-reported data, etc. I want to loop through each line of the CSV and see if that row already exists. If it does, ignore it; if it doesn't, add a new record to the sales table.
Unless I'm misunderstanding it, I don't believe I can use DoCmd.TransferText, as the data structure does not match the table I'm importing into: I am only importing several of the columns in the file.
What is my best option to (1) access the data within my file to loop through, and (2) compare the contents of a given row against a given table to see if it already exists?
Consider directly querying the CSV file with Access SQL, selecting the needed columns, and running one of the NOT IN / NOT EXISTS / LEFT JOIN ... IS NULL patterns to avoid duplicates:
INSERT INTO [myTable] (Col1, Col2, Col3)
SELECT t.Col1, t.Col2, t.Col3
FROM [text;HDR=Yes;FMT=Delimited(,);Database=C:\Path\To\Folder].myFile.csv t
WHERE NOT EXISTS
(SELECT 1 FROM [myTable] m
WHERE t.Col1 = m.Col1); -- ADD COMPARISON FIELD(S) IN WHERE CLAUSE
I am creating an SSIS package that will compare two tables and then insert data into another table.
Which tool should I use for that? I tried the Conditional Split, but it looks like it takes only one table as input, not two.
These are my tables:
TABLE1 (ID, Status)
TABLE2 (ID, Status)
TABLE3 (ID, Status)
I want to compare the Status field in both tables. If Status in TABLE1 is "Pending" and in TABLE2 is "Open", then insert that record into TABLE3.
If your tables are not large, you can use a Lookup transformation with Full Cache, but I wouldn't recommend it: if your tables grow, you will run into problems. I know I did.
I would recommend the Merge Join transformation. Your setup will include the following:
two data sources, one table each
two Sort transformations, as the Merge Join transformation needs sorted input; I guess you need to match records using ID, so that would be the sort criterion
one Merge Join transformation to connect both (left and right) data flows
one Conditional Split transformation to detect whether your tables contain the correct statuses
any additionally needed transformations (e.g. a Derived Column to introduce data you have to insert into your destination table)
one data destination to insert into the destination table
This should help, as the article explains an almost identical problem and solution.
I managed to do it by using the Execute SQL Task and writing the following query in it:
INSERT INTO TABLE3 (ID, Status)
SELECT t1.ID, t1.Status
FROM TABLE1 t1
JOIN TABLE2 t2 ON t1.ID = t2.ID
WHERE t1.Status = 'Pending' AND t2.Status = 'Open'
I think this is what you are looking for. If both tables are SQL tables, follow the steps below:
Drag a Data Flow task.
Edit the Data Flow task, add an OLE DB Source, and paste the SQL code below into its SQL command.
Add an OLE DB Destination and map the columns to TABLE3.
SQL code:
SELECT b.id, b.status
FROM table1 a
JOIN table2 b ON a.id = b.id
WHERE a.status = 'Pending' AND b.status = 'Open'
I think this will work for you.
In my form I have TADOQuery, TDataSetProvider, TClientDataSet, TDataSource, and TDBGrid linked together.
The ADOQuery uses a SQL Server view to query the data.
AdoQuery.SQL:
SELECT * FROM vu_Name WHERE fld = :fldval
Vu_Name:
SELECT * FROM t1 INNER JOIN t2 ON t1.fld1 = t2.fld1
In my DBGrid, only the columns from table t1 are editable (only t1 needs to be updated).
What are the possible (and fastest) ways to apply updates back to the server?
ClientDataSet.ApplyUpdates(0); // not working
Thank you.
TDataSetProvider has an OnGetTableName event where you should set the TableName parameter to t1. That way the provider knows where to store the changed values.
You have to make sure that only fields of t1 are changed, as TDataSetProvider will only update one table. Of course, you can use different table names for different calls to ApplyUpdates. You can find out which fields changed via the DataSet parameter.
If you want to update more than one table, you have to implement OnUpdateData, which gives you all of the freedom along with all of the responsibility.
For some background... I have a collection of tables, and I would like a trigger on each table for INSERT, UPDATE, DELETE. SQL Server version is SQL 2005.
I have an audit table Audit that contains a column called Detail. My end goal is to create a trigger that will get the list of columns of its table, generate a dynamic SELECT query against the Inserted or Deleted pseudo-table, do some string concatenation, and dump that value into the Detail column of Audit.
This is the process I was thinking:
Get the column names for the table from sys.columns
Generate a dynamic SQL SELECT query based on the column names
Select from Inserted
For each row in the results, concatenate the column values into a single variable
Insert the variable's data into the Detail column
So, the questions:
Is this the best way to accomplish what I'm looking to do? And the somewhat more important question: how do I write this query?
You could use FOR XML for this, and just store the results as an XML document.
SELECT *
FROM Inserted
FOR XML RAW
will give you attribute-centric XML, and
SELECT *
FROM Inserted
FOR XML PATH('row')
will give you element-centric XML. Much easier than trying to identify the columns and concatenate them.
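Putting the pieces together, a trigger along these lines would capture both the new and the old rows. This is a sketch, not a definitive implementation: the trigger and table names are illustrative, and it assumes Audit.Detail is an XML or NVARCHAR(MAX) column (FOR XML with ROOT and TYPE is available on SQL 2005):

```sql
CREATE TRIGGER trg_MyTable_Audit
ON dbo.MyTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @detail XML;

    -- New/updated values (NULL for a pure DELETE).
    SET @detail = (SELECT * FROM inserted FOR XML RAW, ROOT('inserted'), TYPE);
    IF @detail IS NOT NULL
        INSERT INTO dbo.Audit (Detail) VALUES (@detail);

    -- Old values (NULL for a pure INSERT).
    SET @detail = (SELECT * FROM deleted FOR XML RAW, ROOT('deleted'), TYPE);
    IF @detail IS NOT NULL
        INSERT INTO dbo.Audit (Detail) VALUES (@detail);
END
```

Because the XML serialization picks up whatever columns the pseudo-tables have, the same trigger body can be reused on each table without any sys.columns lookup or dynamic SQL.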