I am going to keep two databases sync with each other by using CDC component in SSIS (For VS 2013). So far I have created a SSIS Package which transfers all data from one to the other. However, I have some difficulties with incremental loading of records.
The problem is I have about 30 tables in my database. Although I can incrementally move data for a single table like the following, but it is really time-consuming to create this process for each individual table. Is there any workaround to do the same procedure for all tables automatically?
Figure 1: Incremental loading for one table.
Figure 2: For each table I have to write Update and Delete commands manually as depicted.
Related
Hi I am using SSIS (MSSQL) to copy data between multiple tables. This has been working fine up until recently when the S.A.P. team keeps updating the schema of the tables without telling me.
I have multiple tables that they continue to add columns to; this in turn makes my SSIS job of copying the data across fail.
Is there a way in SSIS that I can look at the source table and adjust my destination table to reflect the changes on the fly?
I'm quite new at SSIS and don't mind running a script out of the GUI but wondered if this was an option within the GUI I'm already familiar with.
So in short, can I in SSIS allow for new columns being added to source tables and update my destination tables automatically to stop my jobs failing
(Oh and map source to destination tables automatically)?
You'll have to include the new columns in the data flow, i.e. source and destination (include and map them). So basically you CANNOT automate what you're looking for in SSIS. Hope it helps.
Look into BiML Script, which lets you create and execute SSIS packages dynamically based on the meta data available at run time.
During our SQL Server database deployments, we create a temporary table which contains the new desired state of data for a particular table. We then merge the temp table into the target table (we actually use individual insert, update and delete statements, but that's probably not relevant). The inserts/updates/deletes performed are captured and written out to a log.
We would like to be able to report on what changes would be applied by a deployment, without actually applying them. This is currently done by rolling back the transaction at the end of the above process. This doesn't feel particularly great though.
Now what we are thinking of doing is, instead of performing the changes and rolling them back, we will generate a migration script for the table (generate some SQL code that performs the necessary inserts, updates and deletes). If we want to do the actual deployment, this code will be dynamically executed. If not, the code will just be printed to a log.
It shouldn't take long to put together some code which can generate migration scripts for two specified tables, but I first wanted to verify that there isn't already an existing tool which can do this?
Searching on Google, I can find lots of talk about migrating whole databases, but nothing about generating a data migration script to effectively merge one table into another.
So my question is, does anyone know of such a tool?
There are several data compare tools like:
SQL Data Compare from Red Gate
SQL Server Data Tools
dbForge Data Compare from Devart
Is that what you're looking for?
I'm currently trying to build a data flow in SSIS to select all records from a mapping table where an ID column exists in the related Item table. There are two complications:
The two tables are currently in different databases on different servers.
The databases are in Azure, for which I've read Linked Servers are not supported.
To be more clear, the job to migrate data from Staging environment to Production. I only want to push lookup records into prod if the associated Item IDs are in there. Here's some psudo-TSQL to give a clear goal of what I'm trying to achieve:
SELECT *
FROM [Staging_Server].[SourceDB].[dbo].[Lookup] L
WHERE L.[ID] IN (
SELECT P.[Item]
FROM [Production_Server].[TargetDB].[dbo].[Item] P
)
I haven't found a good way to create this in SSIS. I think I've created a work-around that involves sorting both tables and performing a merge join, but sorting both sides is an unnecessary hit on performance. I'm looking for a more direct and intuitive design for this seemingly simple data flow.
Doing this in a data flow, you'd have your Source query, sans filter, fed into a Lookup Component which is the subquery.
The challenge with this is SSIS is likely on-premises so that means you are going to pull all of your data out of Stage Azure to the server running SSIS and push it back to the Prod Azure instance.
That's a lot of network activity and as I'm reading the Azure pricing guide, I guess as long as you have the appropriate DTUs, you'd be fine. Back in the day, you were charged for Reads and not Writes so the idiom was to just push all your data to target server and then do the comparison there, much as ElendaDBA mentions. Only suggestion I'd make on the implementation is to avoid temporary tables or ad-hoc creation/destruction of them. Just implement as a physical table and truncate and reload prior to transmission to production.
You could create a temp table on staging server to copy production data into. Then you could create a query joining those two tables. After SSIS package runs, you could delete the temp table on staging server
We have Restaurant Inventory Control system that uses SQL Server 2008 R2.
It takes a very long time to add all the definition data: stock items, yields, packsizes, recipes, categories etc. So, our clients have asked if they can upload it from Excel.
Before I just jump in and start, I want to find out if there is a best practice way to do this.
I know all the tools: SSIS, stored procedures etc. But I'm looking for advice/resources that can help with the design process. How best to setup the spreadsheet, validate the data, create the child/parent relationships etc.
This must be a fairly common project -- so it must have a standard design/approach and that's what I'm looking for.
I think the design will depend on the technologies you're most comfortable with. If you're comfortable with SSIS and stored procedures, this is the general pattern I would use:
Excel Template - I wouldn't spend too much time on this, add the headers and sheets necessary for the tables. You can lock down certain things and/or implement rules, but most of your validation would be done in stored procs.
SSIS - Have a package that loads the excel data into Staging tables, have rows with errors get added to an error log to be presented to the user along with the validation issues from the stored procedures.
Staging Tables - Have one staging table per sheet/production table, have an ExecutionId column in each staging table to allow parallel processing. Allow all columns to be NULL so you can get the data in the staging tables or set the proper null conditions and have SSIS redirect these rows on error. Don't have any primary key / foreign key relationships in the staging tables, these can be validated in the stored procedure
Stored Procedures - Validate the staging data, any issues found would be added to the error log to be presented to the user or person performing the import. If there are no issues, import the data into the production tables. If there is existing data in the production tables, you could do a comparison and update if applicable.
One question though lets say publisher database had 100 tables and I use Transactional Replication to move the data from those 100 tables to Subscriber Database that would be fine.
But lets say I don't want the 100 tables but i want to create 3-4 Views which contain the key information I want from those 100 tables. How Would I achieve this.
1) Firstly I guess the views need to be created on the publisher database
2) Secondly Do i need to create then 3/4 Tables in the Subscriber database which have the same columns as the view from publisher database.
3) What sort of replication or maybe even SSIS or something to move the data from the publisher view to subscriber database
Replication probably wouldn't be viable or as performant an option as creating a SSIS package for transferring data from those views and into the small set of tables in the remote database. SSIS's strongest feature is it's ability to transfer large volumes of data quickly from a source and into a destination. With a little upkeep, you could potentially just transfer the differences between the two databases at some scheduled interval and have a fairly flexible solution.
SSIS will be the better solution. You would create the tables on your target database. Then, you can create the SSIS pacakge(s) to populate the target tables.
SSIS can use queries on tables or views. And, it can also execute a stored procedure to retrieve the data.