I need to sync data from several tables in a legacy SQL Server db (source) to a single table in a Postgres db (target). The schema of the source db is absurd, so the query to select the data takes a very long time to run. I'm planning to create an indexed view in the source db, and then somehow sync that indexed view to the Postgres table.
Right now, I simply have a scheduled task that drops the Postgres table (target) and then recreates it from scratch by running the complex query in the source db. This was quick to set up, and it ensures that changes in the source db always eventually make it to the target db, but recreating the table every few hours is (understandably) very slow and expensive. I need a way to replicate ongoing changes (only the new/updated data) from the source view to the target table. Is there a (relatively) simple way to do this?
I'm somewhat familiar with CDC, but I understand that CDC cannot be used on a view, so I don't believe that's an option. Adding "updated at" timestamps to the source tables is not an option, so I can't use that approach. I could add a hash column to the source tables, or maybe add a hash column to the view, so that's an option if that would work. Is there an existing tool/service that does what I need?
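To make the hash idea concrete, this is roughly what I have in mind on the SQL Server side (the key and column names below are made up); the target would store the hash next to each row and only upsert rows whose key is new or whose hash changed:

    -- Rough sketch: expose a per-row fingerprint from the extract query/view.
    -- SomeKey, ColA and ColB stand in for the real key and columns.
    SELECT
        v.SomeKey,
        v.ColA,
        v.ColB,
        HASHBYTES(
            'SHA2_256',
            CONCAT(v.ColA, '|', v.ColB)  -- beware NULLs and delimiter collisions
        ) AS row_hash
    FROM dbo.ComplexExtractView AS v;

On the Postgres side, rows whose (key, row_hash) already match would be skipped, changed hashes would become updates, and keys missing from the extract would become deletes.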
If you want to view SQL Server DB data in PostgreSQL, then you can also use tds_fdw.
https://github.com/tds-fdw/tds_fdw
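A rough sketch of what that looks like on the Postgres side (server, credential, column and view names are all placeholders):

    -- Run in Postgres; all names below are placeholders.
    CREATE EXTENSION IF NOT EXISTS tds_fdw;

    CREATE SERVER mssql_src
        FOREIGN DATA WRAPPER tds_fdw
        OPTIONS (servername 'sqlserver.example.com', port '1433', database 'LegacyDb');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER mssql_src
        OPTIONS (username 'readonly_user', password 'secret');

    -- Expose the SQL Server view as a foreign table; you can then
    -- INSERT ... SELECT from it into the local target table.
    CREATE FOREIGN TABLE legacy_extract (
        some_key integer,
        col_a    text,
        col_b    text
    )
    SERVER mssql_src
    OPTIONS (schema_name 'dbo', table_name 'ComplexExtractView');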
Also, there are some third-party tools which could help you to achieve your goal, for example, SymmetricDS
http://www.symmetricds.org/about/overview
I've created a Data Fusion replication job to replicate some tables on a test database.
It works well at the beginning if I don't change the table schemas. But I've added a new column, and that column is ignored by the replication job. I guess that if I created a new table, even that table would be ignored.
Is there a way to include schema updates (a new table, an updated column, a new column, etc.) in an already running Data Fusion replication job?
I guess a possible solution would be to stop the currently running job and create a new one that includes the new tables, new columns, etc., but I'd like to avoid having the new job replicate the whole database again.
Any possible solution?
Unfortunately, Data Fusion Replication for SQL Server currently does not support DDL propagation at runtime; you will need to delete and recreate the replication pipeline in order to propagate any schema changes to the BigQuery table.
One way to avoid replicating the existing data after a DDL change is to manually modify the BigQuery table schema (though BigQuery also has limited support for schema changes) and then create a new replication job with replication of existing data disabled (there is an option that lets you choose whether to replicate existing data; it defaults to true).
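For the manual schema tweak, a minimal sketch in BigQuery (project, dataset, table and column names are placeholders; BigQuery only allows additive changes such as adding a nullable column this way):

    -- BigQuery SQL; names are placeholders.
    ALTER TABLE `my_project.my_dataset.replicated_table`
        ADD COLUMN new_column STRING;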
I have a production database with 20 TB of data. We migrated our database from Oracle to SQL Server. Our old application was based on a COBOL-based platform. After migrating to SQL Server, the indexes are giving good results.
I am creating a schema with a new set of indexes, without any data. Now I want to migrate only the data.
The Import/Export utility will take a long time and will also fill up the log files. Is there any other alternative?
My advice would be:
Set the recovery model to simple.
Remove the indexes.
Batch insert the rows or use select into (this minimizes logging).
Re-create the indexes.
I admit that I haven't had to do this sort of thing in a long time in SQL Server. There may be other methods that are faster -- such as backing up a table space/partition and restoring it in another location.
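A rough T-SQL outline of those steps (database, table, column names and the batch size are placeholders, and minimal logging also depends on things like TABLOCK and the state of the target table):

    -- Sketch only; adapt names, columns and batch size.
    ALTER DATABASE TargetDb SET RECOVERY SIMPLE;

    -- (Drop or disable the nonclustered indexes on the target here.)

    -- Copy in batches keyed on an increasing Id so each transaction stays
    -- small and the log can be reused between batches.
    DECLARE @LastId bigint = 0, @BatchSize int = 100000, @Rows int = 1;

    WHILE @Rows > 0
    BEGIN
        INSERT INTO TargetDb.dbo.BigTable WITH (TABLOCK) (Id, Col1, Col2)
        SELECT TOP (@BatchSize) Id, Col1, Col2
        FROM SourceDb.dbo.BigTable
        WHERE Id > @LastId
        ORDER BY Id;

        SET @Rows = @@ROWCOUNT;

        SELECT @LastId = MAX(Id) FROM TargetDb.dbo.BigTable;
    END

    -- Re-create the indexes afterwards.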
You may use the bcp utility to import/export data. For full details see https://learn.microsoft.com/en-us/sql/tools/bcp-utility?view=sql-server-2017
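For example (server, path and table names are placeholders; -T uses a trusted connection, -n keeps native format, -b sets the batch size for the load):

    bcp SourceDb.dbo.BigTable out C:\dump\BigTable.dat -S SourceServer -T -n
    bcp TargetDb.dbo.BigTable in C:\dump\BigTable.dat -S TargetServer -T -n -b 100000 -h "TABLOCK"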
Could you please suggest the easiest way to programmatically (not via UI) generate a script to migrate specific tables (schema, constraints, indexes) to another database on a separate server.
I am aware of replication, SSIS, the Generate Scripts feature, the backup/restore approach, and the SQL Import/Export wizard. However, all of these approaches either require at least some UI interaction, don't allow copying constraints, or don't allow migrating only part of the data.
The database where I will be putting the data will be kept in sync with the main DB, so it is possible to just wipe the existing data in it and overwrite it with the schema and data from the main DB.
From Comment: I need to migrate only part of DB: specific tables with their foreign key/primary key constraints, indexes AND data from these tables
As per my understanding, I hope this will help you (using the Generate Scripts wizard):
Click Next.
Choose your output location.
Set USE DATABASE to FALSE; this will let you execute the script in the new DB you created on your new server, since it will not generate a CREATE DB script.
Read the Table/View Options carefully and set whatever you need to True.
Click Next, pick up the script file from your location, and run it on your new server.
I'm upgrading my SQL Server 2000 database to SQL Server 2008 R2. I want to make use of the Change Data Capture feature. In my existing application I have similar functionality, but I'm using triggers and historical tables with an Hst_ prefix and almost the same schema as the original tables.
My question is: is there any way to migrate my data from Hst_ tables to the tables used by CDC feature?
I was thinking of doing that like this:
I have the table Cases.
I'm using my custom historization mechanism, so I also have three triggers (on insert, update and delete) and a twin table, Hst_Cases.
Now I'm enabling CDC on the Cases table.
CDC creates a function which returns the historical data (fn_cdc_get_all_changes_dbo_Cases) and also a system table which actually holds the data (cdc.dbo_Cases_CT).
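(For reference, the enable step I'm running is roughly the following; @role_name = NULL is just what I'm using, adjust as needed.)

    -- Enable CDC at the database level, then on the Cases table.
    EXEC sys.sp_cdc_enable_db;

    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'Cases',
        @role_name     = NULL;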
I could insert data from Hst_Cases to cdc.dbo_Cases_CT, but I have the following problems:
I don't know how to get __$start_lsn and __$seqval.
It is difficult to figure out __$update_mask (I would have to compare each pair of rows).
Is this the only way to do it? I want to avoid a situation where I have to join the "new" historical data with the "old" historical data from the Hst_ tables.
Thanks!
You typically don't want to use the capture tables to store long-term change data; it would be better to have an SSIS package move the capture data to permanent tables. If you do use them, I think that if you ever have to restore your database, they'll be empty after the restore unless you use the KEEP_CDC option when restoring. You'll also need to disable the job that automatically purges the capture tables.
If you create your own tables for storage, you can omit the lsn and mask fields.
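A rough sketch of that move, using the all-changes function from your question (the permanent table dbo.Cases_History, its columns, and the LSN bookkeeping are assumptions you'd adapt):

    -- Copy new change rows into your own permanent history table.
    -- In a real package you'd persist @from_lsn between runs instead of
    -- always starting from the minimum LSN.
    DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Cases');
    DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

    INSERT INTO dbo.Cases_History (start_lsn, operation, CaseId /* , ... */)
    SELECT __$start_lsn, __$operation, CaseId /* , ... */
    FROM cdc.fn_cdc_get_all_changes_dbo_Cases(@from_lsn, @to_lsn, N'all');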
One question though: let's say the publisher database has 100 tables and I use transactional replication to move the data from those 100 tables to the subscriber database. That would be fine.
But let's say I don't want the 100 tables; instead, I want to create 3-4 views which contain the key information I want from those 100 tables. How would I achieve this?
1) Firstly, I guess the views need to be created on the publisher database.
2) Secondly, do I need to create 3-4 tables in the subscriber database which have the same columns as the views from the publisher database?
3) What sort of replication (or maybe even SSIS or something else) should I use to move the data from the publisher views to the subscriber database?
Replication probably wouldn't be as viable or as performant an option as creating an SSIS package for transferring data from those views into the small set of tables in the remote database. SSIS's strongest feature is its ability to transfer large volumes of data quickly from a source into a destination. With a little upkeep, you could potentially transfer just the differences between the two databases at some scheduled interval and have a fairly flexible solution.
SSIS will be the better solution. You would create the tables in your target database, then create the SSIS package(s) to populate the target tables.
SSIS can use queries on tables or views, and it can also execute a stored procedure to retrieve the data.
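Roughly, the pieces would look like this (all object and column names below are placeholders):

    -- On the publisher (source): a view that pulls together the key information.
    CREATE VIEW dbo.vw_KeyInfo
    AS
    SELECT c.CustomerId, c.CustomerName, o.LastOrderDate
    FROM dbo.Customers AS c
    JOIN dbo.OrderSummary AS o
        ON o.CustomerId = c.CustomerId;
    GO

    -- On the subscriber (target): a table with matching columns for the SSIS
    -- data flow to load into. The OLE DB source can simply be
    -- SELECT * FROM dbo.vw_KeyInfo, or an EXEC of a wrapping stored procedure.
    CREATE TABLE dbo.KeyInfo (
        CustomerId    int           NOT NULL PRIMARY KEY,
        CustomerName  nvarchar(200) NOT NULL,
        LastOrderDate datetime2     NULL
    );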