Backup and restore CDC tables - SQL Server

Currently working on customer releases for our product. The first step is to compare the customer database with our current dev DB. Apparently we need to shut off CDC to do this, which drops all of the system CDC tables. Is there any way to back up and restore these tables for after we've finished comparing the two DBs?

Apparently we need to shut off CDC to do this.
Completely agree with @DavidBrown that it isn't necessary to disable CDC for comparing the databases.
A CDC-enabled database simply means that the database captures data changes automatically. It stores the changed data in separate system tables, so you can still compare your database with a different database without disabling CDC.
If required, you can restore a CDC-enabled database by following this third-party tutorial:
Restoring a SQL Server database that uses Change Data Capture
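As a minimal sketch of that restore path (database name, logical file names, and paths are placeholders), restoring a copy to a different server or instance with the KEEP_CDC option preserves the cdc.* system tables and capture instances instead of dropping them:

```sql
-- Back up the CDC-enabled database (names and paths are placeholders).
BACKUP DATABASE CustomerDb
TO DISK = N'C:\Backups\CustomerDb.bak'
WITH COPY_ONLY, INIT;

-- Restore it elsewhere; KEEP_CDC keeps the CDC metadata and
-- captured change tables when restoring to another instance.
RESTORE DATABASE CustomerDb_Compare
FROM DISK = N'C:\Backups\CustomerDb.bak'
WITH MOVE 'CustomerDb'     TO N'C:\Data\CustomerDb_Compare.mdf',
     MOVE 'CustomerDb_log' TO N'C:\Data\CustomerDb_Compare_log.ldf',
     KEEP_CDC;
```

Restoring on the same instance where the backup was taken keeps CDC by default; KEEP_CDC matters when the target is a different server or instance.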

Related

MS SQL Server restart CDC to read from beginning after cleanup

Recently I started working with CDC on MS SQL Server. I have a scenario:
Enabled CDC on a SQL Server
Enabled CDC on a certain table
Data ingested to Kafka using the Debezium connector
Data has been cleared by the CDC cleanup job
Is it possible to run CDC capturing changes once again from the beginning? That is, restart the whole CDC process back to its initial point?
Kind of, but you may or may not like the answer.
It seems like you're looking for "can I get the missing change history back?". The answer is "not really". But you have options.
If you have historic backups of the database that has CDC on it, you could restore those somewhere and grab the CDC data from those. Depending on the size of your database, the configured retention on the CDC data, and the rate of change (i.e. how much change data has been captured), this is probably not a great option. That is, let's say that you have a backup from a month ago and your configured retention is two days. Once you restore the database, it will have the two days of change data from a month ago. You could continue to restore successively newer backups to get to current but that seems like a lot to me.
If you're using CDC to keep a target in sync with the source, you could either restore a backup of the source db somewhere or use a database snapshot of it to grab the initial state of the data and then consume CDC data from that point forward (based on the LSN of the source).
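As a sketch of the "from that point forward" approach, you can map the starting point to an LSN and read changes from there using the standard CDC functions. The capture instance name dbo_Orders is hypothetical; replace it with your own:

```sql
-- Hypothetical capture instance: dbo_Orders.
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn(N'dbo_Orders');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

-- All captured changes between the two LSNs, including the operation type.
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
```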
Sounds like you are asking for snapshots, not just CDC... Debezium maintains a history topic, and Kafka Source connectors store offsets as well.
These topics could be modified such that you reset both. For example, here's a blog explaining how it's done with the FileStream source connector.
Otherwise, registering a new Debezium connector with a different name should achieve the same effect.

Modify table row while export/import table to another database SQL Server

I have an issue with my production database, so I want to reproduce the issue on my development database. The DBMS I use is SQL Server 2016 (SP1).
To reproduce the error, I'm copying all the data to the development database using the Export wizard in SQL Server Management Studio.
The production database is live and users are still using it, so there will be inserts, updates, or even deletes happening while I'm exporting the data.
What will happen to the rows modified (inserted, updated, or even deleted) while I'm exporting the data? Will they be exported to my development database? And why, i.e. how does SQL Server handle something like this?
What is a good way to move a production database to a development database?
And the extreme case: what will happen if table columns are modified while the export is in progress?
EDIT:
I need to mention that the DBMS version on production is higher than on development, so I can't use backup/restore to move the database.
What is a good way to move a production database to a development database?
You should back up your database on the production server and restore it on the dev server.
This will not block user activity on production.
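A minimal sketch of that recommendation (database name, logical file names, and paths are placeholders); COPY_ONLY keeps the ad-hoc backup from interfering with any existing backup chain:

```sql
-- On the production server (names and paths are placeholders).
BACKUP DATABASE ProdDb
TO DISK = N'\\share\ProdDb_copy.bak'
WITH COPY_ONLY, COMPRESSION, INIT;

-- On the dev server; MOVE uses the source database's logical file names.
RESTORE DATABASE ProdDb_Dev
FROM DISK = N'\\share\ProdDb_copy.bak'
WITH MOVE 'ProdDb'     TO N'D:\Data\ProdDb_Dev.mdf',
     MOVE 'ProdDb_log' TO N'D:\Data\ProdDb_Dev_log.ldf',
     RECOVERY;
```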
What will happen to the rows modified (inserted, updated, or even deleted) while I'm exporting the data?
If your insert/update runs concurrently but the reading process has already started on a table, your changes will be blocked. Vice versa, if any DML has already started on the same rows, the reading process will wait until the modification is committed or rolled back.
And the extreme case: what will happen if table columns are modified while the export is in progress?
While you are reading, a schema stability (Sch-S) lock is held on the table, so no column modification can be done until this lock is released.
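If you want to see this for yourself, one way (a sketch, using the standard lock DMV) is to watch for the Sch-S locks held by the export's reads and the Sch-M lock an ALTER TABLE would be waiting for:

```sql
-- Show sessions holding or waiting on schema locks while the export runs.
SELECT request_session_id,
       resource_type,
       request_mode,      -- Sch-S for readers, Sch-M for ALTER TABLE
       request_status     -- GRANT vs WAIT
FROM sys.dm_tran_locks
WHERE request_mode IN (N'Sch-S', N'Sch-M');
```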

DB replication vs mirroring

Can anyone explain the differences between a replicated DB and a mirrored DB server?
I have huge reports to run. I want to use a secondary database server to run my reports so I can offload work from the primary server.
Should I setup a replication server or a mirrored server and why?
For your requirements, replication is the way to go (assuming you're talking about transactional replication). As stated before, mirroring will "mirror" the whole database, but you won't be able to query it unless you create snapshots from it.
The good point of replication is that you can select which objects to publish and you can also filter them, and since the subscriber DB is open you can delete info that isn't required (just be careful, as this can lead to problems maintaining the replication itself), or create specific indexes for the report which are not needed in "production". I maintained this kind of solution for a long time with no issues. For example, a report-only index can live on the subscriber without existing on the publisher, as shown below.
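A short sketch of such a subscriber-only index (table and column names dbo.Sales, ReportDate, CustomerId, Amount are hypothetical):

```sql
-- Created only on the subscriber (reporting) copy; the publisher keeps
-- its leaner production indexing.
CREATE NONCLUSTERED INDEX IX_Sales_ReportDate
ON dbo.Sales (ReportDate)
INCLUDE (CustomerId, Amount);
```

Keep in mind that such subscriber-only indexes may be lost if the subscription is reinitialized with a snapshot that drops and recreates the tables.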
(Assuming you are referring to Transactional Replication)
The biggest differences are: 1) Replication operates on an object-by-object basis whereas mirroring operates on an entire database. 2) You can't query a mirrored database directly - you have to create snapshots based on the mirrored copy.
In my opinion, mirroring is easier to maintain, but the constant creation of snapshots may prove to be a hassle.
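As a sketch of that snapshot step (database name, logical file name, and path are hypothetical), a database snapshot of the mirror gives you a static, readable copy for reports:

```sql
-- Create a point-in-time, read-only snapshot of the mirrored database.
-- The logical file name (MyDb_Data) must match the source database's.
CREATE DATABASE MyDb_ReportSnap
ON (NAME = MyDb_Data,
    FILENAME = N'D:\Snapshots\MyDb_ReportSnap.ss')
AS SNAPSHOT OF MyDb;

-- Reports query MyDb_ReportSnap; drop and recreate it to refresh the data.
```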
As mentioned here
Database mirroring and database replication are two high data availability techniques for database servers. In replication, data and database objects are copied and distributed from one database to another. It reduces the load from the original database server, and all the servers on which the database was copied are as active as the master server. On the other hand, database mirroring creates copies of a database in two different server instances (principal and mirror). These mirror copies work as standby copies and are not always active like in the case of data replication.
This question can also be helpful, or have a look at the MS documentation.

How to create and maintain SQL Server database replica

I have a SQL Server 2012 database on a server, which is a development database.
I want to create another database on another machine which will be an exact replica of the original one, and as soon as any changes occur in the schema or data they should be migrated to this second database.
I tried the log shipping method, but in that case the secondary database stays in Restoring mode, whereas I want both databases active and functioning at the same time.
Performance and locking don't matter.
Any other easy way to do this? A utility that runs periodically and automatically would also be great.
With log shipping the secondary database can be in a read-only (standby) state most of the time, except during the periods when the scheduled restore job runs.
Other options to consider: transactional replication, mirroring with read-only access via snapshots, AlwaysOn Availability Groups with readable replicas, or backup/restore (an initial full backup/restore, then differentials + transaction logs) - but the last option is not for large databases.
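As a sketch of the log shipping standby option mentioned above (database name, backup path, and undo-file path are placeholders), restoring each log backup WITH STANDBY keeps the secondary readable between restores instead of leaving it in Restoring mode:

```sql
-- Run on the secondary for each log backup that arrives (paths are placeholders).
RESTORE LOG DevDbCopy
FROM DISK = N'\\share\logship\DevDb_log1.trn'
WITH STANDBY = N'D:\LogShip\DevDbCopy_undo.dat';
-- STANDBY leaves the database read-only between restores;
-- NORECOVERY would keep it inaccessible in Restoring mode.
```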

Copying Large Amounts of Data to Replicated Database

I have a local SQL Server database that I copy large amounts of data from and into a remote SQL Server database. Local version is 2008 and remote version is 2012.
The remote DB has transactional replication set-up to one local DB and another remote DB. This all works perfectly.
I have created an SSIS package that empties the destination tables (the remote DB) and then uses a Data Flow object to add the data from the source. For flexibility, I have each table in its own Sequence Container (this allows me to run one or many tables at a time). The data flow settings are set to Keep Identity.
Currently, prior to running the SSIS package, I drop the replication settings and then run the package. Once the package completes, I then re-create the replication settings and reinitialise the subscribers.
I do it this way (deleting the replication and then re-creating) for fear of overloading the server with replication commands. Although most tables are between 10s and 1000s of rows, a couple of them are in excess of 35 million.
Is there a recommended way of emptying and re-loading the data of a large replicated database?
I don't want to replicate my local DB to the remote DB as that would not always be appropriate, and doing a backup and restore of the local DB would also not work due to the nature of the more complex permissions, etc. on the remote DB.
It's not the end of the world to drop and re-create the replication settings each time as I have it all scripted. I'm just sure that there must be a recommended way of managing this...
Don't do it. Empty/reload is bad. Try to update the tables via MERGE instead - this way you avoid the delete-and-reload, which results in two replicated operations (a delete and an insert) per row. Load the new data into staging tables on the other server (not replicated), then merge them into the replicated tables. If a lot of data is unchanged, this will seriously reduce the replication load.
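A minimal sketch of that merge step, assuming a hypothetical replicated table dbo.Orders and a non-replicated staging table staging.Orders loaded by the SSIS package:

```sql
-- Only rows that actually change generate replicated commands.
MERGE dbo.Orders AS target
USING staging.Orders AS source
    ON target.OrderId = source.OrderId
WHEN MATCHED AND (target.Amount <> source.Amount
               OR target.Status <> source.Status) THEN
    UPDATE SET Amount = source.Amount,
               Status = source.Status
WHEN NOT MATCHED BY TARGET THEN
    INSERT (OrderId, Amount, Status)
    VALUES (source.OrderId, source.Amount, source.Status)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```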
