Copying Large Amounts of Data to Replicated Database - sql-server

I have a local SQL Server database from which I copy large amounts of data into a remote SQL Server database. The local version is 2008 and the remote version is 2012.
The remote DB has transactional replication set up to one local DB and another remote DB. This all works perfectly.
I have created an SSIS package that empties the destination tables (the remote DB) and then uses a Data Flow object to add the data from the source. For flexibility, I have each table in its own Sequence Container (this allows me to run one or many tables at a time). The data flow settings are set to Keep Identity.
Currently, prior to running the SSIS package, I drop the replication settings and then run the package. Once the package completes, I then re-create the replication settings and reinitialise the subscribers.
I do it this way (deleting the replication and then re-creating) for fear of overloading the server with replication commands. Although most tables are between 10s and 1000s of rows, a couple of them are in excess of 35 million.
Is there a recommended way of emptying and re-loading the data of a large replicated database?
I don't want to replicate my local DB to the remote DB, as that would not always be appropriate, and doing a backup and restore of the local DB would also not work due to the more complex permissions, etc. on the remote DB.
It's not the end of the world to drop and re-create the replication settings each time as I have it all scripted. I'm just sure that there must be a recommended way of managing this...

Don't do it that way; empty/reload is a bad pattern. Deleting and re-inserting every row produces two replicated operations per row. Instead, update the tables via a merge: load the new data into (non-replicated) staging tables on the destination server, then merge them into the replicated tables. If a lot of the data is unchanged, this will seriously reduce the replication load.
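A minimal sketch of the staging-then-merge approach. The table and column names here (dbo.Customer, staging.Customer) are hypothetical; the SSIS package would load staging.Customer, and only the rows that actually differ generate replicated commands:

```sql
-- Assumes staging.Customer was just loaded by SSIS and is NOT replicated.
MERGE dbo.Customer AS tgt
USING staging.Customer AS src
    ON tgt.CustomerID = src.CustomerID
WHEN MATCHED AND (tgt.Name <> src.Name OR tgt.Email <> src.Email) THEN
    UPDATE SET Name = src.Name,
               Email = src.Email
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, Name, Email)
    VALUES (src.CustomerID, src.Name, src.Email)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;   -- removes rows no longer present in the source
```

Only the UPDATE/INSERT/DELETE rows the MERGE actually touches get picked up by the log reader, so unchanged rows cost nothing in replication traffic.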

Related

Modify table row while export/import table to another database SQL Server

I have an issue with my production database, so I want to reproduce it on my development database. The DBMS I use is SQL Server 2016 (SP1).
To reproduce the error I'm copying all the data to development database using Export in SQL Server Management Studio.
The production database is live and users are still working in it, so there are going to be inserts, updates, or even deletes happening while I'm exporting the data.
What will happen to the modified rows (inserts, updates, or even deletes) while I'm exporting the data? Will they be exported to my development database? And why? How does SQL Server handle something like this?
What is the good way to move production database to development database?
And the extreme one: what will happen if table columns are modified while the export is in progress?
EDIT :
I need to mention that the DBMS version on production is higher than on development, so I can't use backup/restore to move the database.
What is the good way to move production database to development database?
You should back up your database on the production server and restore it on the dev server.
This will not block user activity on prod.
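A sketch of that backup/restore round trip. The database name, share path, and file locations are all assumptions to be adjusted; COPY_ONLY keeps the backup from disturbing the production backup chain:

```sql
-- On the production server:
BACKUP DATABASE ProdDb
    TO DISK = N'\\share\backups\ProdDb.bak'
    WITH COPY_ONLY, COMPRESSION;

-- On the dev server (logical file names must match the source DB's):
RESTORE DATABASE DevDb
    FROM DISK = N'\\share\backups\ProdDb.bak'
    WITH MOVE 'ProdDb'     TO N'D:\Data\DevDb.mdf',
         MOVE 'ProdDb_log' TO N'D:\Data\DevDb_log.ldf',
         REPLACE;
```

Note that a restore only works onto the same or a newer SQL Server version, which is why the asker's higher-version-in-production setup rules this out.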
What will happen to the modified row(insert, update, or even delete) while I'm exporting the data.
If your insert/update is concurrent but the reading process has already started on a table, your changes will be blocked. Vice versa, if any DML has already started on the same rows, the reading process will wait until the modification is committed or rolled back.
And the extreme one, What will happen if table columns modified while export is in process?
While you are reading Sch-S lock is held on the table, so no column modification can be done until this lock is released.
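You can watch this happen with the locking DMV. A quick sketch, assuming the export runs under session 52 (the session id is an assumption; look it up with sp_who2 or sys.dm_exec_sessions):

```sql
-- Inspect locks held/requested by the exporting session:
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 52;
-- A long-running SELECT shows an OBJECT-level Sch-S request_mode;
-- an ALTER TABLE needs Sch-M, which is incompatible with Sch-S,
-- so the schema change waits until the read finishes.
```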

db replication vs mirroring

Can anyone explain the differences between database replication and database mirroring?
I have huge reports to run. I want to use a secondary database server to run my reports so I can offload work from the primary server.
Should I setup a replication server or a mirrored server and why?
For your requirements, replication is the way to go (assuming you're talking about transactional replication). As stated before, mirroring will "mirror" the whole database, but you won't be able to query it unless you create snapshots from it.
The good point of replication is that you can select which objects to replicate and you can also filter them. Since the replicated DB is open, you can delete info that's not required (just be careful, as this can lead to problems maintaining the replication itself) or create indexes specifically for the reports that are not needed in "production". I maintained this kind of solution for a long time with no issues.
(Assuming you are referring to Transactional Replication)
The biggest differences are: 1) Replication operates on an object-by-object basis whereas mirroring operates on an entire database. 2) You can't query a mirrored database directly - you have to create snapshots based on the mirrored copy.
In my opinion, mirroring is easier to maintain, but the constant creation of snapshots may prove to be a hassle.
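For completeness, this is what "create snapshots based on the mirrored copy" looks like. The database name, logical file name, and snapshot path are all placeholders; this runs on the mirror server once mirroring is established:

```sql
-- A database snapshot is the only way to read from a mirror.
CREATE DATABASE SalesDb_Report
ON ( NAME = SalesDb_Data,                          -- logical file name of the mirrored DB
     FILENAME = N'D:\Snapshots\SalesDb_Report.ss' )
AS SNAPSHOT OF SalesDb;
```

The snapshot is frozen at creation time, so getting fresh data for reports means dropping and recreating it, which is the recurring hassle mentioned above.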
As mentioned here
Database mirroring and database replication are two high data availability techniques for database servers. In replication, data and database objects are copied and distributed from one database to another. It reduces the load from the original database server, and all the servers on which the database was copied are as active as the master server. On the other hand, database mirroring creates copies of a database in two different server instances (principal and mirror). These mirror copies work as standby copies and are not always active like in the case of data replication.
This question can also be helpful or have a look at MS Documentation

What is the purpose of tempdb in SQL Server?

I need some clarification about tempdb in SQL Server, specifically on the following:
What is its purpose?
Can we create our own tempdb, and how do we point our own database at it?
FROM MSDN
The tempdb system database is a global resource that is available to all users connected to the instance of SQL Server and is used to hold the following:
Temporary user objects that are explicitly created, such as global or local temporary tables, temporary stored procedures, table variables, or cursors.
Internal objects that are created by the SQL Server Database Engine, for example, work tables to store intermediate results for spools or sorting.
Row versions that are generated by data modification transactions in a database that uses read-committed snapshot isolation (row versioning) or snapshot isolation.
Row versions that are generated by data modification transactions for features such as online index operations, Multiple Active Result Sets (MARS), and AFTER triggers.
Operations within tempdb are minimally logged.
This enables transactions to be rolled back. tempdb is re-created every time SQL Server is started so that the system always starts with a clean copy of the database.
Temporary tables and stored procedures are dropped automatically on disconnect, and no connections are active when the system is shut down. Therefore, there is never anything in tempdb to be saved from one session of SQL Server to another. Backup and restore operations are not allowed on tempdb.
Tempdb is a system database, and we can't create system databases. Tempdb is a global resource for all databases, which means temp tables, table variables, and the version store for user databases all use tempdb. This is a pretty basic explanation of tempdb. Refer to the link below for how it is used for other purposes, such as database mail:
https://msdn.microsoft.com/en-us/library/ms190768.aspx
1: It is what it says: temporary storage. For example, when you ask for DISTINCT results, SQL Server must remember which rows it has already sent you. Same with a temporary table.
2: Makes no sense. Tempdb is not a per-database object but a server-level one: there is ONE tempdb regardless of how many databases the instance hosts. You can change where it lives and how it is laid out (file count, size), but it is never tied to one database (except, obviously, when the instance only has one database). Having your own tempdb is NOT how SQL Server works. And while we are at it, there is never a need to back up tempdb: every time SQL Server starts, tempdb is reinitialized as empty.
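Changing tempdb's location and size looks like this. The logical file names tempdev/templog are the defaults; the drive path and sizes are assumptions, and the move takes effect only after the service restarts:

```sql
-- Relocate and resize tempdb's data and log files:
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = N'T:\TempDB\tempdb.mdf', SIZE = 8GB);

ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = N'T:\TempDB\templog.ldf', SIZE = 2GB);
-- Restart the SQL Server service for the new paths to take effect.
```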
And, by the way, this is all covered in the product documentation. Reading the documentation of every major technology you work with at least once is the only reliable way to know what you are doing.

Applying Delete Operations on Mirror/Parallel DB

THE SETUP
Two Databases at different locations
Local server (Oracle): used for in-house data entry and processing.
Live server (Postgres): used as the DB for a public website.
THE SCENARIO
Data insertions, updates, and deletions are performed on the local DB throughout the day.
At the end of the day, the entire day's data is pushed to the live DB server using CSV files and an SQL merge.
This brings the live DB server up to date with the latest updates and newly inserted data.
THE PROBLEM
As the live server is updated by a batch run at the end of the day, the delete operations never get applied on the live server.
Because of this, unwanted data remains on the live server, causing discrepancies between the two servers.
How can the delete operations on the local DB server be applied on the live server along with the updates and insertions?
P.S. The entire live DB is due to be restructured, so any solution that requires breaking down and restructuring the DB server can also be considered.
Oracle GoldenGate supports replication from Oracle to PostgreSQL. It would certainly be faster and less error prone than your manual approach since it is all handled at a much lower level by the database.
If for some reason you don't want to do that then you are back to triggers tracking deletes in a table with the PK for the deleted records.
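A sketch of the trigger approach on the Oracle side. The orders table and its order_id primary key are hypothetical; the nightly batch would read deleted_orders, replay the deletes on the Postgres side, and then clear the log table:

```sql
-- Log table capturing the PKs of deleted rows:
CREATE TABLE deleted_orders (
    order_id    NUMBER PRIMARY KEY,
    deleted_at  DATE DEFAULT SYSDATE NOT NULL
);

-- Row-level trigger that records each delete as it happens:
CREATE OR REPLACE TRIGGER trg_orders_delete
AFTER DELETE ON orders
FOR EACH ROW
BEGIN
    INSERT INTO deleted_orders (order_id)
    VALUES (:OLD.order_id);
END;
```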
Or you could just switch out the PostgreSQL with Oracle :-)

SQL Server replication for 70 databases with transformation in a small time window

We have 70+ SQL Server 2008 databases that need to be copied from an OLTP environment to a separate reporting server. Once the DBs are copied, we will do some partial data transformation: denormalization, row-level security, etc.
SSRS Reports will be written based on these static denormalized tables and views.
We have a small nightly window for copying and transforming all 70 databases (3 hours).
Currently databases average about 10GB.
Options:
1. Transactional replication:
We would need to create 100+ static denormalized tables on each reporting database.
Doing this for all 70 databases almost reaches our nightly time limit.
As the databases grow we will exceed the time limit. We thought of mixing denormalized tables with views to speed up transformation. But then there would be some dynamic and some static data which is not a solution we can use.
Also with 70 databases using transactional replication we are concerned about bandwidth usage.
2. Snapshot replication:
Copy the entire database each night.
This means we could have a mixture of denormalized tables and views so the data transformation process is quicker.
But the snapshot is a full data copy, so as the DB grows, we will exceed our time limit for completing copy and transformation.
3. Log shipping:
In our nightly window, we could use the log shipping to update the reporting databases, then truncate and repopulate the denormalized tables and use some views.
However, I understand that with log shipping, extra tables and views cannot be added to the subscribing database.
4. Mirroring:
Mirroring is deprecated, and in any case the mirrored DB is not available for reporting until failover.
5. SQL Server 2012 AlwaysOn.
We don't have SQL Server 2012 yet. Can this be configured to update once a day instead of in real time?
And can extra tables and views be created on the subscribing database (our reporting databases)?
6. Merge replication:
This is meant to be for combining multiple data sources into one database.
But it looks like it allows for a scheduled update (once per day) and only pushes the latest changes to the subscriber DB rather than taking an entire snapshot.
It requires adding a rowversion column to every table but we could handle this. Also with this solution would additional tables be able to be created on the subscriber database without the update getting out of sync?
The final option is to use SSIS to select only the data we need from the OLTP databases. I think this option creates more risk, as we would have to handle inserts/updates/deletes to our denormalized tables rather than simply dropping and recreating them daily.
Any help on our options would be greatly appreciated.
If I've made any incorrect assumptions, please say.
If it were me, I'd go with transactional replication that runs continuously and have views (possibly indexed) at the subscriber. This has the advantage of not having to wait for the data to come over since it's always coming over.
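A sketch of what a denormalized indexed view at the subscriber might look like. The dbo.Orders table and its columns are assumptions (TotalAmount is assumed NOT NULL, which indexed views require for SUM); schema binding and the unique clustered index are what materialize the view:

```sql
-- Denormalized summary over replicated base tables:
CREATE VIEW dbo.vOrderSummary
WITH SCHEMABINDING
AS
SELECT o.CustomerID,
       COUNT_BIG(*)       AS OrderCount,   -- COUNT_BIG(*) is mandatory in grouped indexed views
       SUM(o.TotalAmount) AS TotalSpend
FROM dbo.Orders AS o
GROUP BY o.CustomerID;
GO

-- The unique clustered index persists the view's result set:
CREATE UNIQUE CLUSTERED INDEX IX_vOrderSummary
    ON dbo.vOrderSummary (CustomerID);
```

Reports then read precomputed rows, and the replication agents keep the base tables (and thus the view) continuously current, so the nightly window disappears.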
