Replication Data Issue - SQL Server

After configuring replication, all the data from the tables replicated from the publisher to the subscriber. Now my requirement is that whenever I insert a new record at the publisher, only that record should replicate, rather than reinitializing all the records at the subscriber, because my table has millions of records.
Rather than dropping and reinitializing the snapshot, can't we keep the previously synced table data at the subscriber and sync only the new record?
Is there any way to do this?

If you're using transactional replication (as you say above in your comments), something is configured incorrectly. Transactional replication is supposed to work exactly as you describe (i.e. a one-row change gets pushed over and applied as a one-row change). I'd guess that you have the immediate_sync property turned on at the publisher and are running the snapshot agent on a schedule. At the very least, turn off the schedule for the snapshot agent and I think you'll see your problem go away.
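If you want to go further and turn immediate_sync off as well, a minimal sketch (run at the publisher, in the published database; 'MyPublication' is a placeholder for your publication name):

    -- allow_anonymous must be turned off before immediate_sync can be
    EXEC sp_changepublication
        @publication = N'MyPublication',
        @property    = N'allow_anonymous',
        @value       = N'false';

    EXEC sp_changepublication
        @publication = N'MyPublication',
        @property    = N'immediate_sync',
        @value       = N'false';

With immediate_sync off, a fresh snapshot is only needed when a new subscription is added, rather than being kept continuously current for anyone who might subscribe.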

Related

Can you insert into a replicated SQL Server DB?

I need to store some data in a SQL database for data warehousing purposes.
We will be using a replicated SQL Server database.
Is it possible to insert into only the replicated DB (and not the main DB), so that we do not affect the main DB and still allow reporting and extraction of data out of the replicated DB?
Yes, but I would advise against it. Specifically, I tend to treat replication subscribers as expendable, which is to say that I choose not to back them up. What you're suggesting means that there is data in the system that exists only at the subscriber, which implies that the subscriber should be backed up. You're now re-backing up data that has already been backed up at the publisher.
Also, I'd completely advise against putting that data in the same table as the one being subscribed to. On an article re-initialization, there's too much risk of it being deleted.
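If you do store extra data at the subscriber anyway, one way to protect it is to keep it in its own schema and table, completely outside the replicated articles, and relate it to the replicated rows by key. A hypothetical sketch (all names are placeholders):

    -- Subscriber-only data lives outside any replicated article,
    -- so a re-initialization can't delete it.
    CREATE SCHEMA reporting;
    GO
    CREATE TABLE reporting.OrderAnnotations (
        AnnotationID int IDENTITY(1,1) PRIMARY KEY,
        OrderID      int           NOT NULL,  -- key of a row in the replicated table
        Note         nvarchar(400) NOT NULL,
        CreatedAt    datetime      NOT NULL DEFAULT GETUTCDATE()
    );

Avoid foreign keys pointing at the replicated table, though; a re-initialization that bulk-reloads the article could fail against them or leave your rows orphaned.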

SQL Server 2008 R2 - Change Tracking, Database Snapshots, invalid rows

I have a system in place where a SQL Server 2008 R2 database is being mirrored to a second server. The database on the first server has change tracking turned on, and on the second server we create a database snapshot of the mirror database in order to pull data for an ETL system using the change tracking functions. Both databases have an isolation level of Read Committed.
Occasionally, when we pull the change tracking information from the database snapshot using the CHANGETABLE function, there is a row that is marked as an insert or update ('i' or 'u') but is not in the base table; it has been deleted. When I check the first server, the CHANGETABLE function shows only a 'd' row for the same primary key. It appears that the snapshot was taken right at the time the delete happened on the base table, but before the information was tracked by change tracking.
My questions are:
Is the mechanism for capturing change tracking in the change tracking internal tables asynchronous?
EDIT:
The mechanism for change tracking is synchronous and is part of the transaction making the change to the table. I found that information here:
http://technet.microsoft.com/en-us/magazine/2008.11.sql.aspx
So that leaves me with these questions.
Why am I seeing the behavior I outlined above? If change tracking is synchronous, why would I get inserts/updates from change tracking that don't have a corresponding row in the table?
Would changing the isolation level to snapshot isolation help alleviate the above situation?
EDIT:
After reading more about the snapshot isolation recommendation: it is recommended so that you can wrap your call to CHANGE_TRACKING_CURRENT_VERSION() and the call to CHANGETABLE(CHANGES ...) in a single transaction, so that everything stays consistent. The database snapshot should be doing that for me, since it is as of a point in time. I pass the value from CHANGE_TRACKING_CURRENT_VERSION() into the CHANGETABLE(CHANGES ...) function.
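For reference, the pattern being discussed looks roughly like this (a sketch only; dbo.MyTable, PKCol, and @last_sync_version are placeholders, and the database needs ALLOW_SNAPSHOT_ISOLATION ON):

    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRAN;

        DECLARE @current_version bigint;
        SET @current_version = CHANGE_TRACKING_CURRENT_VERSION();

        SELECT ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION, t.*
        FROM CHANGETABLE(CHANGES dbo.MyTable, @last_sync_version) AS ct
        LEFT JOIN dbo.MyTable AS t      -- LEFT JOIN so deleted rows still appear
            ON t.PKCol = ct.PKCol;

    COMMIT;
    -- persist @current_version as the next run's @last_sync_version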
Let me know if you need any more information!
Thanks
Chris

Resynchronize merge replication

I have an issue where a merge replication between 2 instances, covering around 10 articles, has been dropped. I want to recreate the merge replication, and I am looking for input on the steps and the different options to set it up again and synchronize.
The subscriber is remote and not part of the LAN. Please note that I have the scripts to create the replication.
This is what I am thinking of doing:
back up the current publisher and restore it to the subscriber instance under a different name
restore a copy of the subscriber under a different name
run a compare using a tool that generates scripts, like those from Red Gate
apply the generated script to the restored subscriber database.
After this, what do you think is the best way to get the replication running again?
Any advice appreciated. Thank you.
There are two things to check before you back up and restore.
Make sure that you have all the data from the publisher and the subscriber in one database; it could be the publisher. If you have ETLs loading your publisher and subscriber databases from different sources, this point is pretty important.
Run the steps described at http://technet.microsoft.com/en-us/library/ms188734%28v=sql.105%29.aspx on both the publisher and the subscriber.
Script out all your indexes if you need to reduce the backup file size; you can recreate them later once you are in sync.
Back up the database on the publisher and restore it on the subscriber.
Next:
create the publication (a minimal T-SQL sketch of these calls appears after this list)
create the snapshot
add the login to the access list of your publication
add the articles to the publication
create scripts to drop/create the indexes on tables classified as “big data”, so the snapshot doesn't have to carry the indexes
do the same for constraints; they slow the process down
just drop them all before you run the snapshot
snapshot your stuff
Now the subscriber:
add the pull subscription; this has two steps, a script on the publisher and a script on the subscriber
stop the agents on the subscriber and change GENERATION_LEVELING_THRESHOLD if you need to, or change the subscriber agent profile
you can now start the pull agents
Remember about replication index maintenance.
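For reference, a minimal, hypothetical sketch of the calls behind the steps above. All names (MyDB, MergePub, Orders, PUBSRV, the login) are placeholders, and many optional parameters are omitted:

    -- At the publisher, in the published database:
    USE MyDB;
    EXEC sp_addmergepublication
        @publication = N'MergePub',
        @retention   = 14;                 -- days
    EXEC sp_addpublication_snapshot
        @publication = N'MergePub';
    EXEC sp_grant_publication_access
        @publication = N'MergePub',
        @login       = N'ReplAgentLogin';  -- the merge agent's login
    EXEC sp_addmergearticle
        @publication   = N'MergePub',
        @article       = N'Orders',
        @source_owner  = N'dbo',
        @source_object = N'Orders';

    -- At the subscriber, in the subscription database:
    EXEC sp_addmergepullsubscription
        @publisher    = N'PUBSRV',
        @publisher_db = N'MyDB',
        @publication  = N'MergePub';

You would still need to run the snapshot agent, create the pull agent job (sp_addmergepullsubscription_agent), and register the subscription at the publisher with sp_addmergesubscription using @subscription_type = N'pull'.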
Hope that helps.

SQL Server 2005/8 Replication Transaction ID

I have a scenario where I'm using transactional replication to replicate multiple SQL Server 2005 databases (same instance) into a single remote database (different instance on a separate physical machine).
I am then performing some processing on the replicated data for reporting purposes. I'm using table-level triggers to identify changes, which kick off my post-processing code.
Up to this point everything is fine.
However, what I'd like to know is: where certain tables are created, updated, or deleted in the same transaction, is it possible to identify some sort of transaction ID from replication (or anywhere else), so that I don't perform the same post-processing multiple times for a single transaction?
Basic example: I have a TUser table and a TAddress table. If I were to create rows in both in a single transaction, they would be replicated across in a single transaction too. However, two triggers would fire in the replicated database, which at present causes my post-processing code to run twice. What I'd really like to identify is that these two changes arrived at the replicated database in the same transaction.
Is this possible in any way? Does an identifier such as I've described exist, and is it accessible?
The short answer is no: there is nothing of the sort that you can rely on. The long answer, in summary, is that yes, it exists, but using it for anything would not be recommended in any way.
Given that replication is transactionally consistent, one approach you could consider would be pushing an identifier for the primary record (in this case TUser, since a TAddress is related to a TUser) onto a queue (ideally using something like Service Broker, or potentially a user-defined queue table) and then performing the post-processing by popping data off the queue and processing it separately.
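A minimal sketch of the user-defined queue table variant (table, trigger, and column names are all hypothetical; a matching trigger on TAddress would enqueue its parent row's UserID the same way):

    -- Queue at the subscriber; only the primary (TUser) key is enqueued.
    CREATE TABLE dbo.PostProcessQueue (
        QueueID    int IDENTITY(1,1) PRIMARY KEY,
        UserID     int NOT NULL,
        EnqueuedAt datetime NOT NULL DEFAULT GETUTCDATE()
    );
    GO
    CREATE TRIGGER trg_TUser_Enqueue ON dbo.TUser
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Skip keys that are already queued and not yet processed, so two
        -- triggers firing in the same replicated transaction enqueue one row.
        INSERT INTO dbo.PostProcessQueue (UserID)
        SELECT i.UserID
        FROM inserted AS i
        WHERE NOT EXISTS (SELECT 1 FROM dbo.PostProcessQueue AS q
                          WHERE q.UserID = i.UserID);
    END;

The post-processor then drains dbo.PostProcessQueue on its own schedule, deleting rows as it handles them, so each logical change is processed once.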
Another possibility would be simply batch processing every x amount of time, by polling for new/updated records from the primary tables and post-processing in that manner. You'd need to track IDs, rowversions, or timestamps of some sort that you've processed for each primary table as metadata, and pull anything that hasn't yet been processed during each batch run.
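As a sketch of that watermark idea (dbo.ProcessingWatermark, the RV rowversion column, and the other names are assumptions):

    -- One watermark row per primary table.
    CREATE TABLE dbo.ProcessingWatermark (
        TableName sysname   NOT NULL PRIMARY KEY,
        LastRV    binary(8) NOT NULL
    );

    -- Each batch run: capture the current high-water mark first, so rows
    -- changing mid-run are picked up next time instead of being skipped.
    DECLARE @last binary(8), @next binary(8);
    SELECT @last = LastRV FROM dbo.ProcessingWatermark WHERE TableName = N'TUser';
    SELECT @next = MAX(RV) FROM dbo.TUser;   -- RV is an assumed rowversion column

    SELECT u.UserID                          -- rows to post-process
    FROM dbo.TUser AS u
    WHERE u.RV > @last AND u.RV <= @next;

    UPDATE dbo.ProcessingWatermark           -- advance the watermark
    SET LastRV = @next
    WHERE TableName = N'TUser';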
Just a few thoughts, hope that helps.

SQL Server 2005 Replication

Environment:
SQL Server 2005 SP2 (9.0.3077)
Transactional Publications (Production and Beta)
I have a situation where I have two different Replication Publications setup that use some of the same Articles. Each of these Publications feeds a subscriber on a different machine. One of these shared Articles is a table. At a regular time interval many of the records in this table become aged and no longer needed. At this time a stored procedure that deletes records is called.
To save on resources and improve latency times to the subscribers I have set the replicate property on this stored procedure to “Execution of the stored procedure” instead of the default “Stored procedure definition only”. This way when the stored procedure deletes 2,000,000+ records these don’t replicate down to the subscribers. Instead the execution of the stored procedure is replicated and the same replicated stored procedure on the subscribers is executed and it deletes the same 2,000,000+ rows.
The problem I’m having is with my second publication. I didn’t need this type of behavior there, so I left the article property on the stored procedure set to “Stored procedure definition only” and expected replication to remove the rows at the other subscriber, but it didn’t; the table at the subscriber just kept gaining records. So to fix it, I set the article property to "Execution..." and called it good. That’s probably the best solution anyway, so that beta matches production, but it still feels like a kludge, as the publication properties should work independently of each other.
Question: Why does the “Execution of the stored procedure” article property take precedence and get applied to the other publication even though it is set to “Stored procedure definition only” in the other publication?
We use replication extensively in our company as we have 38 warehouses in several countries all replicating back to our primary server in London.
Firstly, your replication filters should use views, even the simple ones. That way, if you need to adjust the filter (read: WHERE clause), you just need to alter the view and you're done. Otherwise you have to re-publish your data and re-subscribe everyone, which can be a real pain.
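As a hypothetical illustration of that pattern (view, table, and column names are placeholders, and whether your filter can delegate to a view depends on your replication type and filter options): keep the filter's logic in a view, so that changing what replicates only means altering the view.

    -- The view owns the filter logic...
    CREATE VIEW dbo.vw_ActiveWarehouses
    AS
    SELECT WarehouseID FROM dbo.Warehouses WHERE IsActive = 1;
    GO
    -- ...and the article's row filter delegates to it, along the lines of:
    --   WarehouseID IN (SELECT WarehouseID FROM dbo.vw_ActiveWarehouses)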
You mentioned that you run the same delete on both the subscriber and the publisher to keep them in sync. This sends shivers down my spine. You're far better off deleting in one place and letting the server replicate the changes out to the subscribers. Since SQL Server 2005, replication is very fast and efficient; SQL 2000 was, and is, quite slow for replication. If you're using SQL 2005/2008, just make sure your compatibility level (right-click on the db, Properties, Options) is set to 90 (2005) or 100 (2008). This switches SQL Server over to the faster, more efficient replication methods.
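The T-SQL equivalent of that Properties/Options change (the database name is a placeholder; note that ALTER DATABASE ... SET COMPATIBILITY_LEVEL arrived in SQL 2008, so on a 2005 server you'd use sp_dbcmptlevel):

    -- SQL Server 2008 syntax
    ALTER DATABASE MyDB SET COMPATIBILITY_LEVEL = 100;

    -- SQL Server 2005 equivalent
    EXEC sp_dbcmptlevel @dbname = N'MyDB', @new_cmptlevel = 90;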
Another way is to not delete the data at all, but to keep it and filter it out using a WHERE clause in the publication.
It has been a long time since I actively administered replication, but I suspect the answer has to do with the architecture of the log reader and the fact that you are sharing an article between publications. My understanding is that the log reader trawls through the log looking for operations on items that are replicated. Depending on the article settings, either the individual changes to the data are posted to a table in the distribution database, or a record of the procedure invocation is posted. In either case, this is a property of the article, not of the publication(s) that the article is a member of. I assume (but have not tested and verified) that you can create multiple articles on top of the same database object and have one be replicated with @type = 'logbased' and the other with @type = 'proc exec'.
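If that assumption holds, the two procedure articles would be created along these lines (a sketch only; publication and procedure names are placeholders, and many sp_addarticle parameters are omitted):

    -- Production publication: replicate the execution of the procedure.
    EXEC sp_addarticle
        @publication   = N'ProdPub',
        @article       = N'usp_PurgeAgedRows',
        @source_object = N'usp_PurgeAgedRows',
        @type          = N'proc exec';

    -- Beta publication: replicate only the procedure's definition.
    EXEC sp_addarticle
        @publication   = N'BetaPub',
        @article       = N'usp_PurgeAgedRows',
        @source_object = N'usp_PurgeAgedRows',
        @type          = N'proc schema only';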
Take all of this with a large pinch of salt: although I now develop on SQL 2008, the last time I did anything with replication was SQL 7.
pjjH
