Data Replication REDO rate very low on readable secondary - sql-server

I am currently running MS SQL Server 2016 with 1 primary replicating to 1 synchronous node and 4 asynchronous nodes (1 in same data center, 3 in a failover data center).
My current problem is that my readable secondary in our primary data center is showing a REDO rate of 25k KB/s, whereas all my other nodes (including the other readable secondary) are showing REDO rates of 120k-250k KB/s. The issue I run into is that when a large data sync or index rebuild happens, I'll get a 40 GB log file where the send rate is wonderful and everything is hardened in the log within minutes, but the redo sometimes takes upwards of hours to apply it all on that one node.
The only difference I can see on this node is that we have the SSAS and SSIS services installed on it; however, there is no traffic to either of those systems during these large transfers. I am curious whether the redo throughput could somehow be shared between SSAS, SSIS and SQL Server, and whether that is where my issue is. The only way I can really test that is by shutting those services down, and since it's a production system I'd rather not do that if I can avoid it.
Has anyone seen a similar issue where one of their nodes was getting a much lower REDO rate than expected, and how did you go about fixing it?
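For anyone comparing replicas, the redo rate and redo queue per database can be pulled from the AlwaysOn DMVs. A minimal sketch (run on the primary so rows for every replica are visible):

    -- Compare log send and redo throughput for each replica/database
    SELECT ar.replica_server_name,
           DB_NAME(drs.database_id)       AS database_name,
           drs.synchronization_state_desc,
           drs.log_send_queue_size,       -- KB waiting to be sent
           drs.log_send_rate,             -- KB/sec
           drs.redo_queue_size,           -- KB waiting to be redone
           drs.redo_rate                  -- KB/sec
    FROM sys.dm_hadr_database_replica_states AS drs
    JOIN sys.availability_replicas AS ar
        ON ar.replica_id = drs.replica_id
    ORDER BY ar.replica_server_name, database_name;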

Related

SQL Server files configuration

I have a database that receives information every second from multiple devices via SNMP, GPS and other sources. Each device inserts once per second, and we are talking about a group of between 80 and 120 devices.
This is generating significant growth of the database, to the point that it is growing at a rate approaching 1 GB per day.
What growth settings and minimum sizes should the data and log files have on the server (SQL Server 2016) so that the database runs as smoothly as possible?
Any additional recommendations regarding capacity, maintenance and best practices for the scenario described above?
Thank you all!
You should never have auto-growth events. Ideally, you would want to grow your data files to a size large enough so that you will not be troubled by auto-growth events, ever.
See this post over at DBA Stack Exchange that relates to growing log files.
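As a rough illustration of pre-sizing (the database and logical file names here are hypothetical), grow the files once up front and use a fixed increment, not a percentage, as the safety net:

    -- Pre-size data and log files so routine load never triggers auto-growth;
    -- keep a fixed-size FILEGROWTH only as a fallback.
    ALTER DATABASE TelemetryDb
        MODIFY FILE (NAME = TelemetryDb_Data, SIZE = 50GB, FILEGROWTH = 1GB);
    ALTER DATABASE TelemetryDb
        MODIFY FILE (NAME = TelemetryDb_Log, SIZE = 10GB, FILEGROWTH = 1GB);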

Offloading SQL Report Server processing to a dedicated server - is it worth it?

Not sure if this is a SO or a ServerFault question, so please feel free to move if it's not in the right place:
I have a client with a large database containing a table with around 30-35 million rows running on a SQL 2008 R2 server (the server is pretty high spec: 16 cores, 92 GB RAM, RAID, etc.). There are other tables this table may join on, but it is the main driver of several reports.
Their SSRS instance/database and the query source database are both running on the same box/SQL instance.
They regularly run ad-hoc reports from this database (which have undergone extensive optimisation), many of which may end up touching a lot of the data in the table. After looking at the report server stats it appears that the data fetch doesn't actually take that long, but a lot of data is returned and report processing takes a fair while: it can take up to 20-30 minutes to process some of the larger reports, which can have tens of thousands of pages (the data fetch in these cases is less than 10 seconds).
(Note: I realise that there is never really a need to run 25,000 pages off but the client insists and won't listen to reason...something about Excel spreadsheets *FACEPALM!*)
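For reference, the data-fetch vs. processing split mentioned above comes from the report server execution log; on 2008 R2, something along these lines against the ReportServer database shows it per execution:

    -- Milliseconds spent fetching data vs. processing vs. rendering
    SELECT TOP (20)
           ItemPath,
           TimeDataRetrieval,
           TimeProcessing,
           TimeRendering,
           [RowCount]
    FROM ReportServer.dbo.ExecutionLog3
    ORDER BY TimeProcessing DESC;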
At the moment they are concerned about a couple of performance issues that crop up sporadically and the culprit may be the ad-hoc reporting.
We are looking at offloading the report processing anyway, so thought that this would be an ideal opportunity - but before doing so I'm wondering how much relief this will give the SQL server.
If I move the SSRS app and database onto another SQL host and remotely query the data (network conditions should be ideal as this is datacentre based), will I see any performance gains?
This is mainly based on guesswork at this stage but I see the following being the factors that could affect performance:
I/O for moving a shedload of rows from the query source to RS temp DB
CPU load when the report server is crunching all the data
In moving to another host I see these factors being reduced for the SQL server. The new server will be solely responsible for report processing (and should also be high spec), so hopefully there will be no contention when processing reports.
Do I sound like I am on the right track in my assumptions? Is there anything else that I may have missed which could adversely affect performance or improve performance?
Thanks in advance
You should look at transactional replication to send data from the main server to a database on the reporting server. Querying the tables directly over the network will only slow things down even more.
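For illustration only, a bare-bones transactional publication might look something like this (all names are hypothetical; it assumes the Distributor is already configured, and a real setup also needs the Snapshot and Log Reader agents):

    -- Enable the database for publishing and create a publication with one article
    USE SalesDb;
    EXEC sp_replicationdboption @dbname = N'SalesDb', @optname = N'publish', @value = N'true';
    EXEC sp_addpublication @publication = N'SalesDb_Reporting', @status = N'active';
    EXEC sp_addarticle @publication = N'SalesDb_Reporting',
         @article = N'BigReportTable', @source_object = N'BigReportTable';
    -- Push subscription to the dedicated reporting server
    EXEC sp_addsubscription @publication = N'SalesDb_Reporting',
         @subscriber = N'REPORTSRV', @destination_db = N'SalesReporting',
         @subscription_type = N'push';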

Symmetric DS taking too long on DataExtractorService

I'm using SymmetricDS to migrate data from a SQL Server database to a MySQL database. My tests moving the data of a database that was not in use worked correctly, though they took about 48 hours to migrate all the existing data. I configured dead triggers to move the current data and triggers to move newly added data.
When moving a live database that is in use, the data is migrated far too slowly. In the log file I keep getting the message:
[corp-000] - DataExtractorService - Reached one sync byte threshold after 1 batches at 105391240 bytes. Data will continue to be synchronized on the next sync
I have about 180 tables, and I have created 15 channels for the dead triggers and 6 channels for the triggers. In the configuration file I have:
job.routing.period.time.ms=2000
job.push.period.time.ms=5000
job.pull.period.time.ms=5000
I have no foreign key configuration, so there won't be an issue with that. What I would like to know is how to make this process faster. Should I reduce the number of channels?
I do not know what the issue could be, since the first test I ran went very well. Is there a reason why the threshold is not being cleared?
Any help will be appreciated.
Thanks.
How large are your tables? How much memory does the SymmetricDS instance have?
I've used SymmetricDS for a while, and without having done any profiling on it I believe that reloading large databases went quicker once I increased available memory (I usually run it in a Tomcat container).
That being said, SymmetricDS is by no means as quick as some other tools when it comes to the initial replication.
Have you had a look at the tmp folder, i.e. the files SymmetricDS temporarily writes to locally before sending a batch off to the remote side? Can you see any progress in file size? Have you tried turning on more fine-grained logging to get more details? What about database timeouts? Could it be that the extraction queries are running too long and the database just cuts them off?

SQL Server 2005 replication to many slave servers - hardware replication or change the strategy

We have a 500 GB database that performs about 10,000 writes per minute.
This database has a requirement for real-time reporting. To service this need, we have 10 reporting databases hanging off the main server.
The 10 reporting databases are all fed from the 1 master database using transactional replication.
The issue is that the server and replication are starting to fail, with PAGEIOLATCH_SH waits piling up - these seem to be caused by the master database being overworked. We are upgrading the server to a quad-processor / quad-core machine.
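For context, the PAGEIOLATCH_SH pressure can be confirmed from the wait statistics; a minimal sketch:

    -- Top waits since the last restart; a large PAGEIOLATCH_SH share means
    -- sessions are waiting on data-file reads into the buffer pool.
    SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'BROKER_TASK_STOP')
    ORDER BY wait_time_ms DESC;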
As this database and the need for reporting are only going to grow (20% growth per month), I wanted to know whether we should start looking at hardware (or another third-party application) to manage the replication - and if so, what should we use - or whether we should change the topology so that, instead of the master replicating to each reporting database, the master replicates to reporting server 1, reporting server 1 replicates to reporting server 2, and so on.
Ideally the solution will cover us up to a 1.5 TB database with 100,000 writes per minute.
Any help greatly appreciated
One common model is to have your main database replicate to 1 other node, then have that other node deal with replicating the data out from there. It takes the load off your main server and also has the benefit that if, heaven forbid, your reporting system's replication does max out it won't affect your live database at all.
I haven't gone much further than a handful of replicated hosts, but if you add enough nodes that your distribution node can't replicate it all it's probably sensible to expand the hierarchy so that your distributor is actually replicated to other distributors which then replicate to the nodes you report from.
How many databases you can have replicated off a single node will depend on how up to date your reporting data needs to be (e.g. whether it's fine to have it replicate only once a day or whether you need it up to the second) and how much data you're replicating at a time. It might be worth some experimentation to find out exactly how many nodes one distributor could power if it didn't have the overhead of actually running your main services.
Depending on what you're inserting, a load of 100,000 writes/min is pretty light for SQL Server. In my book, I show an example that generates 40,000 writes/sec (2.4M/min) on a machine with simple hardware. So one approach might be to see what you can do to improve the write performance of your primary DB, using techniques such as batch updates, multiple writes per transaction, table valued parameters, optimized disk configuration for your log drive, etc.
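To make the batching idea concrete, here is a sketch using a table-valued parameter (the table, type and procedure names are hypothetical, and TVPs require SQL Server 2008 or later):

    -- Hypothetical target table and TVP type
    CREATE TABLE dbo.Readings (DeviceId int, ReadAt datetime2, Value float);
    GO
    CREATE TYPE dbo.ReadingList AS TABLE (DeviceId int, ReadAt datetime2, Value float);
    GO
    -- One call inserts a whole batch of rows in a single round trip and transaction,
    -- instead of issuing one INSERT per row.
    CREATE PROCEDURE dbo.InsertReadings @rows dbo.ReadingList READONLY
    AS
        INSERT INTO dbo.Readings (DeviceId, ReadAt, Value)
        SELECT DeviceId, ReadAt, Value FROM @rows;
    GO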
If you've already done as much as you can on that front, the next question I have is what kind of queries are you doing that require 10 reporting servers? Seems unusual, even for pretty large sites. There may be a bunch you can do to optimize on that front, too, such as offloading aggregation queries to Analysis Services, or improving disk throughput. While you can, scaling-up is usually a better way to go than scaling-out.
I tend to view replication as a "solution of last resort." Once you've done as much optimization as you can, I would look into horizontal or vertical partitioning for your reporting requirements. One reason is that partitioning tends to result in better cache utilization, and therefore higher total throughput.
If you finally get to the point where you can't escape replication, then the hierarchical approach suggested by fyjham is definitely a reasonable one.
In case it helps, I cover most of these issues in depth in my book:
Ultra-Fast ASP.NET.
Check that your publisher and distributor's transaction log files don't have too many VLFs (Virtual Log Files) as detailed here (step 8):
http://www.sqlskills.com/BLOGS/KIMBERLY/post/8-Steps-to-better-Transaction-Log-throughput.aspx
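A quick way to count VLFs on a 2005-era instance is DBCC LOGINFO, which returns one row per VLF in the current database's log:

    -- Run in the publisher/distribution database; thousands of rows usually
    -- means the log grew in many small increments and should be rebuilt.
    DBCC LOGINFO;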
If your distribution database is co-located with your publisher database, consider moving it to its own dedicated server.

SQL Server Merge Replication Problems

I have merge replication set up on a CRM system. The sales reps' data merges when they connect to the network (I think when SQL Server detects that the notebooks are connected), then they take the laptops away and merge again when they come back (there are about 6 laptops in total, merging via 1 server).
This system seems fine when initially set up, but it almost grinds to a halt after about a month, with the merge job taking nearly 2 hours to run per user, even though the server is not struggling in any way.
If I delete the whole publication and recreate all the subscriptions, it seems to work fine until another month or so passes, and then I am back to the same problem.
The database is poorly designed with a lack of primary keys/indexes etc, but the largest table only has about 3000 rows in it.
Does anyone know why this might be happening and if there is a risk of losing data when deleting and recreating the publication?
The problem was the metadata created by SQL Server replication: there is an overnight job that empties and refills a 3,000-row table, which causes replication to replicate all of those rows every day.
The subscriptions were set to never expire, which meant the old metadata was never deleted by SQL Server.
I have now set the subscription expiration period to 7 days in the hope that the metadata will be cleaned up after that period. I did some testing and proved that changes were not lost when a subscription expired, but any updates on the server took priority over the client.
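If it helps anyone else, the retention can be changed on an existing merge publication; a minimal sketch with a hypothetical publication name (value is in days):

    -- Keep merge metadata for 7 days so the cleanup job can purge older rows
    EXEC sp_changemergepublication
         @publication = N'CRM_Merge',
         @property    = N'retention',
         @value       = N'7';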
I encountered the "Waiting 60 second(s) before polling for further changes" message recently on SQL Server 2008 R2.
Replication Monitor showed the replication as "In progress", but only step 1 (Initialization) and step 2 (Schema changes and bulk inserts) were performed.
I was very puzzled as to why the other steps were not executed.
The reason was simple: it seems that merge replication requires the TCP/IP (and possibly also Named Pipes, I'm not sure) protocols to be enabled.
No errors were reported.
A similar problem (some sort of connection issue) probably became apparent in Ryan Stephens's case as well.
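As a quick check, the protocol an existing connection actually came in over can be seen from the connections DMV (a minimal sketch):

    -- Shows whether the current session arrived via TCP, Named Pipes or Shared Memory
    SELECT net_transport, auth_scheme
    FROM sys.dm_exec_connections
    WHERE session_id = @@SPID;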
