Identical tables, different sizes?

I have transactional replication with updatable subscriptions going between a few SQL 2008 R2 servers (publisher is Enterprise, subscribers are Express).
I need to add another subscriber, only to discover that my database has outgrown the 10GB limit for Express. My current subscribers are under the 10GB limit, but the publishing database is 13GB.
So I delete some large unused columns and data from the largest tables, run DBCC CLEANTABLE, and run UPDATE STATISTICS on them. The tables go down in size a bit and I thought I was good to go!
However, the publishing database is still a good 11.5GB while the subscribers all went down to 8GB.
I compare table sizes between the publisher and subscribers, and the few largest tables that I had deleted data from are larger in the publishing database than in the subscribing databases, by a couple of gigs.
I compare table structures and use RedGate's Data Compare: the tables are identical between the publisher and subscribers, so I am at a loss. I don't know what is causing the discrepancy, let alone how to resolve it so I can add another subscriber (without having to buy SQL Standard licenses for the subscriber). I have a feeling it has to do with being the publisher, and its row count has grown significantly within the last year.
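A sketch of one way to make that size comparison, run in both the publishing and subscribing databases (this uses only the standard DMVs and isn't specific to any schema):

    -- Per-table reserved vs. used space, largest first. A big gap between
    -- reserved and used suggests allocated-but-empty pages.
    SELECT o.name AS table_name,
           SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb,
           SUM(ps.used_page_count)     * 8 / 1024 AS used_mb
    FROM sys.dm_db_partition_stats AS ps
    JOIN sys.objects AS o ON o.object_id = ps.object_id
    WHERE o.type = 'U'
    GROUP BY o.name
    ORDER BY reserved_mb DESC;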
As a side note, I do also have a couple of SQL Standard 2008 licenses; however, they're 2008, not 2008 R2, so they can't be used to initialize the subscriber from a backup. The sites have slow connections, so I have always initialized replication from backups.

Would it be possible to drop the replication and recreate it? Replication always seems to be finicky, and it might still have remnants of the columns out there (where you can't see them).
Remember you can script the replication so you don't have to start from scratch.
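One thing worth checking before tearing it all down: dropping a column is mostly a metadata change, and DBCC CLEANTABLE only reclaims space from dropped variable-length columns. Space from dropped fixed-length columns generally isn't released until the indexes are rebuilt. A minimal sketch (the table name is a placeholder):

    -- Rebuilding compacts the pages and removes remnants of dropped columns.
    ALTER INDEX ALL ON dbo.BigTable REBUILD;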

Related

SQL Server queries run fast against prod but not the replication DB

Experienced a 4-minute slowdown when running reports against a SQL Server replication database that was created specifically for running reports.
Prod runs fine; the replication database used to run in under a minute, but now takes 4 minutes.
We did two things prior to the slowdown:
Truncated the log file from 400 GB to 100 MB
Re-created the replication job after new data stopped arriving on Monday
Everything was working on Friday. From what I can see, the replication database is smaller, as we don't use all of the prod data for reports. I think it might be related to the execution plan being recreated when the new replication job was created, but it seems very odd. Any ideas, guys?
It's likely that your replicated database doesn't have the same indexes as the primary database. Check that primary key constraints are being replicated (in the article properties), and check that indexes are being replicated.
Take a look at all the indexes and keys in the replicated database and compare them to the source database. It sounds highly likely that they're different.
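A sketch of a query that makes that comparison easy, run in each database and diffed (it uses only the system catalog):

    -- One row per index, so the output from the two databases can be
    -- compared side by side.
    SELECT o.name AS table_name,
           i.name AS index_name,
           i.type_desc,
           i.is_primary_key,
           i.is_unique
    FROM sys.indexes AS i
    JOIN sys.objects AS o ON o.object_id = i.object_id
    WHERE o.type = 'U' AND i.index_id > 0
    ORDER BY o.name, i.name;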

db replication vs mirroring

Can anyone explain the differences between a replication DB and a mirrored DB server?
I have huge reports to run. I want to use a secondary database server to run my reports so I can offload resources from the primary server.
Should I set up a replication server or a mirrored server, and why?
For your requirements, replication is the way to go (assuming you're talking about transactional replication). As stated before, mirroring will "mirror" the whole database, but you won't be able to query it unless you create snapshots from it.
The good point of replication is that you can select which objects you will replicate, and you can also filter them. And since the DB will be open, you can delete info if it's not required (just be careful, as this can lead to problems maintaining the replication itself), or create specific indexes for the reports which are not needed in "production". I maintained this kind of solution for a long time with no issues.
(Assuming you are referring to Transactional Replication)
The biggest differences are: 1) Replication operates on an object-by-object basis whereas mirroring operates on an entire database. 2) You can't query a mirrored database directly - you have to create snapshots based on the mirrored copy.
In my opinion, mirroring is easier to maintain, but the constant creation of snapshots may prove to be a hassle.
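For concreteness, querying a mirror means creating a database snapshot on the mirror server and pointing reports at the snapshot; refreshing it is a drop-and-recreate. A minimal sketch with placeholder names (note that database snapshots require Enterprise Edition on these versions):

    -- The logical file name must match the source database's data file.
    CREATE DATABASE Reporting_Snap
    ON (NAME = Reporting_Data,
        FILENAME = 'D:\Snapshots\Reporting_Snap.ss')
    AS SNAPSHOT OF Reporting;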
As mentioned here:
Database mirroring and database replication are two high-availability techniques for database servers. In replication, data and database objects are copied and distributed from one database to another. It reduces the load on the original database server, and all the servers to which the database was copied are as active as the master server. On the other hand, database mirroring creates copies of a database in two different server instances (principal and mirror). These mirror copies work as standby copies and are not always active, unlike the copies in data replication.
This question can also be helpful, or have a look at the MS documentation.

MS SQL Transactional Replication - Skipping errors while applying a snapshot at the subscriber

I tried finding this on the internet but could not find anything regarding it. There are ways to skip errors in the Distribution Agent, but nothing with respect to skipping errors while applying a snapshot.
My question: I have a multi-publisher, single-subscriber setup. While setting up replication, the snapshot of the first publisher is successfully delivered to the subscriber. The snapshots of the subsequent publishers are successfully generated but fail while being applied to the subscriber. The failure is due to a primary key violation. Is there a way to skip errors while the snapshot is being applied on the subscriber?
Environment:
Publisher: Microsoft SQL Server 2008 R2 (SP2) - (X64)
Distributor: Microsoft SQL Server 2014 (SP2) (KB3171021) - (X64)
Subscriber: Microsoft SQL Server 2008 R2 (SP3-OD) (KB3144114) - (X64)
I have tried identifying the tables and records which are causing this issue but there are over 100 such tables having hundreds of records each.
Since replication is a client requirement, I don't have much control over the schema and the data in it.
It sounds like something in your setup is incorrect, which is leading to multiple tables from different publishers trying to insert rows into the same subscriber table, hence the duplicate key records.
If the different publishers all have the same copy of the same table, you only want to publish it from one of them.
If the different publishers all have different copies of the same table, you want them to each have their own subscriber tables.
Otherwise you'll end up missing a lot of rows in your subscriber (because different publishers are using the same key for rows that are actually different) or hitting weird replication errors. Just skipping the errors would result in incorrect data, and I'm guessing correct data is one of the client requirements as well.
One option that I have used in the past to simplify replication topography and management:
One subscriber database per publication
Never grant write access to users to these databases
Grant read access via another database which uses synonyms or views (sketched after this list)
This can make management simpler down the road as well. If you need to re-initialize a single database, you have the option to restore it from backup and generally more flexibility than if your subscribers are all sharing the same database.
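A sketch of that read-access layer, with all names illustrative: a user-facing database that contains nothing but synonyms pointing at the per-publication subscriber databases:

    USE ReportingAccess;
    GO
    -- Users query dbo.Orders here; the data actually lives in the
    -- subscriber database for the Sales publication.
    CREATE SYNONYM dbo.Orders   FOR Sub_Sales.dbo.Orders;
    CREATE SYNONYM dbo.Invoices FOR Sub_Billing.dbo.Invoices;

If a subscriber database is re-initialized from backup, the synonyms keep working as long as the names stay the same.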
Just for completeness, I should probably point you to the Books Online entry on Skipping Errors in Transactional Replication. But to be clear, I think this would be a mistake, as you'd end up with incorrect data, and that's probably not what anyone wants.
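For reference, the mechanism that article covers is the Distribution Agent's -SkipErrors parameter, a colon-separated list of native error numbers appended to the agent's job step command. A sketch, not a recommendation, using the duplicate-key error numbers a PK violation raises (2601 and 2627):

    -SkipErrors 2601:2627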

SQL Server replication for 70 databases with transformation in a small time window

We have 70+ SQL Server 2008 databases that need to be copied from an OLTP environment to a separate reporting server. Once the DBs are copied, we will do some partial data transformation: de-normalization, row-level security, etc.
SSRS Reports will be written based on these static denormalized tables and views.
We have a small nightly window for copying and transforming all 70 databases (3 hours).
Currently databases average about 10GB.
Options:
1. Transactional replication:
We would need to create 100+ static denormalized tables on each reporting database.
Doing this for all 70 databases almost reaches our nightly time limit.
As the databases grow, we will exceed the time limit. We thought of mixing denormalized tables with views to speed up the transformation, but then there would be some dynamic and some static data, which is not a solution we can use.
Also with 70 databases using transactional replication we are concerned about bandwidth usage.
2. Snapshot replication:
Copy the entire database each night.
This means we could have a mixture of denormalized tables and views so the data transformation process is quicker.
But the snapshot is a full data copy, so as the DB grows, we will exceed our time limit for completing copy and transformation.
3. Log shipping:
In our nightly window, we could use log shipping to update the reporting databases, then truncate and repopulate the denormalized tables and use some views.
However, I understand that with log shipping, extra tables and views cannot be added to the subscribing database.
4. Mirroring:
Mirroring is being deprecated, and the mirrored DB is not active for reporting against until failover.
5. SQL Server 2012 AlwaysOn:
We don't have SQL Server 2012 yet. Can this be configured to do an update once a day instead of in real time?
And can extra tables and views be created on the subscribing database (our reporting databases)?
6. Merge replication:
This is meant to be for combining multiple data sources into one database.
But it looks like it allows for a scheduled update (once per day) and only updates the subscriber DB with the latest changes rather than doing an entire snapshot.
It requires adding a rowversion column to every table, but we could handle this. Also, with this solution, would additional tables be able to be created on the subscriber database without the update getting out of sync?
The final option is that we use SSIS to select only the data we need from the OLTP databases. I think this option creates more risk, as we would have to handle inserts/updates/deletes to our denormalized tables, rather than just dropping and recreating the denormalized tables daily.
Any help on our options would be greatly appreciated.
If I've made any incorrect assumptions, please say.
If it were me, I'd go with transactional replication that runs continuously and have views (possibly indexed) at the subscriber. This has the advantage of not having to wait for the data to come over since it's always coming over.
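To illustrate, a sketch of an indexed view at the subscriber (all object names are made up; indexed views need SCHEMABINDING, two-part names, and COUNT_BIG(*) when grouping):

    CREATE VIEW dbo.vOrderTotals
    WITH SCHEMABINDING
    AS
    SELECT o.CustomerID,
           SUM(o.TotalDue) AS TotalDue,   -- assumes TotalDue is NOT NULL
           COUNT_BIG(*)    AS OrderCount
    FROM dbo.Orders AS o
    GROUP BY o.CustomerID;
    GO
    -- Materializes the view; reports read the precomputed rows.
    CREATE UNIQUE CLUSTERED INDEX IX_vOrderTotals
        ON dbo.vOrderTotals (CustomerID);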

Recording all Sql Server Inserts and Updates

How can I record all the Inserts and Updates being performed on a database (MS SQL Server 2005 and above)?
Basically, I want a table in which I can record all the inserts and updates issued on my database.
Triggers will be tough to manage because there are 100s of tables and growing.
We have hundreds of tables and growing, and we use triggers. In newer versions of SQL Server you can use Change Data Capture or Change Tracking, but we have not found them adequate for auditing.
What we have are two separate audit tables for each table: one records the details of the event (1 row even if you updated a million records) and one records the actual old and new values. Each pair has the same structure and is created by running a dynamic SQL proc that looks for unaudited tables and creates the audit triggers. This proc is run every time we deploy.
Then you should also take the time to write a proc to pull the data back out of the audit tables in case you want to restore the old values. This can be tricky to write on the fly with this structure, so it is best to have it handy before the CEO is breathing down your neck while you restore the 50,000 users someone accidentally deleted.
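A bare-bones sketch of the pattern for a single table (the real triggers are generated dynamically, and every name below is illustrative; the audit tables are assumed to exist, with an identity AuditId on the header):

    CREATE TRIGGER trg_Customers_AuditUpdate
    ON dbo.Customers
    AFTER UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- One header row per statement, however many rows it touched.
        INSERT INTO dbo.Customers_AuditHeader (AuditDate, AuditUser)
        VALUES (GETDATE(), SUSER_SNAME());
        DECLARE @AuditId int;
        SET @AuditId = SCOPE_IDENTITY();
        -- One detail row per changed row, with old and new values.
        INSERT INTO dbo.Customers_AuditDetail (AuditId, CustomerID, OldName, NewName)
        SELECT @AuditId, d.CustomerID, d.Name, i.Name
        FROM deleted AS d
        JOIN inserted AS i ON i.CustomerID = d.CustomerID;
    END;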
As of SQL Server 2008 and above, you have Change Data Capture.
Triggers, although unwieldy and a maintenance nightmare, will do the job on versions prior to 2008.
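A sketch of turning Change Data Capture on (2008+ and, on those versions, Enterprise Edition only; schema and table names are placeholders):

    -- Run once per database, then once per table to capture.
    EXEC sys.sp_cdc_enable_db;
    EXEC sys.sp_cdc_enable_table
         @source_schema = N'dbo',
         @source_name   = N'Customers',
         @role_name     = NULL;  -- NULL = no gating role for readers
    -- Changes then appear in cdc.dbo_Customers_CT via the capture job.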
