Confirming data in a SQL Server 2008 mirror - sql-server

I have a SQL Server 2008 database set up for mirroring and was wondering if there was any way to generate a report for an audit showing that the data is being mirrored correctly and failing over would not result in any data loss. I can show using the database mirroring monitor that data is being transferred, but need a way to verify that the data matches (preferably without having to break the mirror).

Just query sys.database_mirroring: if mirroring_state_desc is 'SYNCHRONIZED', the data is in the mirror. Make sure the transaction safety level (mirroring_safety_level_desc) is FULL to guarantee no data loss on failover; see Mirroring states:
If transaction safety is set to FULL automatic failover and manual failover are both supported in the SYNCHRONIZED state, there is no data loss after a failover.
If transaction safety is off, some data loss is always possible, even in the SYNCHRONIZED state.
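The check above can be sketched as a single query against the catalog view (the column names come from sys.database_mirroring; run it on the principal):

```sql
-- One row per database; non-mirrored databases have a NULL mirroring_guid.
SELECT DB_NAME(database_id)       AS database_name,
       mirroring_state_desc,          -- want 'SYNCHRONIZED'
       mirroring_safety_level_desc    -- want 'FULL' for no-data-loss failover
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;
```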
If the auditors don't trust the official product documentation, you can show the data content of a database snapshot of the mirror, since mirrors themselves are not accessible. See Database Snapshots. Obviously, to do a meaningful comparison with a frozen snapshot you would have to freeze the source first, take the snapshot on the mirror, run the comparison, then unfreeze the source. That implies the database is read-only for the duration: any change will cause it to diverge from the snapshot and fail the comparison. An exercise in futility, with downtime, since the documentation clearly states that a synchronized, fully protected mirror is guaranteed to be identical to the source.
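Creating such a snapshot on the mirror server looks roughly like this (database name, logical file name, and path are placeholders):

```sql
-- Run on the mirror server; the snapshot is a read-only, point-in-time view
-- that can be queried even though the mirror itself cannot.
CREATE DATABASE MyDb_AuditSnapshot
ON ( NAME = MyDb_Data,   -- logical name of the mirrored database's data file
     FILENAME = 'D:\Snapshots\MyDb_AuditSnapshot.ss' )
AS SNAPSHOT OF MyDb;
```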

Related

Is it possible to create a lock free SNAPSHOT transaction in SQL Server?

TL;DR: Is it possible to basically create a fast, temporary, "fork" of a database (like a snapshot transaction) without any locks given that I know for a fact that the changes will never be committed and always be rolled back.
Details:
I'm currently working with SQL Server and am trying to implement a feature where the user can try all sorts of stuff (in the application) that is never persisted in the database.
My first instinct was to (mis)use snapshot transactions for that to basically "fork" the database into a short lived (under 15min) user-specific context. The rest of the application wouldn't even have to know that all the actions the user performs will later be thrown away (I currently persist the connection across requests - it's a web application).
The problem is that there are situations where the snapshot transaction blocks and waits for other transactions to complete. My guess is that this happens because SQL Server has to make sure it can merge the data if one of the open transactions commits, but in my case I know for a fact that I will never commit the changes from this transaction and will always throw the data away (note that not everything happens in this transaction; there are other things a user can do that happen on a different connection and are persisted).
Are there other ideas that don't involve cloning the database (too large/slow) or updating/changing the schema of all tables? (I'd like to avoid "poisoning" the schema with the implementation detail of the "try out" feature.)
No. SQL Server has copy-on-write Database Snapshots, but the snapshots are read-only. So where a SNAPSHOT transaction acquires regular exclusive locks when it modifies the database, a Database Snapshot would just give you an error.
There are storage technologies that can create a writable copy-on-write storage snapshot, like NetApp. You would run a command to create a new LUN that is a snapshot of an existing LUN, present it to your server as a disk, mount its volume in a folder or drive letter, and attach the files you find there as a database. This is often done for cloning across environments to refresh dev/test with prod data without having to copy all the data. But it seems like way too much infrastructure work for your use case.
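Once such a snapshot volume is mounted, the SQL Server side of the procedure is just a file attach (database name and file paths here are illustrative):

```sql
-- Attach the data and log files found on the mounted snapshot volume
-- as a new, independent (and writable) database.
CREATE DATABASE ProdCopy
ON (FILENAME = 'E:\MountedSnap\Prod.mdf'),
   (FILENAME = 'E:\MountedSnap\Prod_log.ldf')
FOR ATTACH;
```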

Azure SQL - Automatic Tuning with Geo-replication - Server in unspecified state and query store has reached its capacity limit

I have a primary db and a secondary geo-replicated db.
On the primary, the server's automatic tuning is turned on.
On the replica when I try to do the same I encounter the following issues.
The database is inheriting settings from the server, but the server is
in the unspecified state. Please specify the automatic tuning state on
the server.
And
Automated recommendation management is disabled because Query Store
has reached its capacity limit and is not collecting new data. Learn
more about the retention policies to maintain Query Store so new data
can be collected.
However, on the server, tuning options are on, so I don't understand that "unspecified state". Moreover, when I look at the Query Store setup in both databases' properties in SSMS, they are exactly the same, with 9MB of space available out of 10MB.
Note: both databases are set up on the 5 DTUs Basic pricing plan.
UPDATE
While the primary db Query store Operation Mode is Read Write, the replica is Read Only. It seems I cannot change it (I couldn't from the properties dialog of the db in SSMS).
Fair enough, but how can the same query be 10 times faster on the primary than on the replica? Aren't optimizations copied across?
UPDATE 2
Actually the Query Stores are viewable in SSMS and I can see that they are identical in both dbs. I think the difference in response times that I observe is not related.
UPDATE 3
I marked #vCillusion's post as the answer as he/she deserves credit. However, it's too detailed with regard to the actual issue.
My replica is read-only and as such cannot be auto-tuned, as this would require writing to the query store. Azure not being able to collect any data into the read-only query store led to a misleading (and wrong) error message about the query store reaching its capacity.
We get this message only when the Query Store is in read-only mode. Double check your query store configuration.
According to MSDN, you might need to consider below:
To recover Query Store try explicitly setting the read-write mode and recheck actual state.
ALTER DATABASE [QueryStoreDB]
SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO
SELECT actual_state_desc, desired_state_desc, current_storage_size_mb,
max_storage_size_mb, readonly_reason, interval_length_minutes,
stale_query_threshold_days, size_based_cleanup_mode_desc,
query_capture_mode_desc
FROM sys.database_query_store_options;
If the problem persists, it indicates that the Query Store data persisted on disk is corrupted. We can recover the Query Store by executing the sp_query_store_consistency_check stored procedure within the affected database.
If that did not help, you could try to clear Query Store before requesting read-write mode.
ALTER DATABASE [QueryStoreDB]
SET QUERY_STORE CLEAR;
GO
ALTER DATABASE [QueryStoreDB]
SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO
SELECT actual_state_desc, desired_state_desc, current_storage_size_mb,
max_storage_size_mb, readonly_reason, interval_length_minutes,
stale_query_threshold_days, size_based_cleanup_mode_desc,
query_capture_mode_desc
FROM sys.database_query_store_options;
If you checked it and it's in read-write mode, then we might be dealing with some bug here. Please provide feedback to Microsoft on this.
Additional points of limitation in query store:
Also, note Query Store feature is introduced to monitor performance and is still evolving. There are certain known limitations around it.
As of now, it does not work on Read-Only databases (Including read-only AG replicas). Since readable secondary Replicas are read-only,
the query store on those secondary replicas is also read-only. This
means runtime statistics for queries executed on those replicas are
not recorded into the query store.
Check database is not Read-Only.
Query Store does not work for system databases like master or tempdb.
Check that the PRIMARY filegroup has enough space, since the data is stored only in the PRIMARY filegroup.
The supported scenario is that automatic tuning needs to be enabled on the primary only. Indexes automatically created on the primary will be automatically replicated to the secondary. This process takes the usual sync-up time between the primary and the secondary. At this time it is not possible to have the secondary read-only replica tuned differently for performance than the primary. The query store error message is due to its read-only state as noted above, and should be disregarded. The performance issue on your secondary replica most likely has some other cause.

cdc when source system is sql server azure/standard 2008+

Let us say my target staging db/data warehouse is SQL Server 2008+ Enterprise. However, my source systems are SQL Server Azure/Standard 2008+. Can I still exploit CDC? As far as I understand, I cannot, as I have to turn CDC on in the source systems and it is only available for Enterprise editions. Is this correct? I am also curious what happens if the transaction log is truncated. Thanks.
I just googled it and... if you need this for replicating into a data warehouse you probably only need change tracking https://technet.microsoft.com/en-us/library/cc280519(v=sql.105).aspx. This http://azure.microsoft.com/en-us/documentation/articles/sql-database-preview-whats-new/ says change tracking is available in Azure.
I don't see any specific info anywhere about whether change tracking uses the transaction log, but this info is in one of the links:
The tracking mechanism in change data capture involves an asynchronous
capture of changes from the transaction log so that changes are
available after the DML operation. In change tracking, the tracking
mechanism involves synchronous tracking of changes in line with DML
operations so that change information is available immediately.

what is the best way to replicate database for SSRS

I have installed a SQL Server database (mainServer) in one instance and a SQL Server database for ReportServer in another. What is the best way to replicate data from mainServer to the report server? Data in mainServer changes frequently, and up-to-date information in the ReportServer is very important.
There are many ways to do this:
mirroring
log shipping
transactional replication
merge replication
snapshot replication
are there some best-practices about this?
Thanks
You need Transactional Replication for your case. Here is why you would not need the other four options:
Mirroring
This is generally used to increase the availability of a database server and provides for automatic failover in case of a disaster.
Typically, even though you have more than a single copy of the database (recommended to be on different server instances), only one of them is active at a time, called the principal server.
Every operation on this server instance is mirrored on the others continuously (as soon as possible), so this doesn't fit your use case.
Log Shipping
In this case, apart from the production database servers, you have extra failover servers such that the backup of the production server's database, differential & transactional logs are automatically shipped (copied) to the failovers, and restored.
The replication here is relatively scheduled to be at a longer interval of time than the other mechanisms, typically ranging from an hour to a couple of hours.
This also provides for having the failover servers readied manually in case of a disaster at the production sites.
This also doesn't fit your use case.
Merge Replication
The key difference between this and the others is that the replicated database instances can communicate to the different client applications independent of the changes being made to each other.
For example a database server in North America being updated by clients across Americas & Europe and another one in Australia being updated by clients across the Asia-Pacific region, and then the changes being merged to one another.
Again, it doesn't fit your use case.
Snapshot Replication
The whole snapshot of the database is published to be replicated to the secondary database (different from just the log files being shipped for replication.)
Initially, however, for each type of replication a snapshot is generated to initialize the subscribing database, i.e., only once.
Why you should use Transactional Replication?
You can choose the objects (tables, views, etc.) to be replicated continuously, so if only a subset of the tables is used for reporting, it would save a lot of bandwidth. This is not possible in Mirroring and Log Shipping.
You can redirect traffic from your application to the reporting server for all the reads and reports (which you can also do in others too, btw).
You can have independent batch jobs generating some of the more used reports running on the reporting server, reducing the load on the main server if it has quite frequent Inserts, Updates or Deletes.
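A minimal sketch of setting up such a publication with the standard replication stored procedures (database, publication, and table names are placeholders; the distributor and subscription setup are omitted):

```sql
-- On the publisher: mark the database as published.
EXEC sp_replicationdboption @dbname = N'MainDb',
     @optname = N'publish', @value = N'true';

-- Create a transactional publication; 'continuous' means changes
-- flow to subscribers as they are committed.
EXEC sp_addpublication @publication = N'ReportingPub',
     @repl_freq = N'continuous';

-- Add only the tables the reports actually need.
EXEC sp_addarticle @publication = N'ReportingPub',
     @article = N'Orders', @source_object = N'Orders';
```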
Going through your list from top to bottom.
Mirroring: If you mirror your data from your mainServer to your reportServer, you will not be able to access your reportServer. Mirroring puts the mirrored database into a continuous restoring state. Mirroring is a High Availability solution; in your case the reportServer will only be available to query if you do a failover. The mirrored server is never operational till failover. This is not what you want, as you cannot use the reportServer till it is operational.
Log Shipping: Log shipping will allow you to apply transaction log backups on a scheduled basis to the reportServer. If you back up the transaction log every 15 minutes and apply the data to the reportServer, you will have a delay of 15+ minutes between your mainServer and reportServer. Mirroring is actually real-time log shipping. Depending on how you set up log shipping, your clients will have to disconnect while the database is busy restoring the log files. Thus during a long restore it might be impossible to use reporting. Log Shipping is also a High Availability feature and not really useful for reporting. See this link for a description of trying to access a database while it is restoring: http://social.msdn.microsoft.com/forums/en-US/sqldisasterrecovery/thread/c6931747-9dcb-41f6-bdf4-ae0f4569fda7
Replication: I am lumping all the replication types together here. Replication, especially transactional replication, can help you scale out your reporting needs. It would generally be much easier to implement, and you would also be able to report on the data all of the time, whereas with mirroring you can't report on the data at all and with transaction log shipping you will have gaps. So in your case replication makes much more sense. Snapshot replication would be useful if your reports could be, say, a day old. You can make a snapshot every morning of the data you need from mainServer and publish this to the subscriber reportServer. However, if the database is extremely large, then a snapshot is going to be problematic to deal with on a daily basis. Merge replication is only useful when you want to update the replicated data; in your case you want a read-only copy of the data to report on, so merge replication is not going to help. Transactional replication would allow you to send changes across the wire as they happen. In your case, where you need frequently updated information in your reportServer, this would be extremely useful. I would suggest this route for you.
Just remember that by implementing replication/mirroring/log shipping you are creating more maintenance work. Replication CAN fail. So can mirroring, and so can transaction log shipping. You will need to monitor these solutions to make sure they are running smoothly. So the question is: do you really need to scale out your reports to another server, or should you maybe spend time identifying why you can't report on the production server?
Hope that helps!

Is it possible to have secondary server available read-only in a log shipping scenario?

I am looking into using log shipping in a SQL Server 2005 environment. The idea was to set up frequent log shipping to a secondary server. The intent: Use the secondary server to serve report queries, thereby offloading the primary db server.
I came across this on a sqlservercentral forum thread:
When you create the log shipping you have 2 choices. You can configure restore log operation to be done with norecovery or with standby option. If you use the norecovery option, you can not issue select statements on it. If instead of norecovery you use the standby option, you can run select queries on the database.
Bear in mind that with the standby option, when log file restores occur, users will be kicked out without warning by the restore process. Actually, when you configure log shipping with the standby option, you can also select between two choices: kill all processes in the secondary database and perform the log restore, or don't perform the log restore if the database is being used. Of course, if you select the second option, the restore operation might never run if someone opens a connection to the database and doesn't close it, so it is better to use the first option.
So my questions are:
Is the above true? Can you really not use log shipping in the way I intend?
If it is true, could someone explain why you cannot execute SELECT statements to a database while the transaction log is being restored?
EDIT:
First question is duplicate of this serverfault question. But I still would like the second question answered: Why is it not possible to execute SELECT statements while the transaction log is being restored?
could someone explain why you cannot
execute SELECT statements to a
database while the transaction log is
being restored?
Short answer is that RESTORE statement takes an exclusive lock on the database being restored.
For writes, I hope there is no need for me to explain why they are incompatible with a restore. Why does it not allow reads either? First of all, there is no way to know if a session that has a lock on a database is going to do a read or a write. But even if it would be possible, restore (log or backup) is an operation that updates directly the data pages in the database. Since these updates go straight to the physical location (the page) and do not follow the logical hierarchy (metadata-partition-page-row), they would not honor possible intent locks from other data readers, and thus have the possibility to change structures as they are read. A SELECT table scan following the page next-prev pointers would be thrown into disarray, resulting in a corrupted read.
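The two restore modes the thread contrasts look like this (database name and file paths are placeholders):

```sql
-- NORECOVERY: the database stays in the RESTORING state; no reads are
-- allowed, but further log backups can still be applied.
RESTORE LOG ReportDb
FROM DISK = 'D:\LogShip\ReportDb_1015.trn'
WITH NORECOVERY;

-- STANDBY: uncommitted transactions are undone into a side file, so the
-- database is readable between restores; the restore itself still needs
-- exclusive access and disconnects any readers.
RESTORE LOG ReportDb
FROM DISK = 'D:\LogShip\ReportDb_1030.trn'
WITH STANDBY = 'D:\LogShip\ReportDb_undo.dat';
```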
Well yes and no.
You can do exactly what you wish to do, in that you may offload reporting workloads to a secondary server by configuring Log Shipping to a read only copy of a database. I have set this type of architecture up on a number of occasions previously and it works very well indeed.
The caveat is that in order to perform a restore of a Transaction Log Backup file there must be no other connections to the database in question. Hence the two choices being, when the restore process runs it will either fail, thereby prioritising user connections, or it will succeed by disconnecting all user connection in order to perform the restore.
Depending on your restore frequency this is not necessarily a problem. You simply educate your users to the fact that, say, every hour at 10 past the hour, there is a possibility that their report may fail. If this happens, they simply re-run the report.
EDIT: You may also want to evaluate alternative architecture solutions to your business need, for example Transactional Replication, or Database Mirroring with a Database Snapshot.
If you have enterprise version, you can use database mirroring + snapshot to create read-only copy of the database, available for reporting, etc. Mirroring uses "continuous" log shipping "under the hood". It is frequently used in scenario you have described.
Yes it's true.
I think the following happens:
While the transaction log is being restored, the database is locked, as large portions of it are being updated.
This is for performance reasons more than anything else.
I can see two options:
Use database mirroring.
Schedule the log shipping to only occur when the reporting system is not in use.
There is slight confusion in that the NORECOVERY flag on the restore means your database is not going to be brought out of the recovery state and into an online state; that is why the SELECT statements will not work: the database is offline. The NORECOVERY flag is there to allow you to restore multiple log files in a row (in a DR-type scenario) without bringing the database back online.
If you do not want log shipping or its disadvantages, you could swap to one-way transactional replication, but the overhead/setup will be more complex overall.
Would peer-to-peer replication work? Then you can run queries on one instance and so save load on the original instance.
