SQL Azure Backup: What does transactionally consistent mean? - sql-server

I'm using redgate's sql azure backup tool: http://www.red-gate.com/products/dba/sql-azure-backup/
It looks like if you check "Make Backup Transactionally Consistent" you get charged a full day's use for sql server. I'm wondering if I need to check this.
I do daily backups to blob storage and I backup the database to my local machine to work with every 3 days or so.
If I don't check the Transactionally Consistent box, am I going to run into any problems?

Well as the person who wrote SQL Azure Backup at Red Gate I can say that the only way to create a guaranteed transactionally consistent backup in Azure currently is indeed to use CREATE DATABASE ... AS COPY OF. This copy only exists for the duration of us taking the backup and is then dropped immediately afterwards.
If you don't check the box, you'll only hit problems if there is a risk of transactions leaving the data in an inconsistent state while we read each table in turn. CREATE DATABASE ... AS COPY OF can take a very long time, and the copy itself may cost money too.
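In T-SQL terms the copy step is roughly this (a sketch; the database names are just placeholders):
    -- Run against the master database of the SQL Azure server
    CREATE DATABASE MyDb_Copy AS COPY OF MyDb;
    -- Wait for the copy to come online before reading from it
    SELECT name, state_desc FROM sys.databases WHERE name = 'MyDb_Copy';
    -- ... read the schema and data out of MyDb_Copy here ...
    -- Drop the copy as soon as the backup has been taken
    DROP DATABASE MyDb_Copy;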
If you're backing up to a BLOB you're using the Microsoft Import/Export service rather than the SQL Compare and SQL Data Compare technology, but that also reads data from the tables, so it could be inconsistent too.
Hope this helps
Richard

AFAIK transactionally consistent means that you get a snapshot of the database at a point in time. Presumably SQL Azure locks the db while it (quickly, we hope) makes a copy of the entire database, hence your one-day charge for a db that exists for only a few minutes.
This is better illustrated by a non-transactionally consistent backup, where you begin by copying table X. While you are doing that, someone amends table Y (it's a live database, after all), and table Y later gets copied to the backup. The foreign keys between X and Y might now not match because X is from an earlier point in time than Y.
I have used Sql Azure Backup and I did go for transactional consistency because the backups are for an emergency and the last thing I want in that scenario is inconsistencies in the data.
Edit: now that I think about it, Redgate should really state that if you back up every day you are effectively paying twice the rate for your database. I've been waiting for the sync framework, which I think is there now...

To answer the question in the title: a SQL Azure database copy (the 'backup') is a SQL Azure database that is copied (fully online) from the source database and contains no uncommitted transactions (ie. is transactionally consistent). This is achieved the same way database snapshots or backup restores achieve consistency on the standalone SQL Server product: all pending transactions at the moment of 'separation' are rolled back.
As to why or how RedGate's product utilizes this, I don't know. I would venture a guess that in order to achieve a 'transactionally consistent backup' they are doing a CREATE DATABASE ... AS COPY OF ... (which gives the desired transactional consistency) and then they use the technology from SQL Compare and Data Compare to copy out the schema and data.

Related

SQL Server differential backup and restore

I have a scenario in which I need to maintain a replica of an existing database.
Is there a solution to achieve the approach mentioned below?
1. Take a full backup once and restore it to a destination database.
2. Take a scheduled (e.g. every day) differential backup (only the data which has changed since the last backup) of the source database and restore it into the destination database.
This is to avoid taking a full backup and restore each time.
You can use Differential Backups, but you would need to ship a new Full backup periodically or the Differentials will continue to grow.
A better solution might be Log Shipping, where you can ship just the changes on whatever schedule you want.
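If you do go the differential route, a rough sketch of the cycle looks like this (paths and database names are placeholders; you may also need WITH MOVE if the file paths differ on the destination):
    -- On the source: one full backup, then scheduled differentials
    BACKUP DATABASE SourceDb TO DISK = N'C:\Backups\SourceDb_full.bak' WITH INIT;
    BACKUP DATABASE SourceDb TO DISK = N'C:\Backups\SourceDb_diff.bak' WITH DIFFERENTIAL, INIT;
    -- On the destination: restore the full backup once and leave the database restorable
    RESTORE DATABASE DestDb FROM DISK = N'C:\Backups\SourceDb_full.bak' WITH NORECOVERY, REPLACE;
    -- Each cycle, apply the latest differential; keep the database in STANDBY (or NORECOVERY)
    -- so later differentials can still be applied without restoring the full backup again
    RESTORE DATABASE DestDb FROM DISK = N'C:\Backups\SourceDb_diff.bak'
        WITH STANDBY = N'C:\Backups\DestDb_undo.bak';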
You can consider configuring an availability group and using a secondary SQL Server instance with asynchronous data sync. This should be considered only if the primary (the original live SQL Server) and secondary servers are in the same location/data centre. That way you don't need to take backups and restores or do any extra work beyond configuring it properly the first time.
If that is not the case (the copy should be available in another location/data centre), it would be better to go with configuring log shipping.
The first option is a lot better because it would contain an exact copy of the primary database (with a sync delay depending on various factors, probably seconds) and you can directly fail over to the secondary in case of any issues with the primary server.

Replicating a SQL Server database for read access

I have an application that is in production with its own database for more than 10 years.
I'm currently developing a new application (kind of a reporting application) that only needs read access to the database.
In order not to be too tightly coupled to the database, and to be able to use a newer DAL (Entity Framework 6 Code First), I decided to start from a new empty database and only added the tables and columns I need (with different names than the production ones).
Now I need some way to update the new database with the production database regularly (would be best if it is -almost- immediate).
I hesitated to ask this question on http://dba.stackexchange.com but I'm not necessarily limited to only using SQL Server for the job (I can develop and run some custom application if needed).
I already did some searching and found these partial solutions:
Using Transactional Replication to create a smaller database (with only the tables/columns I need). But as far as I can see, the fact that I have different table names / column names will be problematic. So I could use it to create a smaller database that is automatically replicated by SQL Server, but I would still need to replicate that database to my new one (although it might keep my production database from being stressed too much?)
Using triggers to insert/update/delete the rows
Creating some custom job (either a SQL Job or some Windows Service that runs every X minutes) that updates the necessary tables (I have a LastEditDate that is updated by a trigger on my tables, so I can know that a row has been updated since my last replication)
Do you have some advice, or maybe other solutions that I didn't foresee?
Thanks
I think that transactional replication is better than using triggers.
Too many resources would be used on the source server/database because the trigger fires for each DML transaction.
Transactional replication could be scheduled as a SQL job and run a few times a day/night, or as part of a nightly scheduled job. It really depends on how busy the source db is...
There is one more thing that you could try - DB mirroring. It depends on your SQL Server version.
If it were me, I'd use transactional replication, but keep the table/column names the same. If you have some real reason why you need them to change (I honestly can't think of any good ones and a lot of bad ones), wrap each table in a view. At least that way, the view is the documentation of where the data is coming from.
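If the names really do have to differ, the view wrapper might look something like this (table and column names are made up for illustration):
    -- The replicated table keeps its production name; the new app reads through a view
    CREATE VIEW dbo.ReportCustomer
    AS
    SELECT CustomerID   AS Id,
           CustomerName AS Name,
           LastEditDate AS ModifiedOn
    FROM dbo.Customer;  -- table delivered by transactional replication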
I'm gonna throw this out there and say that I'd use transaction log shipping. You can even set the secondary DBs to read-only. There would be some setting up (full recovery mode and transaction log backups), but that way you can just automatically restore the transaction logs to the secondary database, stay hands-off with it, and the secondary database would be as current as your last transaction log backup.
Depending on how current the data needs to be, if you only need it done daily you can set up something that will take your daily backups and then just restore them to the secondary.
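A minimal sketch of that log shipping cycle, assuming the secondary was initialised from a full backup restored WITH NORECOVERY or STANDBY (paths and names are placeholders):
    -- On the source: regular transaction log backups (requires the full recovery model)
    BACKUP LOG SourceDb TO DISK = N'C:\LogShip\SourceDb_0830.trn';
    -- On the secondary: apply each log backup in order; STANDBY keeps the copy read-only
    RESTORE LOG ReportDb FROM DISK = N'C:\LogShip\SourceDb_0830.trn'
        WITH STANDBY = N'C:\LogShip\ReportDb_undo.bak';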
In the end, we went for the trigger solution. We don't have that many changes a day (maybe 500, 1,000 tops), and it didn't put too much pressure on the current database. Thanks for your advice.
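For reference, a minimal sketch of the trigger-plus-LastEditDate approach (all object names here are made up for illustration; assumes recursive triggers are off):
    -- On each source table: keep LastEditDate current
    CREATE TRIGGER trg_Customer_Touch ON dbo.Customer
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        UPDATE c SET LastEditDate = SYSUTCDATETIME()
        FROM dbo.Customer AS c
        JOIN inserted AS i ON i.CustomerID = c.CustomerID;
    END;
    GO
    -- The sync job then pulls only the rows changed since the last run
    DECLARE @LastSyncTime datetime2 = '2015-01-01';  -- in practice read from a sync-state table
    SELECT CustomerID, CustomerName, LastEditDate
    FROM dbo.Customer
    WHERE LastEditDate > @LastSyncTime;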

The fastest backup/restore strategy for Azure SQL databases?

What is the fastest way to backup/restore Azure SQL database?
The background: We have a database with size ~40 GB, and restoring it from a .bacpac file (~4 GB of compressed data) in the native way with the Azure SQL Database Import/Export Service takes up to 6-8 hours. Creating the .bacpac is also very slow and takes ~2 hours.
UPD.
Creating a database copy (which is, by the way, transactionally consistent) using CREATE DATABASE [DBBackup] AS COPY OF [DB] takes only 15 minutes with a 40 GB database, and the restore is a simple database rename.
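For anyone doing the same, the monitoring and the rename "restore" look roughly like this (run from the master database; names as above, DB_Old is just an example):
    -- After CREATE DATABASE [DBBackup] AS COPY OF [DB], monitor progress from master
    SELECT partner_database, percent_complete FROM sys.dm_database_copies;
    -- The "restore" is then just a pair of renames
    ALTER DATABASE [DB] MODIFY NAME = [DB_Old];
    ALTER DATABASE [DBBackup] MODIFY NAME = [DB];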
UPD. Dec 2014. Let me share our experience with the fastest DB migration scheme we ended up with.
First of all, the data-tier application (.bacpac) approach turned out not to be viable for us once the DB became slightly bigger, and it also will not work for you if you have at least one non-clustered index with a total size > 2 GB, unless you disable the non-clustered indexes before export - this is due to the Azure SQL transaction log limit.
We stuck with the Azure Migration Wizard, which for data transfer just runs BCP for each table (the BCP parameters are configurable), and it's ~20% faster than the .bacpac approach.
Here are some pitfalls we encountered with the Migration Wizard:
We ran into encoding trouble with non-Unicode strings. Make sure that the BCP import and export run with the same collation. It's the -C ... configuration switch; you can find the parameters BCP is called with in the .config file of the MW application.
Take into account that MW (at least the version current at the time of writing) runs BCP with parameters that leave the constraints in a non-trusted state, so do not forget to check all non-trusted constraints after the BCP import.
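A quick way to find and re-validate them, as a sketch (the table and constraint names are hypothetical):
    -- List foreign keys and check constraints left non-trusted by the bulk load
    SELECT name, OBJECT_NAME(parent_object_id) AS table_name
    FROM sys.foreign_keys WHERE is_not_trusted = 1
    UNION ALL
    SELECT name, OBJECT_NAME(parent_object_id)
    FROM sys.check_constraints WHERE is_not_trusted = 1;
    -- Re-validate a constraint so the optimizer can trust it again
    ALTER TABLE dbo.MyTable WITH CHECK CHECK CONSTRAINT FK_MyTable_Parent;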
If your database is 40 GB, it's long past time to consider having a redundant database server that's ready to go as soon as the main one becomes faulty.
You should have a second server running alongside the main DB server that has no actual routines except to sync with the main server on an hourly/daily basis (depending on how often your data changes, and how long it takes to run this process). You can also consider creating backups from this database server, instead of the main one.
If your main DB server goes down - for whatever reason - you can change the host address in your application to the backup database, and spend the 8 hours debugging your other server, instead of twiddling your thumbs waiting for the Azure Portal to do its thing while your clients complain.
Your database shouldn't be taking 6-8 hours to restore from backup though. If you are including upload/download time in that estimate, then you should consider storing your data in the Azure datacenter, as well as locally.
For more info see this article on Business Continuity on MSDN:
http://msdn.microsoft.com/en-us/library/windowsazure/hh852669.aspx
You'll want to specifically look at the Database Copies section, but the article is worth reading in full if your DB is so large.
Azure now supports Point-in-time restore, Geo-restore and Geo-DR features. You can use a combination of these to get quick backup / restore. PiTR and Geo-restore come at no additional cost, while you have to pay for a geo-replica.
There are multiple ways to do backup, restore and copy jobs on Azure.
Point in time restore.
The Azure service takes full backups, multiple differential backups and t-log backups every 5 minutes.
Geo Restore
Same as Point in time restore. The only difference is that it picks up a redundant copy from a different blob storage account stored in a different region.
Geo-Replication
Same as SQL Availability Groups: 4 async replicas with read capabilities. Select a region to become a hot standby.
More on Microsoft Site here. Blog here.
Azure SQL Database already has these local replicas that Liam is referring to. You can find more details on these three local replicas here http://social.technet.microsoft.com/wiki/contents/articles/1695.inside-windows-azure-sql-database.aspx#High_Availability_with_SQL_Azure
Also, SQL Database recently introduced new service tiers that include new point-in-time-restore. Full details at http://msdn.microsoft.com/en-us/library/azure/hh852669.aspx
The key is to use the right data management strategy as well, one that helps you meet your objective. The wrong architecture and an approach of putting everything on the cloud can prove disastrous... here's more reading on it - http://archdipesh.blogspot.com/2014/03/windows-azure-data-strategies-and.html

SQL Server backup/restore vs. detach/attach

I have one database which contains the most recent data, and I want to replicate the database content to some other servers. For non-technical reasons, I cannot directly use replication or sync functionality to sync to the other SQL Server instances.
Now, I have two solutions, and I want to learn the pros and cons for each solution. Thanks!
Solution 1: detach the source database which contains the most recent data, then copy it to the destination servers which need the most recent data, and attach the database at the destination servers;
Solution 2: make a full backup of the whole database on the source server, then copy the backup to the destination servers and perform a full restore on the destination server side.
thanks in advance,
George
The Detach / Attach option is often quicker than performing a backup as it doesn't have to create a new file. Therefore, the time from Server A to Server B is almost purely the file copy time.
The Backup / Restore option allows you to perform a full backup, restore that, then perform a differential backup which means your down time can be reduced between the two.
If it's data replication you're after, does that mean you want the database functional in both locations? In that case, you probably want the backup / restore option as that will leave the current database fully functional.
EDIT: Just to clarify a few points. By downtime I mean that if you're migrating a database from one server to another, you generally will be stopping people using it whilst it's in transit. Therefore, from the "stop" point on Server A up to the "start" point on Server B this could be considered downtime. Otherwise, any actions performed on the database on server A during transit will not be replicated onto server B.
In regards to the "create a new file" point: if you detach a database you can copy the MDF file immediately; it's already there, ready to be copied. However, if you perform a backup, you have to wait for the .BAK file to be created and then move it to its new location for a restore. Again, this all comes down to whether this is a snapshot copy or a migration.
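For completeness, the detach / attach pair looks something like this (server paths and the database name are placeholders):
    -- On the source server (requires exclusive access to the database)
    EXEC sp_detach_db @dbname = N'MyDb';
    -- Copy MyDb.mdf and MyDb_log.ldf to the destination server, then attach them:
    CREATE DATABASE MyDb
    ON (FILENAME = N'D:\Data\MyDb.mdf'),
       (FILENAME = N'D:\Data\MyDb_log.ldf')
    FOR ATTACH;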
Backing up and restoring makes much more sense, even if you might eke out a few extra minutes from a detach/attach option instead. You have to take the original database offline (disconnect everyone) prior to a detach, and then the db is unavailable until you reattach. You also have to keep track of all of the files, whereas with a backup all of the files are grouped. And with the most recent versions of SQL Server the backups are compressed.
And just to correct something: DB backups and differential backups do not truncate the log, and do not break the log chain.
In addition, the COPY_ONLY functionality only matters for the differential base, not for the LOG. All log backups can be applied in sequence from any backup assuming there was no break in the log chain. There is a slight difference with the archive point, but I can't see where that matters.
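If you want a one-off full backup that does not disturb the differential base of an existing backup schedule, a copy-only backup is the way to do it (a sketch; the path and name are placeholders):
    -- Full backup that leaves the differential base of the regular chain untouched
    BACKUP DATABASE MyDb TO DISK = N'C:\Backups\MyDb_copyonly.bak' WITH COPY_ONLY, INIT;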
Solution 2 would be my choice... primarily because it won't create any downtime on the source database. The only disadvantage I can see is that, depending on the database recovery model, the transaction log will be truncated, meaning if you wanted to restore any data from the transaction log you'd be stuffed; you'd have to use your backup file.
EDIT: Found a nice link; http://sql-server-performance.com/Community/forums/p/5838/35573.aspx

Database replication. 2 servers, Master database and the 2nd is read-only

Say you have 2 database servers, one database is the 'master' database where all write operations are performed, it is treated as the 'real/original' database. The other server's database is to be a mirror copy of the master database (slave?), which will be used for read only operations for a certain part of the application.
How do you go about setting up a slave database that mirrors the data on the master database? From what I understand, the slave/read-only database is supposed to use the master db's transaction log file to mirror the data, correct?
What options do I have in terms of how often the slave db mirrors the data? (real time/every x minutes?).
What you want is called Transactional Replication in SQL Server 2005. It will replicate changes in near real time as the publisher (i.e. "master") database is updated.
Here is a pretty good walk through of how to set it up.
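Very roughly, the T-SQL side of setting that up looks like this (a condensed sketch: it assumes a distributor is already configured, and all server, database and object names are made up):
    -- Run in the publisher ("master") database
    EXEC sp_replicationdboption @dbname = N'MasterDb', @optname = N'publish', @value = N'true';
    EXEC sp_addpublication @publication = N'ReadOnlyPub', @status = N'active';
    EXEC sp_addpublication_snapshot @publication = N'ReadOnlyPub';
    EXEC sp_addarticle @publication = N'ReadOnlyPub', @article = N'Orders',
         @source_owner = N'dbo', @source_object = N'Orders';
    -- Push the publication to the read-only ("slave") server
    EXEC sp_addsubscription @publication = N'ReadOnlyPub', @subscriber = N'ReadOnlyServer',
         @destination_db = N'MasterDb_Replica', @subscription_type = N'Push';
    EXEC sp_addpushsubscription_agent @publication = N'ReadOnlyPub',
         @subscriber = N'ReadOnlyServer', @subscriber_db = N'MasterDb_Replica';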
SQL Server 2008 has three different modes of replication.
Transactional for one way read only replication
Merge for two way replication
Snapshot
From what I understand, the slave/readonly database is to use the master db's transaction log file to mirror the data correct?
What options do I have in terms of how often the slave db mirrors the data? (real time/every x minutes?).
This sounds like you're talking about log shipping instead of replication. For what you're planning on doing though I'd agree with Jeremy McCollum and say do transactional replication. If you're going to do log shipping when the database is restored every x minutes the database won't be available.
Here's a good walkthrough of the difference between the two. Sad to say you have to sign up for an account to read it though. =/ http://www.sqlservercentral.com/articles/Replication/logshippingvsreplication/1399/
The answer to this will vary depending on the database server you are using to do this.
Edit: Sorry, maybe I need to learn to look at the tags and not just the question - I can see you tagged this as sql-server.
Transactional replication is real time.
If you do not have any updates to be done on your database, and what you need is just retrieval of data, say once a day, then use snapshot replication instead of transactional replication. In snapshot replication, changes will replicate when and as defined by the user, say once every 24 hrs.
