The fastest backup/restore strategy for Azure SQL databases? - sql-server

What is the fastest way to backup/restore Azure SQL database?
The background: we have a database of ~40 GB, and restoring it from a .bacpac file (~4 GB of compressed data) the native way, via the Azure SQL Database Import/Export Service, takes up to 6-8 hours. Creating the .bacpac is also very slow and takes ~2 hours.
UPD.
Creating a database copy (which, by the way, is transactionally consistent) using CREATE DATABASE [DBBackup] AS COPY OF [DB] takes only ~15 minutes for a 40 GB database, and the restore is a simple database rename.
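For reference, a minimal sketch of that copy-then-rename approach (database names are placeholders; both the copy and the rename are issued while connected to the master database of the logical server):

    -- run while connected to master
    CREATE DATABASE [DBBackup] AS COPY OF [DB];

    -- later, to 'restore': move the broken database out of the way and promote the copy
    ALTER DATABASE [DB] MODIFY NAME = [DB_Broken];
    ALTER DATABASE [DBBackup] MODIFY NAME = [DB];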
UPD. Dec 2014. Let me share the fastest DB migration scheme we ended up with.
First of all, the data-tier application (.bacpac) approach turned out not to be viable for us once the DB grew slightly bigger, and it also will not work for you if you have at least one non-clustered index with a total size > 2 GB, unless you disable non-clustered indexes before the export - this is due to the Azure SQL transaction log limit.
We settled on the Azure Migration Wizard, which for data transfer simply runs BCP for each table (the BCP parameters are configurable), and it's ~20% faster than the .bacpac approach.
Here are some pitfalls we encountered with the Migration Wizard:
We ran into encoding trouble with non-Unicode strings. Make sure that the BCP export and import run with the same code page - that's the -C ... switch; you can find the parameters with which BCP is called in the .config file of the MW application.
Take into account that MW (at least the version current at the time of this writing) runs BCP with parameters that leave constraints in a non-trusted state, so do not forget to re-check all non-trusted constraints after the BCP import (see the sketch below).
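After the import you can list the untrusted constraints and re-validate them with something like the following (a minimal sketch; dbo.YourTable and FK_YourConstraint are placeholders for your own objects):

    -- foreign keys and check constraints left untrusted after a bulk load
    SELECT OBJECT_NAME(parent_object_id) AS table_name, name AS constraint_name
    FROM sys.foreign_keys WHERE is_not_trusted = 1
    UNION ALL
    SELECT OBJECT_NAME(parent_object_id), name
    FROM sys.check_constraints WHERE is_not_trusted = 1;

    -- re-validate a constraint so the optimizer can trust it again
    ALTER TABLE dbo.YourTable WITH CHECK CHECK CONSTRAINT FK_YourConstraint;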

If your database is 40 GB, it's long past time to consider having a redundant database server that's ready to go as soon as the main one becomes faulty.
You should have a second server running alongside the main DB server that has no actual routines except to sync with the main server on an hourly/daily basis (depending on how often your data changes, and how long it takes to run this process). You can also consider creating backups from this database server, instead of the main one.
If your main DB server goes down - for whatever reason - you can change the host address in your application to the backup database, and spend the 8 hours debugging your other server, instead of twiddling your thumbs waiting for the Azure Portal to do its thing while your clients complain.
Your database shouldn't be taking 6-8 hours to restore from backup though. If you are including upload/download time in that estimate, then you should consider storing your data in the Azure datacenter, as well as locally.
For more info see this article on Business Continuity on MSDN:
http://msdn.microsoft.com/en-us/library/windowsazure/hh852669.aspx
You'll want to specifically look at the Database Copies section, but the article is worth reading in full if your DB is so large.

Azure now supports Point-in-Time Restore, Geo-Restore and Geo-Replication (GeoDR) features. You can use a combination of these for quick backup/restore. PiTR and Geo-Restore come at no additional cost, while you have to pay for the Geo-Replica.

There are multiple ways to do backup, restore and copy jobs on Azure.
Point in time restore
The Azure service takes full backups, multiple differential backups and t-log backups every 5 minutes.
Geo Restore
Same as Point in time restore; the only difference is that it restores from a redundant copy held in blob storage in a different region.
Geo-Replication
Similar to SQL Availability Groups: up to 4 async replicas with read capability. Select a region to become a hot standby.
More on Microsoft Site here. Blog here.

Azure SQL Database already has these local replicas that Liam is referring to. You can find more details on these three local replicas here http://social.technet.microsoft.com/wiki/contents/articles/1695.inside-windows-azure-sql-database.aspx#High_Availability_with_SQL_Azure
Also, SQL Database recently introduced new service tiers that include new point-in-time-restore. Full details at http://msdn.microsoft.com/en-us/library/azure/hh852669.aspx

The key is also to use the right data management strategy, one that helps you meet your objective. The wrong architecture, and an approach of putting everything in the cloud, can prove disastrous... here's more reading on it - http://archdipesh.blogspot.com/2014/03/windows-azure-data-strategies-and.html

Related

copy Azure SQL database (PaaS) to IaaS (SQL server on VM)

Is it possible to use CREATE DATABASE [] AS COPY OF [] to create a copy of a database hosted as Azure SQL Database (PaaS) on IaaS (SQL Server on a VM)?
Can you recommend an alternative to Import/Export that can limit the downtime of such a transition?
The reason for this migration is the restriction on cross-database queries in PaaS mode, which complicates a one-time migration to the new database used by the newer application version.
The answer depends on whether you want to copy database schema, data, or both.
As Jaxidian said, the ApexSQL tools can do the job, but as far as I know Data Diff will only synchronize database data, while Diff will synchronize the schema.
Here is the article describing processes of copying database data:
https://solutioncenter.apexsql.com/how-to-automatically-synchronize-the-data-in-two-sql-server-databases-on-a-schedule/
If you want to copy both schema and data, process is described here:
https://solutioncenter.apexsql.com/how-to-automatically-compare-and-synchronize-multiple-databases-on-different-sql-server-instances/
There are lots of tools available that can accomplish this. Which one is best for you depends on your needs. However, the "Copy" feature in the Azure Portal will not accomplish this for you but can be a partial solution to the approach you finalize on.
I'll make the following assumptions:
You have an always-on 24/7 production load so there are no regularly/nightly/weekly/monthly maintenance windows
You can schedule a maintenance window but you wish to keep it as small as possible
You can easily configure your applications' connectionstrings
Your database isn't huge. Gigabytes is fine.
Your database is mostly static data (i.e. an incremental approach is much faster than a dump-and-fill)
If I were to do this today/right now, my approach would be like this (this is only one option):
Use the Copy feature to make a copy of the database that I can use as a staging area/reference point, while minimizing the load on the Production database
Create a backup (bacpac file) from the copied database
Restore the bacpac file onto your IaaS-hosted SQL Server to form your base deployment
Start your maintenance window and effectively put your database into read-only mode so the data is no longer changing (there are lots of strategies for this, whether you turn applications off, revoke permissions, etc. - see the permissions sketch after this list)
Use a tool such as ApexSQL Data Diff (Redgate and others have options) to compare data between the two databases and sync the data over to the new IaaS DB. Be careful - depending on your data needs you may have to tweak the generated scripts that sync the data.
Verify that the new DB is now indeed a duplicate copy of your old DB (ApexSQL Data Diff can also help with this - several options exist here)
Change connectionstrings on your apps to point to the new DB
Turn applications back on and end your maintenance window.
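One of many ways to implement the read-only step from the maintenance-window item above (a hedged sketch; app_user is a placeholder for whatever database user your applications connect as, and this only covers objects in the dbo schema):

    -- block writes from the application account for the duration of the window
    DENY INSERT, UPDATE, DELETE ON SCHEMA::dbo TO app_user;

    -- after cut-over (or to roll back), remove the explicit deny again
    REVOKE INSERT, UPDATE, DELETE ON SCHEMA::dbo FROM app_user;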
So of course, if you do something like this, practice it numerous times and test the results numerous times well before your maintenance window. Get a good idea of the timing for everything, especially how long it will take for you to generate and restore the bacpac file. This is because you want to do that as late as possible before your maintenance window to minimize the time it takes to generate and run the final "Data Diff" script that you'll use. The longer that script takes, the longer your outage will be.

SQL Azure Backup: What does transactionally consistent mean?

I'm using redgate's sql azure backup tool: http://www.red-gate.com/products/dba/sql-azure-backup/
It looks like if you check "Make Backup Transactionally Consistent" you get charged a full day's use for sql server. I'm wondering if I need to check this.
I do daily backups to blob storage and I backup the database to my local machine to work with every 3 days or so.
If I don't check the Transactionally Consistent box, am I going to run into any problems?
Well as the person who wrote SQL Azure Backup at Red Gate I can say that the only way to create a guaranteed transactionally consistent backup in Azure currently is indeed to use CREATE DATABASE ... AS COPY OF. This copy only exists for the duration of us taking the backup and is then dropped immediately afterwards.
If you don't check the box you'll only hit problems if there is a risk of transactions being in an inconsistent state when reading the data from each table in turn. CREATE COPY OF can take a very long time and also may cost money for the copy too.
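If you drive CREATE ... AS COPY OF yourself, you can watch the copy's progress from the master database of the destination server (a small sketch; the DMV only has rows while a copy is in progress):

    -- run in master on the server receiving the copy
    SELECT d.name, c.start_date, c.percent_complete, c.error_desc
    FROM sys.dm_database_copies AS c
    JOIN sys.databases AS d ON d.database_id = c.database_id;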
If you're backing up to a BLOB you're using the Microsoft Import/Export Service rather than the SQL Compare and SQL Data Compare technology, but that also reads data from the tables in turn, so it could be inconsistent too.
Hope this helps
Richard
AFAIK transactionally consistent means that you get a snapshot of the database at a point in time (which presumably means SQL Azure locks the db while (quickly we hope) it makes a copy of the entire database = your one day charge for a db that exists for only a few minutes).
This is better illustrated by a non-transactionally consistent backup, where you begin by copying table X. While you are doing that, someone amends (as it's a live database) table Y, which later gets copied to the backup. The foreign keys between X and Y might now not match because X is from an earlier point in time than Y.
I have used Sql Azure Backup and I did go for transactional consistency because the backups are for an emergency and the last thing I want in that scenario is inconsistencies in the data.
edit: now that I think about it, Redgate should really state that if you back up every day you are effectively paying twice the rate for your database. I've been waiting for the sync framework, which I think is there now...
To answer the question in the title: a SQL Azure database copy (the 'backup') is a SQL Azure database that is copied (fully online) from the source database and contains no uncommitted transactions (ie. is transactionally consistent). This is achieved the same way database snapshots or backup restores achieve consistency on the standalone SQL Server product: all pending transactions at the moment of 'separation' are rolled back.
As to why or how RedGate's product utilizes this, I don't know. I would venture a guess that in order to achieve a 'transactionally consistent backup' they are doing a CREATE DATABASE ... AS COPY OF ... (which creates the desired transactional consistency) and then they use the technology from SQL Compare and Data Compare to copy out the schema and data.

Database mirroring/Replication, SQL Server 2005

I have two database servers running SQL Server 2005 Enterprise, and I want to make one of them a mirror database server.
What I need is to create an exact copy of the database from the primary server on the mirror server, so that when the primary server goes down, we can switch the database IP in the application to use the mirror server.
I have examined "mirror" feature on SQL Server 2005, and based on this article:
http://aspalliance.com/1388_Database_Mirroring_in_Microsoft_SQL_Server_2005.all
The mirror database cannot be accessed directly; however snapshots of the mirror database can be taken for read only purposes. (Prerequisites no. 4)
So how can it be useful if I can't access it when the primary server is down?
I've been thinking about creating a regular backup on the primary server and restoring it on the mirror server on an hourly basis, but that's quite inefficient (slow), especially if I want an exact copy (since hundreds of records are added every minute).
Any other suggestion?
EDIT:
Maybe what I meant was replication, not mirroring (thanks JP for commenting)
They are referring to the fact that you can't perform queries on the mirrored copy, but you can get around that limitation by creating a snapshot of the mirrored database. This is often done to create a read-only database copy for reporting uses. You would have full access to the mirror if the primary were to fail, but it will not fail over automatically.
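For what it's worth, such a reporting snapshot is created with plain T-SQL on the mirror server (a sketch; the database, logical file name and path are placeholders and must match your own files):

    -- the snapshot is read-only and can be queried for reporting
    CREATE DATABASE YourDb_Reporting_Snapshot
    ON ( NAME = YourDb_Data,    -- logical data file name of the mirrored database
         FILENAME = 'D:\Snapshots\YourDb_Reporting.ss' )
    AS SNAPSHOT OF YourDb;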
Log shipping is another option, which allows you to query (read-only) the standby database without having to create a snapshot.
If I understand your question correctly, you shouldn't have to do that. There are several role-switching methods you can use to have your mirror take over as primary. You don't change the IP address at the application level; the cluster itself has a virtual IP address that allows access to the data at any given time (given a reasonable amount of time for the switch-over to the mirror after a primary failure). The mirror stays in sync by itself. :) There are good articles here and here on clustering.
Edit: Okay, based on the comments, check out the various options for replication.
Your confusion is common - there's a lot of ways to do disaster recovery planning with SQL Server. I've recorded a 10-minute video tutorial of SQL Server disaster recovery options including log shipping, mirroring, replication and more. If you like that one, we've got a longer one at Quest called Disaster Recovery Techniques but that one requires registration.
Instead of investigating a specific technology here, what you might want to do is tell us what your needs are, and then we can help you find out what option is right for you. The videos will give you an idea of what kinds of information you need to know before selecting a particular solution.
When using only two SQL Servers, you need to do the fail-over manually. The 'backup' database will be usable after you do two things (see the sketch below the list):
Disable mirroring on it
Restore the database WITH RECOVERY (but without a backup file; this will make the database usable).
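A minimal sketch of those two steps on the mirror server (the database name is a placeholder; which variant applies depends on whether the mirroring session can still be ended cleanly):

    -- if the principal is gone, force service on the mirror (may lose unsent transactions)
    ALTER DATABASE [YourDb] SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;

    -- alternatively, remove mirroring and bring the database online yourself
    ALTER DATABASE [YourDb] SET PARTNER OFF;
    RESTORE DATABASE [YourDb] WITH RECOVERY;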
Therefore mirroring in this manner does make sense; however, it is hard to maintain:
Moving back from the backup database to the primary is a 'pain', as you have to set up the complete mirroring again using a backup of the redundant server. This is needed to get the primary back up to speed.
My recommendation would be to get a third SQL Server into the picture that can act as a witness. The witness will monitor the status of the mirrored databases. Your bonus: you will get automatic fail-over, and will not have the fail-over (and post-fail-over) issues.
If I remember correctly, the witness server can run SQL Server Express, so there is no need for the Enterprise version on all three - just on the two where the actual mirroring will take place.
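Adding the witness is a single statement run on the principal once the witness instance has a mirroring endpoint (the server name and port below are placeholders):

    -- run on the principal after the mirroring session is established
    ALTER DATABASE [YourDb] SET WITNESS = 'TCP://witness.yourdomain.local:5022';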
Let me know if you need Transact SQL for the commands to fail-over and 'anti-fail-over' in a two server scenario, and I can dig them up.

Continuous database backups?

I have the following scenario:
Our system is running a SQL Server Express 2005 database locally (on each user's desktop, if you will). The system is storing a lot of production data from a machine. There are high demands on the safety of the data, and doing a backup each night, or even each hour, is not enough. We need a backup strategy that will ensure almost instantaneous/continuous backup of the database.
Is there anyone out there that has successfully implemented a system similar to this, and/or has got some ideas of how to accomplish it? The only thing I can think of right now is to have mirrored drives (raid) to hold the data, but that would be complicated and expensive.
I would appreciate any and all thoughts on this, since it is a real issue for me and my company. Thanks in advance!
Update:
I was not clear enough in my description of the scenario. The system is storing data in a vehicle that has no connection to anything. A centralized database is therefore not possible. Neither can we use a Standard/Enterprise edition of SQL Server, since it would be too expensive (each vehicle would need a license). Thanks for your input!
Switch your database to the FULL recovery model. Do a full backup every night and a differential backup after major user actions. The differential backups can be written to flash memory or a different hard drive, and all data can be synchronized with a server when online.
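A minimal sketch of that setup in T-SQL (the database name and file paths are placeholders; SQL Server Express supports these commands, it just lacks an Agent to schedule them):

    ALTER DATABASE [ProdDb] SET RECOVERY FULL;

    -- nightly full backup
    BACKUP DATABASE [ProdDb] TO DISK = 'E:\Backups\ProdDb_full.bak' WITH INIT;

    -- after major user actions: differential and log backups to a separate drive
    BACKUP DATABASE [ProdDb] TO DISK = 'E:\Backups\ProdDb_diff.bak' WITH DIFFERENTIAL, INIT;
    BACKUP LOG [ProdDb] TO DISK = 'E:\Backups\ProdDb_log.trn' WITH INIT;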
Another simple way is to log all user changes and important data to a text file stored on a separate drive. If the SQL database crashes, the user or another operator can replay the steps to restore the data.
One way I've seen this done is by using DoubleTake.
I will assume that a central database on a server is not feasible because your systems are running standalone and are not connected to anything. So this is what I would do
Set up RAID on the computer. This insures you against simple disk failure.
Any SQL Server database can be recovered to the point of the last committed transaction if you have a full database backup and a set of transaction logs available. Basically you simply restore the last full backup and then apply the transaction logs going forward. See these links.
http://www.enterpriseitplanet.com/storage/features/article.php/11318_3776361_3
https://web.archive.org/web/1/http://blogs.techrepublic%2ecom%2ecom/datacenter/?p=132
So what you need to do is set up periodic full database backups plus more frequent transaction log backups (and ensure that your transaction log can never run out of space).
In the event of failure you restore the last full backup, then apply the transaction logs going forward.
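The restore sequence itself looks roughly like this (a sketch; file names are placeholders, and every backup except the last is restored WITH NORECOVERY):

    RESTORE DATABASE [ProdDb] FROM DISK = 'E:\Backups\ProdDb_full.bak' WITH NORECOVERY;
    RESTORE LOG [ProdDb] FROM DISK = 'E:\Backups\ProdDb_log1.trn' WITH NORECOVERY;
    RESTORE LOG [ProdDb] FROM DISK = 'E:\Backups\ProdDb_log2.trn' WITH NORECOVERY;
    -- ...apply the remaining log backups in order, then bring the database online
    RESTORE DATABASE [ProdDb] WITH RECOVERY;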
Myself, if these are critical systems, I would be inclined to add an additional drive to the system and make sure the backups are copied over to it. This is because, as good as RAID is, it does sometimes have issues - RAID controllers fail, disks get wiped accidentally in parallel, disk failures go unnoticed so you're just running on one disk, etc. If you ensure backups are copied to a separate disk, then you can always recover to the last transaction log backup. You should also ensure tape backups of course, but they are generally a last resort in the event of trouble.
If for some reason you cannot set up RAID, then you should still install a second disk, but place the database file on one drive and the transaction log on the other, and copy backups to both disks. In the event of a failure of the C drive, or some other software issue crashing the database, you can still recover to the last committed transaction. A failure of the D drive limits you to the last transaction log backup. (Oracle used to allow you to mirror the transaction log from the database, which again would completely cover you, but I don't think this facility exists in SQL Server.)
If you are looking for a scheduler for SQL Server Express (which doesn't come with one) then I've been using SQLScheduler quite happily without problems, and it's free.
The most obvious answer would be to ditch SQL Server Express running locally and use a single source for your data (such as a standard SQL server install on a central storage location). Unless your system requires individual back ups of every single person's own individual instance of SQL Server Express.
If your requirements are so stringent as to call for instantaneous backups on every operation, you should definitely think about a different method of storage than local instances of SQL Server Express.
Wouldn't it be easier to just use one centralized SQL Server and back that up every hour or so? If you truly need instantaneous backup, your company (which seems reluctant to spend money, hence Express on each machine) will need to spring for two servers and two SQL Server Enterprise licenses to implement mirroring.
RAID isn't that expensive, but it is also not the best option. If you really want highly available data, you should upgrade to SQL Server Standard on a remote server that each user connects to, and use transaction-based replication to a SQL Server (Express) instance on another machine. RAID doesn't always protect you from data loss. If the data is that important to you, then the cost should not be that much of an issue.
Update in response to the question update.
If you can't use remote servers, then there are a couple of options:
You write a trigger which initiates a backup script on each insert or update and stores the result on a separate hard drive.
You use RAID. But beware that if the RAID controller fails, you still have a problem.
RAID is not expensive. Use RAID to protect against hard drive failure. You also need monitoring though. No point in having this if you let both drives fail.
Also, implement hourly incremental backups, then daily incremental backups and finally weekly full backups.
You need all of these strategies working together because they protect against different things. RAID does not protect against human or coding errors destroying data. Hourly and weekly backups don't protect against hard drive failure.

SQL Server 2005 Backup strategy

I manage a web application for a client with the following specs:
ASP.net 3.5 running on a Virtual Windows 2003 Web Server
SQL Server Standard hosting the database
Database current size of 6Gb, with 1Gb/month growth rate
One single table is responsible for 98% of the size, holds the most critical data for the client
Log is not kept for this big table; only SELECTs are run against it
50 GB FTP space available for backup
Considering this scenario, what would be the best strategy for a SQL Backup and what tool would be best suited for this task (commercial applications included, client can pay for the license fee)?
Here is the strategy we use for CodePlex.com:
All SQL servers run with a peer server using SQL mirroring
Weekly full backup (stored on separate drive from databases)
Daily differential backup (stored on separate drive from databases)
Transaction log backup every 5 minutes (stored on separate drive from databases)
Daily tape backup
Tape backups taken offsite weekly
Also, very important: test your backups! Studies have shown that over 30% of untested backup procedures are flawed. Here is our backup testing strategy:
Every 30 minutes verify the full backup file exists (using scheduled task)
Every 30 minutes verify the differential backup file exists (using scheduled task)
Every 30 minutes verify the transaction log backup file exists (using scheduled task)
Every 30 minutes verify the database mirroring is configured (using scheduled task)
Every day, do a test restore of the full+differential backup and report the table row counts (using scheduled task)
Once a month do a test restore of the most recent tape backup and verify the data
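On top of that, a lightweight automated check could verify that a backup file is readable and compare a row count after the scripted test restore (a hedged sketch; the file name, restored database name and table are placeholders):

    -- confirm the backup media is readable without doing a full restore
    RESTORE VERIFYONLY FROM DISK = 'E:\Backups\ProdDb_full.bak';

    -- after the scheduled test restore, report a row count to compare against production
    SELECT COUNT_BIG(*) AS row_count FROM [ProdDb_TestRestore].dbo.Orders;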
It depends on how critical the data is. Here, however, is how I'd do it.
1. Run a full backup every day.
2. Run a differential backup every 4 hours.
3. Run a transactional log backup every 15 minutes
4. Keep a copy on site and move a copy off site as soon as the backup is done.
The database is not too big, and this is easily doable.
Use a third party tool like Redgate SQL Backup and it will automatically compress and encrypt the database backup for you. I have used it extensively and am a big fan.
Additionally, if you have another site available, and the data is very critical, you might want to think about setting up log shipping as well.
This is a VPC? Can you install apps?
http://www.jungledisk.com/
That's what we use - make a SQL job that pushes out a backup every day, then use that service to push a copy up to Amazon's S3 service. If not, maybe you could have a local app that pulls the backup to a machine and then pushes it with the S3 web service, or still using Jungle Disk.
This is important! If your app goes down it hurts! Also make sure you back up your deployed app and the resources stored there... i.e. content uploaded to your app's storage directory.
I was going to type in my answer to your question, but I realized there are lots of far better resources out there, like this article on SQLServerCentral.com. You can also find lots of "Best Practices on Backup" articles like this one.
You might also want to take into consideration how much data you can afford to lose and how long it will take you to restore the database. Your client may decide that they never want to lose more than 15 minutes of data, or they may decide that losing up to a day's worth of data is okay with them.
