I would like to know the best way to make a copy of an on-premises SQL Server 2008 (not R2) database in SQL Azure and keep the two synchronized.
Think of the SQL Azure copy as a failover kind of structure...
Notes:
The database runs fine in SQL Azure
I have already figured out how to get the rest of the app running on Azure
Please consider suggestions of the type "Upgrade to SQL Server 2012 because of X" if the gains (reliability, efficiency, time to replicate, etc.) are worth it
I'm looking for near-instant replication (as fast as possible)
Yes, it will have to sync back eventually. If the on-premises deployment crashes and the cloud copy gets activated and changed, syncing back will be necessary, but I think it does not need to be automatic... if it is, even better!
The database consists of 900+ tables (legacy system)
http://www.windowsazure.com/en-us/manage/services/sql-databases/getting-started-w-sql-data-sync/
http://msdn.microsoft.com/en-us/library/hh456371.aspx
I think your best bet is SQL Data Sync. It gives you bidirectional sync, and we currently use it to sync data between data centres around the world and one local on-premises database. It only gives you a 5-minute sync interval, but that will probably do; otherwise the next best option is to use SQL Server VMs and do it the old-fashioned way. We have found SQL Azure Data Sync to be reasonably reliable and have been running it for a good six months, syncing across four databases in four data centres in Azure.
Some problems with it, though:
It uses triggers.
It will obviously add load and connections to your current SQL database.
The new control panel in Azure is a nightmare for it, so I would use the old panel for the moment.
It was in preview last time I looked, so it might not be 100% suitable for you.
I would imagine there are better third-party solutions out there, but off the shelf and in Azure, SQL Data Sync is well worth a look for the situation you are describing.
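If you want to see what Data Sync provisioning actually adds to a database, a query along these lines shows the extra objects. This is only a sketch: I believe the service puts its tracking tables and triggers in a DataSync schema with "dss" in the names, but the exact naming may differ by version, so treat the filters as guesses.

    -- Sketch: list objects that Data Sync provisioning may have added.
    -- The 'DataSync' schema and '%dss%' name patterns are assumptions; adjust as needed.
    SELECT t.name AS trigger_name, OBJECT_NAME(t.parent_id) AS table_name
    FROM sys.triggers AS t
    WHERE t.name LIKE '%dss%';

    SELECT s.name AS schema_name, o.name AS object_name, o.type_desc
    FROM sys.objects AS o
    JOIN sys.schemas AS s ON s.schema_id = o.schema_id
    WHERE s.name = 'DataSync';

Running this before and after provisioning also gives you a feel for the extra load the triggers will put on busy tables.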
I'm completely stymied. Let me describe my situation.
We're a relatively small company and the vast majority of our operational data is contained in a vendor database. Our vendor offers a Data Warehousing service. They've taken all of our data and applied some OLAP-ish modeling to it. Each day, they place either a .bak or a .diff file (.bak once a week, .diff every other day) on an FTP endpoint that we pay to access. Currently, we use a PowerShell script to download this data to a server that we've got sitting at a local server farm, where we then use SQL Server to "rehydrate it" by restoring from it.
That's all fine and good, but we really want to move as many of our workloads into the cloud as possible (we use Azure). As far as I can tell, SQL Managed Instances are the only way we can restore from a .bak file in the cloud. This is waaaay more expensive than we need, and we really don't need the managed instance platform at all except to restore from this file.
Basically, everything about this current process is diametrically opposed to us moving it to the cloud, unless we want to pay even more than we are to rent out this server farm.
I'm trying to lobby them for a different method of getting their data, but I'm having trouble coming up with a method to propose. We need to, every day, transfer a ~40gb database from SQL Server (at our vendor) to Azure SQL (in our cloud). What's the least-intrusive way we could do this?
We are glad that you chose SQL Server on an Azure VM as the solution. Thanks for the suggestions from Alex and Davaid too:
I've actually seen all of those resources already. The biggest obstacle here is that the entire process has to be automated end-to-end, which makes bacpac restores more difficult (they'd have to write some sort of .NET app to back up to bacpac). I think SQL on Azure VM is the only real option, so I may have to look at cost for that.
If others face the same scenario, they can reference this. It may also be beneficial to other community members.
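For completeness, the "rehydrate" step on a SQL Server VM is just a plain restore of the downloaded files; a minimal sketch, with hypothetical database, logical file, and path names:

    -- Restore the weekly full backup, leaving the database ready for the differential.
    -- VendorDb, the logical file names, and the paths are placeholders.
    RESTORE DATABASE VendorDb
    FROM DISK = N'D:\Backups\vendor_full.bak'
    WITH MOVE 'VendorDb_Data' TO N'F:\Data\VendorDb.mdf',
         MOVE 'VendorDb_Log'  TO N'G:\Log\VendorDb.ldf',
         NORECOVERY, REPLACE, STATS = 10;

    -- Apply the most recent differential and bring the database online.
    RESTORE DATABASE VendorDb
    FROM DISK = N'D:\Backups\vendor_diff.bak'
    WITH RECOVERY, STATS = 10;

The download itself can stay in the existing PowerShell script; only the target of the restore changes.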
SQL Azure has a database size limit of 150 GB. I have read through their documentation several times and also searched online, but I'm unclear about this: does using federations allow a developer to grow beyond a 150 GB database? For example, can I have several 150 GB federation members?
If not, how can I handle a database larger than 150 GB on Windows Azure?
Basically, how do I scale out beyond 150 GB on Windows Azure?
If there's no other way, is RDS a good alternative? (Please share any other alternatives.)
Currently it is not possible to have a single database larger than 150 GB.
The only approach is to either split the data into multiple databases (one account can have up to 149 user databases plus the master DB) or use SQL Azure Federations. Currently, if I am not mistaken, the total number of federations supported is Int16.MaxValue - 1. Each federation member is actually a separate database, transparent to the developer, which can be up to 150 GB.
However, SQL Azure Federations has its own pros and cons, along with some data access layer re-factoring. If you are interested you may check out these cool videos on SQL Azure Federations:
Building Scalable Apps with SQL Azure
Using SQL Azure Database Federations
UPDATE
I will not completely agree with #ryancrawcour. What he explains is just the tip of the iceberg; most of it lies below the water. The amount of re-factoring required really depends on how data is consumed by the application. I will just mention a few factors to consider (which are by no means the complete picture). Consider any of the following:
Data that is common to all federations (how do you get this data?)
A stored proc that post-processes data - you have to iterate over each and every federation member and execute the stored proc there. There is no way to execute the stored proc once and process data in all the federations.
Aggregating data that is spread across more than one federation member
Listing data from more than one federation member
These are just a few operations that you will need to consider, and they take more than "just change the connection string and execute one USE FEDERATION ..." before each query. Actually, with SQL Azure Federations you don't need to change the connection string at all; it is the same SQL Azure connection string. The "USE FEDERATION ..." statement is what you have to execute before each query, but it is far from the only thing. And what if one is using Entity Framework (model first, code first, or whatever)? Things get even more complicated and need a real understanding of SQL Azure Federations.
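For illustration, the per-query routing looks roughly like this; CustomerFederation, cust_id, and the table names are hypothetical, and the metadata view name is from memory, so verify it against the Federations documentation:

    -- Route the connection to the member holding cust_id = 12345 before querying.
    USE FEDERATION CustomerFederation (cust_id = 12345) WITH RESET, FILTERING = ON
    GO
    SELECT OrderId, OrderDate
    FROM dbo.Orders;    -- sees only this member (and, with FILTERING = ON, only this cust_id)
    GO

    -- To run something (e.g. a stored proc) against every member, enumerate the members
    -- from the federation root and connect to each one in turn from your application code.
    USE FEDERATION ROOT WITH RESET
    GO
    SELECT member_id, range_low, range_high
    FROM sys.federation_member_distributions;
    GO

Note that USE FEDERATION has to be issued on the same connection as the queries that follow it, which is exactly the kind of detail that breaks naive data access layers.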
I would say that SQL Azure Federations is a different way of thinking about data, about modelling and normalizing.
UPDATE 2 - new Database sizes announced by Microsoft
As of 3 April 2014 the maximum size for a single database has been increased to 500 GB. The only available information to date is here. Be aware that the management portal still doesn't show this option (as of today: 4 April 2014, 15:00 GMT+0:00).
I was looking for these same answers a while ago. In addition to the answers Anton provided (which are very accurate), I found that you can make your WAVM with a SQL Server installation redundant through load balancing and mirroring.
The advantage of WASD is that everything is automated. E.g. when your WAVM instance is taken out of the rotation of the load balancer, you'll need to bring a new one up yourself; WASD takes care of all of this for you.
With WASD Federations you're able to scale to 75 TB of data (if I remember correctly), while with a WAVM with SQL Server you can scale to 16 TB tops.
Also, with WASD Federations you can divide the SQL workloads more granularly.
Regards,
Patriek
There is also the new Azure feature of persistent VMs (currently in preview), which will allow you to migrate your on-premises applications to the cloud with minimal changes.
Further reading: Infrastructure as a Service Series: Running SQL Server in a Windows Azure Virtual Machine
This guide might be helpful as well.
Edit
Here is a comparison with SQL Azure
While considering your scale options, be aware that, as of April 3, 2014, Microsoft announced upcoming changes to SQL Premium, including the ability to scale each SQL Database instance to 500 GB (along with geo-replication, self-service restore, and a higher uptime SLA). No date has been announced yet, but you can read about the announcement details here.
There is now a 1 terabyte tier available - see https://azure.microsoft.com/en-us/pricing/details/sql-database/ and look at the Premium level.
I have just lost pretty much a day and a half trying to get pull replication going for an off-site server. After that painful experience I am now thinking it shouldn't be this hard, so maybe I am doing it wrong. I never did get it to work; I had to go to push replication.
Here is the situation. We have a virtual server being hosted off site that will host a database for a public web application. We want to push all the data from a few of the tables in our internal database to this off-site location, and it has to be done almost instantaneously so that the web information is current. We don't want to set up a VPN, because if that machine gets compromised we don't want that vulnerability.
If SQL Server replication is not the best method, how would you do it?
FYI: Publisher = SQL Server 2005 & Subscriber = SQL Server 2008 Web Edition
Well, if you want it to be fast and easy to manage, one solution is to set up a merge replication topology, with your main server as the publisher and the hosted server as a subscriber. Replication can then be done over HTTP, thus without a VPN.
Be careful: web replication is not as straightforward as opening a page in your browser! You can find some interesting info here
I do not have SQL Studio on this machine, but I guess you can parameterize your subscription in such a way that only downloads are replicated to the subscriber, while uploads are ignored.
By running the replication script (it's a BAT file) from the subscriber every minute (through any scheduled task manager), you can have a quasi-instantaneous update of your subscriber's tables.
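If memory serves, the "download only" behaviour is set per article when the publication is defined; something along these lines, where the publication, table, and owner names are placeholders and the @subscriber_upload_options value is from memory, so double-check it in Books Online:

    -- Sketch, run at the publisher: mark an article as download-only so changes
    -- flow only publisher -> subscriber. All names are hypothetical.
    EXEC sp_addmergearticle
        @publication = N'WebPublication',
        @article = N'Products',
        @source_owner = N'dbo',
        @source_object = N'Products',
        @subscriber_upload_options = 2;   -- 2 = changes not allowed at the subscriber, if I recall correctly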
Ok, the scenario is... two servers, on completely different parts of the internet.
The SQL 2008 database just needs to get data updates and schema changes; it doesn't need to send anything to the 2005 database. Basically, just pull data and schema as efficiently as possible, automatically, as a scheduled task.
The database is quite huge... but the changes per day are probably around 20-30 megabytes of data.
I can't run any of the inbuilt replication on the 2005 database.
I've had a wee look at the Sync Framework; I think it might do what I want, but it seems a bit painful and requires a bit of work to get going. I'm wondering if there is tooling out there to make this easier?
Or... I'm not quite sure what my options are.
I can't run any of the inbuilt replication on the 2005 database.
Any reason for this restriction? Replication is the way to solve your problem. Without a replication infrastructure you simply won't be able to detect data changes, nor schema changes. There are only two ways to detect changes: either via triggers and tracking tables (and that is Merge Replication) or via the database log (and that is Transactional Replication).
Sync Framework itself, if it were used, would require either Change Tracking or Change Data Capture. But these are 2008-specific technologies and they're really nothing but replication in disguise (they use the very same infrastructure used by Merge and Transactional Replication, respectively).
Even if you want to roll your own, you'll find out quickly that shipping the changes over is the trivial part, e.g. using Service Broker for reliable delivery semantics. The really hard problem is detecting the changes. Diffing a 'quite huge' database over the internet to detect changes is just not going to work. So relying on the built-in infrastructure to detect changes, namely the two forms of replication, is the obvious solution.
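For reference, this is roughly what the Change Tracking mentioned above looks like; the database and table names are placeholders:

    -- Enable change tracking on the database and on a table (SQL Server 2008+).
    ALTER DATABASE SourceDb
    SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

    ALTER TABLE dbo.Orders ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);

    -- Later, ask for everything that changed since the version you last synced.
    DECLARE @last_sync_version bigint = 0;   -- persist this value between runs
    SELECT ct.SYS_CHANGE_OPERATION, ct.OrderId
    FROM CHANGETABLE(CHANGES dbo.Orders, @last_sync_version) AS ct;

The catch, as noted above, is that this only exists on 2008 and later, so it cannot be used to pull changes out of the 2005 source.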
Could you automate RedGate's SQL Compare and/or SQL Data Compare? http://www.red-gate.com/products/SQL_Compare/index.htm ... you could at least try that out with the 14-day trial and see if it is worth the investment. Much cheaper than tooling it yourself, IMHO.
Maybe these questions will help you:
Microsoft Sync Framework Or Replication
SQL Server Data Archive Solution
Is there a way to replicate some data not all data in db by sql server replication?
You can also write an application that generates a script from your changed data at whatever interval you like, and then runs that script on your target server.
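A very rough sketch of that idea, assuming the tables carry a LastModified column you can filter on (all names here are hypothetical):

    -- Pick up rows changed in the last hour; the application would turn this result set
    -- into INSERT/UPDATE statements (or a bulk load) to run against the target server.
    DECLARE @since datetime = DATEADD(HOUR, -1, GETDATE());

    SELECT OrderId, CustomerId, Amount, LastModified
    FROM dbo.Orders
    WHERE LastModified >= @since;

Without such a column (or the change-detection infrastructure discussed in the other answer) there is no cheap way to know which rows changed.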
I have two database servers running SQL Server 2005 Enterprise that I want to make one of them as mirror database server.
What I need is to create an exact copy of the database from the primary server on the mirror server, so that when the primary server goes down, we can switch the database IP in the application to use the mirror server.
I have examined the "mirror" feature of SQL Server 2005, and based on this article:
http://aspalliance.com/1388_Database_Mirroring_in_Microsoft_SQL_Server_2005.all
The mirror database cannot be accessed directly; however snapshots of the mirror database can be taken for read only purposes. (Prerequisites no. 4)
So how can it be useful if I can't access it when the primary server is down?
I've been thinking about creating a regular backup on the primary server and restoring it on the mirror server on an hourly basis, but that's quite inefficient (slow), especially if I want an exact copy (since hundreds of records are added every minute).
Any other suggestion?
EDIT:
Maybe what I meant was replication, not mirroring (thanks JP for commenting)
They are referring to the fact that you can't perform queries on the mirrored copy, but you can get around that limitation by creating a snapshot of the mirrored database. This is often done to create a read-only database copy for reporting purposes. You would have full access to the mirror if the primary were to fail, but it will not fail over automatically.
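Creating such a reporting snapshot on the mirror looks roughly like this; the database name, logical file name, and path are hypothetical placeholders:

    -- Create a read-only snapshot of the mirrored database for reporting.
    CREATE DATABASE SalesDb_Report_Snapshot
    ON ( NAME = SalesDb_Data, FILENAME = N'E:\Snapshots\SalesDb_Report.ss' )
    AS SNAPSHOT OF SalesDb;

Reporting queries then run against SalesDb_Report_Snapshot rather than the mirror itself, and you drop and recreate the snapshot whenever you need fresher data.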
Log shipping is another option, which allows you to query (read-only) the standby database without having to create a snapshot.
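The restore side of log shipping with a readable standby is sketched below; names and paths are placeholders, and in practice the log shipping jobs or wizard do this for you:

    -- Initialize the standby from a full backup, keeping it readable between restores.
    RESTORE DATABASE SalesDb
    FROM DISK = N'E:\LogShip\SalesDb_full.bak'
    WITH STANDBY = N'E:\LogShip\SalesDb_undo.dat', STATS = 10;

    -- Apply each shipped log backup the same way; the database stays queryable (read-only).
    RESTORE LOG SalesDb
    FROM DISK = N'E:\LogShip\SalesDb_001.trn'
    WITH STANDBY = N'E:\LogShip\SalesDb_undo.dat';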
If I understand your question correctly, you shouldn't have to do that. There are several forms of role switching you can use to have your mirror take over as the primary. You don't change the IP address at the application level; the cluster itself has a virtual IP address that allows access to the data at any given time (given a reasonable amount of time for the switch-over to the mirror after a primary failure). The mirror stays in sync by itself. :) There are good articles here and here on clustering.
Edit: Okay, based on the comments, check out the various options for replication.
Your confusion is common - there are a lot of ways to do disaster recovery planning with SQL Server. I've recorded a 10-minute video tutorial of SQL Server disaster recovery options including log shipping, mirroring, replication and more. If you like that one, we've got a longer one at Quest called Disaster Recovery Techniques, but that one requires registration.
Instead of investigating a specific technology here, what you might want to do is tell us what your needs are, and then we can help you find out what option is right for you. The videos will give you an idea of what kinds of information you need to know before selecting a particular solution.
When using only two SQL Servers, you need to do the fail-over manually. The 'backup' database will be usable after you do two things:
Disable mirroring on it
Restore the database WITH RECOVERY, but without a backup file; this will make the database usable (see the T-SQL sketch below).
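A minimal sketch of those two steps; SalesDb is a hypothetical database name:

    -- Run on the mirror/backup server once the principal is gone.
    ALTER DATABASE SalesDb SET PARTNER OFF;      -- step 1: remove the mirroring partnership
    RESTORE DATABASE SalesDb WITH RECOVERY;      -- step 2: bring the database online, no backup file needed

    -- Alternatively, if the principal is merely unreachable and some data loss is acceptable,
    -- a forced failover does it in one step:
    -- ALTER DATABASE SalesDb SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;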
Therefore mirroring in this manner does make sense; however, it is hard to maintain:
Moving back from the backup database to the primary is a pain, as you have to set up the complete mirroring again using a backup of the redundant server. This is needed to get the primary back up to speed.
My recommendation would be to get a third SQL Server into the picture that can act as a witness. The witness will monitor the status of the mirrored databases. Your bonus: you get automatic failover, and will not have the fail-over (and post-fail-over) issues.
If I remember correctly, the witness server can be running SQL Express, so there is no need for the Enterprise version on all three - just the two where the actual mirroring will take place.
Let me know if you need the Transact-SQL for the commands to fail over and 'anti-fail-over' in a two-server scenario, and I can dig them up.