I'm completely stymied. Let me describe my situation.
We're a relatively small company and the vast majority of our operational data is contained in a vendor database. Our vendor offers a data warehousing service: they've taken all of our data and applied some OLAP-ish modeling to it. Each day, they place either a .bak or a .diff file (a .bak once a week, a .diff on the other days) at an FTP endpoint that we pay to access. Currently, we use a PowerShell script to download this data to a server we've got sitting at a local server farm, where we then use SQL Server to "rehydrate" it by restoring from the file.
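For context, here's a rough sketch of what that script boils down to (server names, paths, and credentials here are made up; the weekly full restore is shown):

    # Rough sketch of the current job (names, paths, and credentials are hypothetical).
    $ftpUri    = "ftp://vendor.example.com/warehouse/latest.bak"
    $localPath = "D:\Restores\latest.bak"

    # Download the weekly full backup from the vendor's FTP endpoint.
    $client = New-Object System.Net.WebClient
    $client.Credentials = New-Object System.Net.NetworkCredential("ftpUser", "ftpPassword")
    $client.DownloadFile($ftpUri, $localPath)

    # "Rehydrate" by restoring over last week's copy.
    # (On .diff days we instead restore the differential on top of a full restored WITH NORECOVERY.)
    Invoke-Sqlcmd -ServerInstance "RESTORE-SRV01" -Query @"
    RESTORE DATABASE [VendorWarehouse]
    FROM DISK = N'$localPath'
    WITH REPLACE, STATS = 10;
    "@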
That's all fine and good, but we really want to move as many of our workloads into the cloud as possible (we use Azure). As far as I can tell, SQL Managed Instances are the only way we can restore from a .bak file in the cloud. This is waaaay more expensive than we need, and we really don't need the managed instance platform at all except to restore from this file.
Basically, everything about this current process is diametrically opposed to moving it to the cloud, unless we want to pay even more than we already do to rent space at this server farm.
I'm trying to lobby the vendor for a different method of delivering their data, but I'm having trouble coming up with a method to propose. Every day, we need to transfer a ~40 GB database from SQL Server (at our vendor) to Azure SQL (in our cloud). What's the least-intrusive way we could do this?
We are glad that you chose SQL Server on an Azure VM as the solution. Thanks for the suggestions from Alex and Davaid too:
I've actually seen all of those resources already. The biggest
obstacle here is that the entire process has to be automated
end-to-end, which makes bacpac restores more difficult (they'd have
to write some sort of .NET app to back up to bacpac). I think SQL on Azure VM is the only real option, so I may have
to look at cost for that.
If others face the same scenario, they can reference this; it may be beneficial to other community members as well.
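For anyone landing on this thread later: the bacpac export itself can also be scheduled with the SqlPackage command-line tool rather than a custom .NET app, if the vendor is willing to run it. A rough sketch (the install path and connection details are hypothetical):

    # Hypothetical scheduled export on the vendor side using the SqlPackage CLI.
    & "C:\Program Files\Microsoft SQL Server\160\DAC\bin\SqlPackage.exe" `
        /Action:Export `
        /SourceConnectionString:"Server=VENDOR-SQL01;Database=VendorWarehouse;Integrated Security=True;" `
        /TargetFile:"D:\Exports\VendorWarehouse.bacpac"

The resulting .bacpac could then be dropped at the same FTP endpoint and imported into Azure SQL Database on a schedule.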
Related
Is it possible to use CREATE DATABASE [] AS COPY OF [] to create a copy of a database hosted as an Azure SQL Database (PaaS) on IaaS (SQL Server on a VM)?
Can you recommend an alternative to Import/Export that can limit the downtime of such a transition?
The reason for this migration is the restriction on cross-database queries in PaaS mode, which complicates a one-time migration to the new database used by the newer version of the application.
The answer depends on whether you want to copy database schema, data, or both.
As Jaxidian said, the ApexSQL tools can do the job, but as far as I know Data Diff will only synchronize database data, while Diff will synchronize the schema.
Here is an article describing the process of copying database data:
https://solutioncenter.apexsql.com/how-to-automatically-synchronize-the-data-in-two-sql-server-databases-on-a-schedule/
If you want to copy both schema and data, the process is described here:
https://solutioncenter.apexsql.com/how-to-automatically-compare-and-synchronize-multiple-databases-on-different-sql-server-instances/
There are lots of tools available that can accomplish this; which one is best for you depends on your needs. However, the "Copy" feature in the Azure Portal will not accomplish this on its own, but it can be a partial solution in whatever approach you finalize on.
I'll make the following assumptions:
You have an always-on 24/7 production load, so there are no regular nightly/weekly/monthly maintenance windows
You can schedule a maintenance window but you wish to keep it as small as possible
You can easily change your applications' connection strings
Your database isn't huge. Gigabytes is fine.
Your database is mostly static data (i.e. an incremental approach is much faster than a dump-and-fill)
If I were to do this today/right now, my approach would be like this (this is only one option):
Use the Copy feature to make a copy of the database that I can use as a staging area/reference point while minimizing the load on the production database
Create a backup (bacpac file) from the copied database
Restore the bacpac file onto your IaaS-hosted SQL Server to form your base deployment
Start your maintenance window and effectively put your database into read-only mode so the data is now no longer changing (lots of strategies on how to do this whether you turn applications off, revoke permissions, etc.)
Use a tool such as ApexSQL Data Diff (Redgate and others have options) to compare data between the two databases and sync the data over to the new IaaS DB. Be careful - depending on your data needs you may have to tweak the generated scripts that sync the data.
Verify that the new DB is now indeed a duplicate copy of your old DB (ApexSQL Data Diff can also help with this - several options exist here)
Change connectionstrings on your apps to point to the new DB
Turn applications back on and end your maintenance window.
So of course, if you do something like this, practice it and test the results numerous times well before your maintenance window. Get a good idea of the timing for everything, especially how long it will take to generate and restore the bacpac file. You want to do that as late as possible before your maintenance window, so that the final "Data Diff" script you generate and run has as little to do as possible. The longer that script takes, the longer your outage will be.
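To make the copy/export/freeze portion of those steps concrete, here is a rough sketch using the current Az PowerShell cmdlets (all resource names, credentials, and URLs are hypothetical, and the READ_ONLY statement is just one of the freeze strategies mentioned above):

    # 1. Copy the production Azure SQL database to a staging copy (hypothetical names throughout).
    #    (The T-SQL equivalent, run against the server's master database, is:
    #     CREATE DATABASE [AppDb_Staging] AS COPY OF [AppDb];)
    New-AzSqlDatabaseCopy -ResourceGroupName "rg-prod" -ServerName "sql-prod" `
        -DatabaseName "AppDb" -CopyDatabaseName "AppDb_Staging"

    # 2. Export the copy to a bacpac in blob storage ($storageKey, $sqlAdmin, and
    #    $sqlAdminPassword - a SecureString - are assumed to be set already).
    New-AzSqlDatabaseExport -ResourceGroupName "rg-prod" -ServerName "sql-prod" `
        -DatabaseName "AppDb_Staging" `
        -StorageKeyType "StorageAccessKey" -StorageKey $storageKey `
        -StorageUri "https://mystorage.blob.core.windows.net/backups/AppDb_Staging.bacpac" `
        -AdministratorLogin $sqlAdmin -AdministratorLoginPassword $sqlAdminPassword

    # 3. At the start of the maintenance window, freeze writes on the source database
    #    (one possible "read-only mode" strategy; $plainPassword is the plain-text password).
    Invoke-Sqlcmd -ServerInstance "sql-prod.database.windows.net" -Database "master" `
        -Username $sqlAdmin -Password $plainPassword `
        -Query "ALTER DATABASE [AppDb] SET READ_ONLY;"

Restoring the bacpac onto the IaaS SQL Server and running the data diff are then the remaining steps above.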
I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - once per "job" setup, with ~400 jobs this quarter, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary LINKS to the remote tables and run "make table" queries to populate the local tables, then drop the temporary links. Works as expected.
Performance here in the US is decent - 10-15 seconds for ~40K records. Our India teams are seeing >5-10 minutes for the same datasets. Their internet connection is decent, not great, and a variable I cannot control.
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach: i.e., letting the server do all/most of the heavy lifting vs Access?
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have Access 2007, 2010, and 2013 users.
Thanks!
It's not entirely clear but if the MSAccess database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN, there is too much latency.
When Access queries data, it doesn't open a stream; it fetches a block, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances, especially when you consider that communication between the US and India probably has around 200ms of latency that you can't do much about. That adds up very quickly if the communication protocol is chatty, and it all sits on top of a connection whose bandwidth is very likely way below what you would get on a LAN.
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest approach is maybe to perform this dump automatically every day and make it available on an FTP server or an internal website, ready for download.
You could also make it available on demand, maybe through an app running on a server and exposed via RemoteApp using RDP services on a Windows 2008 server, or simply through a website or a shell.
You could also have a simple Windows service on your SQL Server that listens for requests from a remote client installed on the local machines everywhere; it would produce the dump and send it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
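To make the "dump locally, compress, transfer" idea concrete, a scheduled job on the fast side of the link might look roughly like this (table, server, and path names are hypothetical; it exports to flat files rather than an .mdb, in the spirit of the export-tool note above, with bcp as one tool that doesn't need Access installed):

    # Export the job's tables to flat files with bcp, then compress for transfer.
    $exportDir = "D:\Exports\Job1234"
    New-Item -ItemType Directory -Path $exportDir -Force | Out-Null

    foreach ($table in @("dbo.Orders", "dbo.OrderLines", "dbo.Customers")) {
        $outFile = Join-Path $exportDir (($table -replace '\.', '_') + ".tsv")
        & bcp.exe "SELECT * FROM ProductionDb.$table" queryout $outFile -c -S "SQL-PROD01" -T
    }

    # The zip is what gets published to the FTP server / internal site for the remote team.
    Compress-Archive -Path "$exportDir\*" -DestinationPath "D:\Exports\Job1234.zip" -Force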
Renaud and all, thanks for taking the time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks (vs a contiguous download) of data is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock: User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India also has a local copy of the same DB and grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial download of the local DB tables from the server (a huge multi-year DB) for the current "job" - it should happen just once at the start of the effort (a ~1 week long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamp-based push-pull of just the daily changes would be an overall plus. It seems to be, but the initial download hurdle remains.
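For anyone curious, the timestamp-based pull is basically just a high-water-mark query, however you end up scripting it - sketched here in PowerShell for brevity (table, column, and server names are hypothetical):

    # Pull only rows changed since the last successful sync (hypothetical schema).
    $lastSync  = Get-Content "C:\Sync\last_sync.txt"        # e.g. 2013-04-01T17:30:00
    $pullStart = Get-Date -Format "yyyy-MM-ddTHH:mm:ss"     # the new high-water mark

    $changed = Invoke-Sqlcmd -ServerInstance "SQL-PROD01" -Database "ProductionDb" `
        -Query "SELECT * FROM dbo.JobRecords WHERE LastModified > '$lastSync';"

    # ...merge $changed into the local copy, then record the new high-water mark.
    Set-Content "C:\Sync\last_sync.txt" $pullStart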
I would like to know the best way to make a copy of an on-premises SQL Server 2008 (not R2) database in SQL Azure and keep the copies synchronized.
Think of the SQL Azure copy as a failover kind of structure...
Notes:
The database runs fine in SQL Azure
I have already figured out how to get the rest of the app running on Azure
Suggestions of the type "Upgrade to SQL Server 2012 because of X" are welcome if the gains (reliability, efficiency, time to replicate, etc.) are worth it
I'm looking for instant replication (as fast as possible)
Yes, it will have to sync back eventually. If the on-premises deployment crashes and the cloud copy gets activated and changed, syncing back will be necessary, but I think it does not need to be automatic... if it is, even better!
The database consists of 900+ tables (legacy system)
http://www.windowsazure.com/en-us/manage/services/sql-databases/getting-started-w-sql-data-sync/
http://msdn.microsoft.com/en-us/library/hh456371.aspx
I think the best bet is to use SQL Data Sync. It should give you bidirectional sync, and we currently use it to sync data around the world across several datacenters plus one local on-premises database. It will only give you a 5-minute sync interval, but that will probably do; otherwise the next best option is to use SQL Server VMs and do it the old-fashioned way. We have found SQL Azure Data Sync to be reasonably reliable and have been running it for a good six months, syncing across four databases in four data centres in Azure.
Some problems with it, though:
It uses Triggers.
It will obviously add load and connections to your current SQL database.
The new control panel in Azure is a nightmare for it, so I would use the old panel for the moment.
It was in preview the last time I looked, so it might not be 100% suitable for you.
I would imagine there are better third-party solutions out there, but off the shelf and in Azure, SQL Data Sync is well worth a look for the situation you are describing.
So I'm thinking of dipping a toe in the Azure pool.
Our web app suite will soon be a pure ASP.NET + SQL Server affair.
For various reasons it will be simpler to initially create a SQL VM and run everything from there.
How hard will it be to ...
...migrate SQL off the VM and into either "Cloud Services" or "Data Management"?
...migrate the suite of WebApps off the VM and into "Websites"?
It is my understanding that having achieved this migration, the OS level updates will no longer be my concern as they will be handled by the service. Thus at this point I'll be able to throw the original VM away :)
This isn't exactly answering your questions, but it might help educate you on more questions to ask and giving you a boost out of the gate. These were all lessons learned before, during, or after our migration of our systems to Azure. Now that we're up there, we have a ~50GB database with ~6 services running across ~30 instances. As long as our database backup behaves, total amount of effort in upgrading all of this is less than an hour (and could be much less if we didn't have many safeguards meant to force us to be aware of what's going on during the migration process - we don't want it to be too easy to deploy just to protect us from ourselves).
Preparing to migrate your system to Azure:
If you're planning to go to Azure, you first need to make sure your architectures and technologies are compatible. This isn't to say you have to code everything specific to Azure. This means some of the following things:
You should realize that "high availability" does not mean "error-free". In fact, high-availability environments usually have more errors that you have to handle and manage. For example, if you have a request going over the network to a server that just had a motherboard fry and was taken offline, that network request will be unsuccessful. That's not typically a problem you code for in "standard" server apps. To take it even further, what if that failed network request was for a database connection that gets put back into a connection pool? That will cause the connection to be poisoned and broken the next time somebody pulls it out, regardless of the future state of the server that went poof! There are just some extra things to worry about here because you're no longer depending on just 1 network with 4 servers on it but are now depending on hundreds of networks with thousands of servers on them. That 0.05% error scenario will happen MUCH more often to you than you've ever experienced in the past, and you really have to be aware of this!
You should use dependency injection so you can easily change things around. With proper separation of concerns, changes that seem very difficult become very easy in Azure.
You should use architectures designed for high availability. For example, a web application that would break when run in a web farm would also break in Azure, but a web application designed to work in a web farm is very easy to run in Azure.
You should have automated deployments and configuration transforms for all of your applications. Anything else is just unsustainable unless it's nothing more than one little web site or something like that.
Depending on your needs, you can do it in phases. If database latency isn't a big deal, perhaps a hybrid approach (over a VPN from Azure to your data center) is acceptable to get your apps into Azure first while you migrate your database later. Or perhaps the opposite. What we did was keep the primary apps and database in our data center but put secondary apps up in Azure first. Then we moved some primary apps (which took a performance hit for a month), and later our database and critical apps. That final migration sure made for a very long weekend and not much sleep, but it is SOO much nicer now that we're done!
Migrating your applications to Azure:
This ultimately depends very heavily on what your application is or does, and every scenario has different steps/issues/benefits. I'm not going to cover this deeply other than to say, "Use Google, it's your friend!". Beyond that, for us, getting our applications up into Azure was the largest payoff when compared to our data. The ROI on our app migration was less than a month between hosting costs, licensing, and management effort. Instead of taking a couple of days to set up a server, I can now take a day to set up an entirely new and duplicated environment of all of our SaaS applications/databases/etc. and have them running on ~25 different Cloud instances!
Instead of trying to tell you how to migrate these, let me give you a few words of caution so you know sooner rather than later:
If you have app problems in Windows 2012, humor me and try it in Windows 2008 R2. There are a couple bugs in some of the 2012 images that they've prepared. It's incredibly trivial to switch back and forth!
Go make your logging 1000x better than it is now. If you don't do that now, you'll regret it.
Don't depend 100% on the easy-to-implement "Azure Logging". It works well enough but it more-or-less requires your applications to start successfully and is absolutely useless in debugging startup problems. If you don't have an alternative, then you will waste many, many hours just debugging stupid little problems when your app starts up. By the time you're done with it, you could have easily added 5 other logging frameworks and had an amazingly awesome logging system in place plus a running app instead of nothing but a running app to show for the same amount of time. Really, do this! (Profiling is a good idea as well, although Mini-Profiler has load-balancing issues if you have multiple instances.)
If you add new endpoints to your deployment (ports, etc), you cannot simply "Upgrade" an existing deployment. You must delete it (the deployment, not the service) and install from scratch. You can delete the Staging one, deploy to Staging, then swap.
If you have WCF apps, pretend you don't know about Windows Activation Services. They're disabled in Azure by default. You can either hack them to turn them on (startup scripts) or create your own self-hosting application. We self-host so we can more easily tweak service configurations once we're deployed (it's not easy to edit web.config files in a way that sticks in Azure). Web services work in "IIS" in Azure but TCP, named pipes, etc. do not.
Go learn about and add the Transient Errors Application Block (or something equivalent) to anything you communicate with; a bare-bones sketch of the retry idea follows this list. If you don't do it now, you'll regret it.
Go make your logging better! Really, really, REALLY do this!
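Since the transient-errors point bites almost everyone, here's the shape of the retry idea in a few lines of PowerShell (the real Transient Fault Handling block also inspects error codes to decide what is actually transient; all names below are hypothetical):

    # Minimal retry-with-back-off wrapper; not the Application Block itself, just the idea.
    function Invoke-WithRetry {
        param(
            [scriptblock]$Action,
            [int]$MaxAttempts = 4,
            [int]$DelaySeconds = 2
        )
        for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
            try {
                return & $Action
            }
            catch {
                if ($attempt -eq $MaxAttempts) { throw }        # give up after the last attempt
                Start-Sleep -Seconds ($DelaySeconds * $attempt)  # simple linear back-off
            }
        }
    }

    # Example: a query against an Azure SQL database that may hit a transient failure.
    Invoke-WithRetry -Action {
        Invoke-Sqlcmd -ServerInstance "myserver.database.windows.net" -Database "AppDb" `
            -Username "appuser" -Password "not-a-real-password" `
            -Query "SELECT COUNT(*) FROM dbo.Orders;"
    }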
Migrating your SQL Server Database to Azure:
Getting your database up into Azure is a bit of a painful process. There isn't a quick and easy way to just get it up there and making it work. Some people have to make some major changes while others just have to tweak a few ignorable things here and there. However, no matter how large or small your database is, you really REALLY must devote a lot of time to testing it. Test your migration process. Test your scripts to prepare your database. Test the performance and stability of the database up in the cloud. Test your backup procedures. Test your upgrade procedures. Test your backup restoration procedures. Test ALL of this because I guarantee you that you will find some surprises!
Schema:
Go learn about all of the limitations of SQL Azure. No heaps, etc. Learn them before you start! Go learn them now! They're all mostly-to-very reasonable.
Be aware of the 2GB T-Log limitation! This means some very large indexes can never be rebuilt! (that said, our 30GB table isn't yet hitting this)
To deploy your schema, go into SSMS for your local DB and use the "Tasks -> Extract Data-tier Application..." feature (it's in different areas of the menu in different versions of SSMS). Take this file, go into SSMS for your Azure database, and use the "Deploy Data-tier Application" feature. (This will help you catch some of the Azure limitations you aren't honoring if the process fails.) This is, by far, the easiest way to get an empty version of your database up into Azure. A command-line equivalent using SqlPackage is sketched after this list.
Use a tool like Redgate SQL Compare to verify your work (you'll have to tweak a couple options, like WITH NOCHECK to get a clean comparison).
You'll have to clean up users, schemas, broken sprocs, etc. before you succeed at this. (This is a good thing!)
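If you'd rather script the schema step than click through SSMS, the SqlPackage command line does the same extract/deploy (install path, server names, and credentials below are hypothetical):

    $sqlpackage = "C:\Program Files\Microsoft SQL Server\160\DAC\bin\SqlPackage.exe"

    # Extract the schema from the local database into a .dacpac...
    & $sqlpackage /Action:Extract `
        /SourceConnectionString:"Server=LOCAL-SQL01;Database=AppDb;Integrated Security=True;" `
        /TargetFile:"D:\Dac\AppDb.dacpac"

    # ...then publish it to the empty Azure database. A failure here is an early warning
    # that the schema violates one of the SQL Azure limitations.
    & $sqlpackage /Action:Publish `
        /SourceFile:"D:\Dac\AppDb.dacpac" `
        /TargetConnectionString:"Server=tcp:myserver.database.windows.net;Database=AppDb;User ID=appadmin;Password=not-a-real-password;"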
Data:
Go learn about all of the limitations of SQL Azure. Learn them before you start! Go learn them now! They're all mostly-to-very reasonable.
Go download the Azure Database Migration Wizard from Codeplex (or wherever the latest version is). It's not the most amazing software (kinda unstable) but even if it crashes once or twice on you, it'll still save you a LOT of time!
I strongly recommend Redgate's SQL Data Compare. The previously-mentioned migration wizard will help you identify problems (it's on you to fix those) and will get you ~98% migrated, but you'll want to come back and clean up after it. It has some bugs that cause it to miss nullable BIT fields (and upper-ASCII characters) and some other things that a tool like SQL Data Compare can easily identify and fix. It can also give you the peace of mind that you can depend on your database.
If your database is large, consider spinning up a temporary VM in Azure (they have them with SQL Server pre-installed and available in ~20 minutes) to do your migrations from. If you do this, it's best to upload a compressed database backup to Blob storage (Cerebrata's storage tool is great for this; a scripted alternative is sketched after this list) and then grab it and restore it to SQL Server in that VM. Then stage your migrations all from there!
Test, test, TEST!!!
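If you want to script the backup hand-off to that temporary VM instead of using a storage GUI, the Az storage cmdlets are one way to do it (storage account, container, and file names are hypothetical):

    # Upload the compressed backup so the temporary migration VM can pull it down quickly.
    $ctx = New-AzStorageContext -StorageAccountName "mymigrationstore" -StorageAccountKey $storageKey

    Set-AzStorageBlobContent -Context $ctx -Container "backups" `
        -File "D:\Backups\AppDb_full.bak.zip" -Blob "AppDb_full.bak.zip"

    # Then, on the VM: download, unzip, restore, and stage the rest of the migration from there.
    Get-AzStorageBlobContent -Context $ctx -Container "backups" `
        -Blob "AppDb_full.bak.zip" -Destination "D:\Restore\"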
Be careful running SQL on a VM; it's not a high-availability solution, and Azure VMs are prone to restarting from time to time. Unless you have multiple VMs running SQL Server in an availability group, or you have some sort of mirroring and load balancing set up, you won't have a high-availability solution. I too originally favoured the IaaS route over PaaS; in the end it seemed to be a false economy, as migrating IaaS to PaaS is about as much work as migrating on-premises to PaaS. In the end I decided to take the time to optimise my application for PaaS, i.e. moving durable storage to blobs, implementing transient error handling and retry logic, etc.
What you're proposing is certainly possible but having a multi VM arrangement to deliver high availability SQL takes a bit of work and is expensive! Have a read of the following guide, it was really helpful to me when I started the migration process:
Top 7 Concerns of Migrating a .NET Application to Azure
Just yesterday Microsoft announced their plan to host IaaS solutions, and not only PaaS solutions, on their Azure platform.
http://weblogs.asp.net/scottgu/archive/2013/04/16/windows-azure-general-availability-of-infrastructure-as-a-service-iaas.aspx
About migration, it really depends. We work with a distribution mechanism (TFS + Octopus), so deployment is very easy and it works on IaaS or SQL Azure; it doesn't really matter.
There are also other things to take into consideration when moving to PaaS. Your code will probably have to be refactored if it's not PaaS-oriented, or your application may have a very high hosting cost on Azure.
We have literally hundreds of Access databases floating around the network - some with light usage, some with quite heavy usage, and some with no usage whatsoever. What we would like to do is centralise these databases onto a managed database platform and retain as much as possible of the reports and forms within them.
The benefits of doing this would be to have some sort of usage tracking, and also the ability to pay more attention to some of the important decentralised data that is stored in these apps.
There are no real constraints on the RDBMS (Oracle, MS SQL Server) or the stack it would run on (LAMP, ASP.NET, Java), and there obviously won't be a silver bullet for this. We would like something that can remove the initial grunt work in an automated fashion.
We upsize users to SQL Server (either using the Upsizing Wizard or by hand). It's usually pretty straightforward: replace all the Access tables with linked tables to the SQL Server and keep all the forms/reports/macros in Access. The investment in Access isn't lost and the users can keep going, business as usual. You get the reliability of SQL Server and centralized backups. Keep in mind - we've done this for a few large Access databases, not hundreds. I'd do a pilot of a few dozen and see how it works out.
UPDATE:
I just found this, the SQL Server Migration Assistant; it might be worth a look:
http://www.microsoft.com/sql/solutions/migration/default.mspx
Update: Yes, some refactoring will be necessary for poorly designed databases. As for how to handle Access sprawl? I've run into this at companies with lots of technical users (engineers, especially, are the worst for this... and for Excel sprawl). We did an audit: (after backing up) we deleted any databases that hadn't been touched in over a year. "Owners" were assigned based on the location and/or data in the database. If the database was in "S:\quality\test_dept", then the quality manager and head test engineer had to take ownership of it or we deleted it (again, after backing it up).
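The initial sweep is easy to script; something along these lines (the share path is hypothetical) will produce the candidate list:

    # Find Access databases that haven't been touched in over a year.
    # Review the list and back everything up before deleting anything!
    $cutoff = (Get-Date).AddYears(-1)

    Get-ChildItem -Path "S:\" -Recurse -Include *.mdb, *.accdb -ErrorAction SilentlyContinue |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        Select-Object FullName, LastWriteTime |
        Export-Csv "C:\Audit\stale_access_dbs.csv" -NoTypeInformation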
Upsizing an Access application is no magic bullet. It may be that some things will be faster, but some types of operations will be real dogs. That means that an upsized app has to be tested thoroughly and performance bottlenecks addressed, usually by moving the data retrieval logic server-side (views, stored procedures, passthrough queries).
It's not really an answer to the question, though.
I don't think there is any automated answer to the problem. Indeed, I'd say this is a people problem and not a programming problem at all. Somebody has to survey the network and determine ownership of all the Access databases and then interview the users to find out what's in use and what's not. Then each app should be evaluated as to whether or not it should be folded into an Enterprise-wide data store/app, or whether its original implementation as a small app for a few users was the better approach.
That's not the answer you want to hear, but it's the right answer precisely because it's a people/management problem, not a programming task.
Oracle has a migration workbench to port MS Access systems to Oracle Application Express, which would be worth investigating.
http://apex.oracle.com
So? Dedicate a server to your Access databases.
Now you have the benefit of some sort of usage tracking, and also the ability to pay more attention to some of the important decentralised data that is stored in these apps.
This is what you were going to do anyway, only you wanted to use a different database engine instead of NTFS.
And now you have to force the users onto your server.
Well, you can encourage them by telling them that you aren't going to overwrite their data with old backups anymore, because now you will own the data and that won't happen again.
Also, you can tell them that their applications will run faster now, because you are going to exclude the folder from on-access virus scanning (you don't do that to your other databases, which is why they are full of sql-injection malware, but these databases won't be exposed to the internet), and planning to turn packet signing off (you won't need that on a dedicated server: it's only for people who put their file-share on their domain-server).
Easy upgrade path, improved service to users, greater centralization and control for IT. Everyone's a winner.
Further to David Fenton's comments
Your administrative rule will be something like this:
If the data in the database is just being used by one user, for their own work alone, then they can keep it in their own network share.
If the data in the database is being used by more than one person (even if it is only two), then that database must go on a central server and come under IT's management (backups, schema changes, interfaces, etc.). This is because someone experienced needs to coordinate the whole show or we will risk the time/resources of the next guy down the line.