AWS RDS Backups - database

AWS RDS Backups - database

So I recently started using AWS and Elastic Beanstalk with RDS.
I wonder whats the best practices for creating database backups?
So far my setup is this.
Enable automatic backups.
bash script that creates manual snapshots everyday and removes manual snapshots older than 8 days.
bash script that creates a sql dump of the database and uploads it to S3.
Reason why I am creating the manual snapshots is if I were to delete the database by misstake, then I still have snapshots.
The bash scripts is on an EC2 instance launched with a IAM role which is allowed to execute these scripts.
Am I on the right track here?
I really appreciate answers, thanks.

A bit of context...
That automated backups are not saved after DB deletion is a very important technical gotcha. I've seen it catch devs on the team unawares, so thanks for bringing this up.
After the DB instance is deleted, RDS retains this final DB snapshot and all other manual DB snapshots indefinitely. However, all automated backups are deleted and cannot be recovered when you delete a DB instance. source
I suspect for most folks, the final snapshot is sufficient.
Onto the question at hand...
Yes. 110%. Absolutely.
I wouldn't create manual snapshots; rather, copy the automated ones.
Option 1: You already have the automated snapshots available. Why not just copy the automated snapshot (less unnecessary DB load; though, admittedly less of an issue if you're multi-AZ since you'll be snapshoting from the replica), which created a manual snapshot. I'd automate this using the aws sdk and a cron job.
Option 2: Requires manual adherence. Simply copy your automated snapshots (to create manual snapshots) before terminating a DB.
Unclear why you'd need the s3 dump if you have the snapshots.
For schema changes?: If you're doing it for schema changes, these should really be handled with migrations (we use knex.js, but pick your poison). If that's a bridge too far, remember that there's an option for schema-only dumps (pg_dump --schema-only). Much more manageable.
Getting at the data?: Your snapshots are already on s3, see faq. You can always load a snapshot, and sql dump it if you choose. I don't see an immediately obvious reason for purposely duplicating your data.

Related

Entity Framework migration strategy for multiple instances

I have a .NET core app which I'am running on AWS Elastic Container Services (ECS).
- The app runs on two different instances.
- Database is SQL server
The app runs the database migrations on startup, which has worked really well. But then i had to migrate a lot of data which meant that the migration took longer time. This resulted in duplicates of the data being moved.
This happens because both apps first checks the database if the migration has been executed, both finds out that it hasn't, then both starts running the migration which takes time. After it is done it adds the migration to the database.
How do people solve this?
Possible solutions I and others have thought of
Start with only one instance of the app, then scale up.
this would work, but then I will have to manually scale down and up for each time there is a migration. (It is possible to do it automatically, but it would take time)
Wrap long running migrations in transactions and at the start set the migration as done in the database. Check if it is in the database before commiting the change. If the transaction fails, remove the migration from the database.
Lock the database? EF Core lock the database during migration . Seems weird.
Make the migration a part of the deployment process. This seems to be best practice, but it would mean that the Build server would need to know the Database secrets. I'am not to afraid to give it, but it would mean i would have to maintain a duplicate set.
What does people out there do? Am I missing some obvious solution?
Thanks

We also used to have our applications perform the migration, but even Microsoft recommends avoiding this in a multi-instance environment:
We recommend production apps should not call Database.Migrate at application startup. Migrate shouldn't be called from an app in server farm. For example, if the app has been cloud deployed with scale-out (multiple instances of the app are running).
Database migration should be done as part of deployment, and in a controlled way.
Like everything there are different ways to go about solving the problem. Our team is small and so we generate migration scripts through the EF CLI tooling and then run them manually as part of a deployment/maintenance routine. This could of course be automated if your process warrants it.

copy Azure SQL database (PaaS) to IaaS (SQL server on VM)

Is it possible to use Create Database [] as copy of [] to create a copy of database that is hosted as Azure SQL database (PaaS) towards IaaS (SQL server on VM)?
Can you recommend an alternative of Import/Export that can limit the downtime of such transition?
Reason for this migration is the restriction of cross databases queries in PaaS mode that complicate one-time migration towards new database used in newer application version process

The answer depends on whether you want to copy database schema, data, or both.
As Jaxidian said, ApexSQL tools can do the job but as far I know DataDiff will only synchronize database data, while Diff will synchronize schema.
Here is the article describing processes of copying database data:
https://solutioncenter.apexsql.com/how-to-automatically-synchronize-the-data-in-two-sql-server-databases-on-a-schedule/
If you want to copy both schema and data, process is described here:
https://solutioncenter.apexsql.com/how-to-automatically-compare-and-synchronize-multiple-databases-on-different-sql-server-instances/

There are lots of tools available that can accomplish this. Which one is best for you depends on your needs. However, the "Copy" feature in the Azure Portal will not accomplish this for you but can be a partial solution to the approach you finalize on.
I'll make the following assumptions:
You have an always-on 24/7 production load so there are no regularly/nightly/weekly/monthly maintenance windows
You can schedule a maintenance window but you wish to keep it as small as possible
You can easily configure your applications' connectionstrings
Your database isn't huge. Gigabytes is fine.
Your database is mostly static data (i.e. an incremental approach is much faster than a dump-and-fill)
If I were to do this today/right now, my approach would be like this (this is only one option):
Use the Copy feature to make a copy of the database that I can use this as a staging area/reference point while minimizing the load on the Production database
Create a backup (bacpac file) from the copied database
Restore the bacpac file onto your IaaS-hosted SQL Server to form your base deployment
Start your maintenance window and effectively put your database into read-only mode so the data is now no longer changing (lots of strategies on how to do this whether you turn applications off, revoke permissions, etc.)
Use a tool such as ApexSQL Data Diff (Redgate and others have options) to compare data between the two databases and sync the data over to the new IaaS DB. Be careful - depending on your data needs you may have to tweak the generated scripts that sync the data.
Verify that the new DB is now indeed a duplicate copy of your old DB (ApexSQL Data Diff can also help with this - several options exist here)
Change connectionstrings on your apps to point to the new DB
Turn applications back on and end your maintenance window.
So of course, if you do something like this, practice it numerous times and test the results numerous times well before your maintenance window. Get a good idea of the timing for everything, especially how long it will take for you to generate and restore the bacpac file. This is because you want to do that as late as possible before your maintenance window to minimize the time it takes to generate and run the final "Data Diff" script that you'll use. The longer that script takes, the longer your outage will be.

Are Distributed Transactions a good idea for enabling rollback of database upgrades in Windows Installer Custom Actions?

I've outgrown the Sql Server custom actions available in WiX, so I'm taking the bold step of creating my own using Deployment Tools Foundation. I want to be a good citizen and make sure that mine support rollback. But what's the best way of doing it?
I need to support SQL Server 2005 and later, all editions.
The problem, as I see it, is that Windows Installer works in two phases: it does the work, storing undo information as it goes. Then, when all the pieces are in place it either commits (deleting the undo information) or does a rollback.
This means that standard transactions won't do the job. They would have to be completed inside my Execute custom action, and I wouldn't get a chance to roll them back later.
I've considered taking a copy-only backup of the database that I can restore in the rollback action if necessary but I think this approach, whilst simple has shortcomings. I don't know how big our databases will get, for example - so I can't guarantee that there will be space available to hold the backup on the target machine. Also, backup and restore can take a while to complete, and I don't want typical installs (where rollback doesn't happen) to be unnecessarily slow.
So that brings me to my current favoured idea: make sure the Distributed Transaction Coordinator is started up, then initialise a Distributed Transaction before making changes, then either committing it or rolling it back in the appropriate custom actions.
It seems I can uses the members of the TransactionInterop class to export a cookie that will enable me to share the transaction between my different custom actions.
Can anyone with experience of this kind of thing say if it is likely to work?

Some database/instance operations cannot be done inside a transaction (eg. CREATE/ALTER/DROP ENDPOINT), and other operations cannot be done inside a distributed transaction (eg. SAVE TRANSACTION). So you won't be able to do them at all in your proposed plan. Also your DB upgrade scripts will have to all work correctly when run inside an uncommitted transaction.
I would say that there are fewer risks of going down the backup/restore path (or alternatively creating a database snapshot and restoring from the snapshot on rollback, with the drawback of requiring EE).
Also an option is to have an undo script for every do script run during upgrade, and have the undo script run during rollback and remove the effects of the installation. I understand that this is a hard problem, probably doubles the amount of scripts that have to be developed (and tested...) and requires some serious developer discipline.

I've done quite a few installers with SQL scripts over the years and I've kind of come to the opinion that it's only suited for simple databases like here's my VB app with a local MSDE / MySQL database or here's my local store for code table lookups and temporary commits while we wait to sync it somewhere else.
Once you get into industrial strength heavy lifting enterprise app type situations I like to get my DB configuration out of the installer and into the application as a first run type story. You can do a lot heavier lifting with C# there and not be constrained by MSI.

Replicating / Cloning data from one MS SQL Server to another

I am trying to get the content of one MSSQL database to a second MSSQL database. There is no conflict management required, no schema updating. It is just a plain copy and replace data. The data of the destination database would be overwritten, in case somebody would have had changed something there.
Obviously, there are many ways to do that
SQL Server Replication: Well established, but using old protocols. Besides that, a lot of developers keep telling me that the devil is in the details and the replication might not always work as expected and that is this best choice for an administrator, but not for a developer.
MS Sync Framework: MSF is said to be the cool, new technology. Yes, it is this new stuff, you love to get, because it sounds so innovative. There is the generic approach for synchronisation, this sounds like: Learn one technology and how to integrate data source, you will never have to learn how to develop syncing again. But on the other hand, you can read that the main usage scenario seems to be to synchronize MSSQL Compact databases with MSSQL.
SQL Server Integration Services: This sounds like an emergency plannable solution. In case the firewall is not working, we have a package that can be executed again and again... until the firewall drops down or the authentication is fixed.
Brute Force copy and replace of database files: Probably not the best choice.
Of course, when looking on the Microsoft websites, I read that every technology (apart from brute force of course) is said to be a solid solution that can be applied in many scenarios. But that is, of course, not the stuff I wanted to hear.
So what is your opinion about this? Which technology would you suggest.
Thank you!
Stefan

The easiest mechanism is log shipping. The primary server can put the log backups on any UNC path, and then you can use any file sync tools to manage getting the logs from one server to another. The subscriber just automatically restores any transaction log backups it finds in its local folder. This automatically handles not just data, but schema changes too.
The subscriber will be read-only, but that's exactly what you want - otherwise, if someone can update records on the subscriber, you're going to be in a world of hurt.

I'd add two techniques to your list.
Write T-SQL scripts to INSERT...SELECT the data directly
Create a full backup of the database and restore it onto the new server
If it's a big database and you're not going to be doing this too often, then I'd go for the backup and restore option. It does most of the work for you and is guaranteed to copy all the objects.
I've not heard of anyone using Sync Framework, so I'd be interested to hear if anyone has used it successfully.

Warm Standby SQL Server/Web Server

Warm Standby SQL Server/Web Server
This question might fall into the IT category but as the lead developer I must come up with a solution to integration as well as pure software issues.
We just moved our web server off site and now I would like to keep a warm standby of both the website and database (SQL 2005) on site.
What are the best practices for going about doing this with the following environment?
Offsite: our website and database (SQL 2005) are on one Windows 2003 server. There is a firewall in front of the server which makes
replication or database mirroring not an option. A vpn is also not an option.
My thoughts as a developer were to write a service which runs at the remote site to zip up and ftp the database backup to an ftp server
on site. Then another process would unzip the backup and restore it to the warm standby database here.
I assume I could do this with the web site as well.
I would be open to all ideas including third party solutions.

If you want a remote standby you probably want to look into a log shipping solution.
This article may help you out. In the past I had to develop one of these solutions for the exact same problem, writing it from scratch is not too hard. The advantage you get with log shipping is that you have the ability to restore to any point in time and you avoid shipping these big heavy full backups around and instead ship light transaction log backups, and occasionally a big backup.
You have to keep in mind that transaction log backups are useless without having both the entire sequence of transaction log backups and a full backup.

You have exactly the right idea. You could maybe write a script that would insert the names of the files that you moved into a table that your warm server could read. Your job could then just poll this table at intervals and not worry about timing.
Forget about that - just found this. Sounds like what you are setting out to do.
http://www.sqlservercentral.com/articles/Administering/customlogshipping/1201/

I've heard good things about Syncback:
http://www.2brightsparks.com/syncback/sbpro-features.html

Thanks for the link to the article sambo99. Transaction log shipping was my original idea, but I dismissed it because the servers are not in the same domain not even in the same time zone. My only method of moving the files from point A to point B is via FTP. I am going to experiment with just shipping the transaction logs. And see if there is a way to fire off a restore job at given intervals.

www.FolderShare.com is a good tool from Microsoft. You could log ship to a local directory and then synchronize the directory to any other machine. You could also syncrhronize the website folders as well.
"Set it and forget it" type solution. Setup your SQL jobs to clear older files and you'll never have to edit anything after the initial install.
FolderShare (free, in beta) is currently limited to 10,000 files per library.

For all interested the following question also ties into my overall plan to implement log shipping:
SQL Server sp_cmdshell

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight