How to properly handle disaster recovery to Azure cloud with VM, disk, and local file copy

I have a question I hope you might be able to help me with. The following is a scenario that I would like to create using Azure. Could you let me know if this is feasible, and how I would go about doing it?
Scenario:
• Create a virtual machine--which I have done.
• Add an empty disk, format, create volume, etc.
• Now, I want to be able to have an area on this disk to which I will copy data from our local network. This would be done as a backup in the event of local infrastructure failure.
The idea behind this is to have a virtual machine that always has the latest copy of important on-premises files, along with installed applications, allowing our users to remote into this VM in the event local services are disrupted.
I have the virtual machine, the storage container, the empty disk mounted, formatted, and available on my vm, but what do I do now to have an area (or the entire drive) made available for local on-premise file copy? I am evaluating CloudBerryLab's backup application, but when I use it I can only seem to send my file copies to a storage area that is not a disk drive--hence, not attached to the virtual machine.
So, am I not understanding how to handle this scenario properly? What tools should be used to make this happen, or is there a better architecture in Azure to handle this?
Thank You.

If you have a current version of Windows Server on-prem, you can just sign up for Azure Backup. Instead of keeping a VM running and paying for compute hours plus the storage, wait and spin one up only if you need it. Restoring from Azure Backup to a VM is amazingly quick.
Once you have a Windows Azure account, you will see the Backup option. Set that up, then download the agent onto your Windows Server and configure the backup. An SSL certificate is required.

Related

Regularly Transfer SQL Server to Azure SQL

I'm completely stymied. Let me describe my situation.
We're a relatively small company and the vast majority of our operational data is contained in a vendor database. Our vendor offers a Data Warehousing service. They've taken all of our data and applied some OLAP-ish modeling to it. Each day, they place either a .bak or a .diff file (.bak once a week, .diff every other day) on an FTP endpoint that we pay to access. Currently, we use a PowerShell script to download this data to a server that we've got sitting at a local server farm, where we then use SQL Server to "rehydrate" it by restoring from it.
That's all fine and good, but we really want to move as many of our workloads into the cloud as possible (we use Azure). As far as I can tell, SQL Managed Instances are the only way we can restore from a .bak file in the cloud. This is waaaay more expensive than we need, and we really don't need the managed instance platform at all except to restore from this file.
Basically, everything about this current process is diametrically opposed to us moving it to the cloud, unless we want to pay even more than we are to rent out this server farm.
I'm trying to lobby them for a different method of getting their data, but I'm having trouble coming up with a method to propose. Every day, we need to transfer a ~40 GB database from SQL Server (at our vendor) to Azure SQL (in our cloud). What's the least intrusive way we could do this?
We are glad that you chose SQL on Azure VM as the solution. Thanks to Alex and Davaid for their suggestions too:
I've actually seen all of those resources already. The biggest obstacle here is that the entire process has to be automated end-to-end, which makes bacpac restores more difficult (they'd have to write some sort of .NET app to back up to bacpac). I think SQL on Azure VM is the only real option, so I may have to look at cost for that.
Others who face the same scenario can refer to this, which may also benefit other community members.

Transferring MDF/LDF files to target server

Background:
I have a medium-sized database (900GB) that needs to be copied onto another server (driven via code, not scheduled). Currently we take a backup (to .bak), move it to a staging server, and restore it to the target server. The target server does not have enough space to hold the backup file, and the restored instance simultaneously, thus the staging server. These transfers (backup to staging, restore from staging) happen over SMB2. The staging server needs to go away due to business requirements, however. It is worth mentioning the target server will be taken offline (and used offline) after the transfer, so I'm not sure the mirroring or replication options are valid.
I have identified two options. One is to back up the database to the primary server and open up firewall rules/SMB to serve the backup file to the target server over SMB ("RESTORE FROM \\x.x.x.x\blah\db.bak"). Security isn't a fan, though.
The ideal solution (and one that could easily be implemented in every other database I've worked with), is to quiesce the database and transfer the datafiles (in the case of ms-sql, mdf and ldf files). However, upon research I see there is no such functionality available out of the box. I know I can take the database offline to copy the mdf/ldf safely, but that's not an acceptable solution (database must remain online).
I have read LOTS of posts and Microsoft documentation regarding VSS / shadow copy, but I have also read lots of conflicting information about the reliability of using VSS/sqlwriter to copy the mdf/ldf file to the target server, and simply re-attaching the database.
I am looking for documentation or advice (or even backup software that can be programmatically driven via an API) to accomplish this goal of transferring the database without requiring a secondary holding place. Currently I'm researching how to drive this copying process with PowerShell, using VSS (vssadmin/vshadow from the SDK), but I'm not confident in what I'm reading, and it's not even clear to me whether VSS/sqlwriter is a supported method of copying online LDF/MDF files. Any advice is appreciated.
Thanks,

Move from a local single-user database to an online multi-user database

I have a calendar-type WPF program that is used to assign the workload to a team. The events are stored in an Access database, and the program is used by one person at a time by connecting remotely to a computer. The team has grown, and multiple people now need to access the program simultaneously. I can install the program on several computers, but where should I move the database? To a service like Dropbox/OneDrive, or to an online SQL host? Thanks.
You can use SQL Server on many cloud platforms (though I am not sure Dropbox can host SQL Server natively). Azure (Microsoft's cloud) is a very mature solution. Now that multiple users will be managing data, you should still verify that the database is backed up on a regular basis and that any updates to data are done within transactions that your code is aware of. 'Aware of' means that if there is a conflict, your code should either resubmit or notify the user that the insert/update/delete failed.
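The "resubmit or notify" behavior can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for a hosted SQL Server, and the `events` table, the `assign_event` function, and the retry settings are hypothetical names, not part of the original program.

```python
import sqlite3
import time


def assign_event(conn, event_id, assignee, retries=3, delay=0.1):
    """Update one event inside a transaction; retry on a transient conflict.

    Returns True if the update was committed, False if the caller
    should notify the user that the change failed.
    """
    for _ in range(retries):
        try:
            # "with conn" commits on success and rolls back on an exception.
            with conn:
                cur = conn.execute(
                    "UPDATE events SET assignee = ? WHERE id = ?",
                    (assignee, event_id),
                )
                if cur.rowcount == 0:
                    # The row is gone (e.g. deleted by another user): not retryable.
                    return False
            return True
        except sqlite3.OperationalError:
            # Typically "database is locked": another writer holds the lock.
            time.sleep(delay)
    return False


# Hypothetical schema and data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, assignee TEXT)")
conn.execute("INSERT INTO events (id, assignee) VALUES (1, 'alice')")
conn.commit()

ok = assign_event(conn, 1, "bob")        # existing row: committed
missing = assign_event(conn, 99, "bob")  # no such row: report failure
```

When the function returns False, the calling code would show the user a message rather than silently dropping the change.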

Backup PostgreSQL database hosted on AWS EC2 without shutting down or restarting the master

I'm using PostgreSQL v9.1 for my organization. The database is hosted in Amazon Web Services (on an EC2 instance) behind a Django web framework, which performs tasks on the database (reading and writing data). The problem is how to back up this database periodically in a specified format (see Requirements).
Requirements:
A standby server is available for backup purposes.
The master-db is to be backed up every hour. At the top of each hour, the db is quickly backed up in its entirety and then copied to the slave as a file-system archive.
Along with hourly backups, I need to perform a daily backup of the database at midnight and a weekly backup on midnight of every Sunday.
Weekly backups will be the final backups of the db. All weekly backups will be saved. Only the daily backups from the last week and the hourly backups from the last day will be kept.
But I have the following constraints too.
Live data comes into the server every day (roughly one insertion every 2 seconds).
The database now hosts critical customer data, which means it cannot be turned off.
Usually, data stops coming into the db at night, but there's a good chance that data will come into the master-db on some nights, and I have no control over stopping the insertions (customer data would be lost).
If I use traditional backup mechanisms/software (for example, barman), I have to configure archiving mode in postgresql.conf and authenticate users in pg_hba.conf, which implies a server restart to turn it on, which in turn stops the incoming data for some minutes. This is not permitted (see the constraint above).
Is there a clever way to backup the master-db for my needs? Is there a tool which can automate this job for me?
This is a crucial requirement, as data has been arriving in the master-db for the past few days and I need to make sure the master-db is replicated to a standby server at all times.
Use EBS snapshots
If, and only if, your entire database including pg_xlog, data, pg_clog, etc. is on a single EBS volume, you can use EBS snapshots to do what you describe, because they are (or claim to be) atomic. You can't do this if you stripe across multiple EBS volumes.
The general idea is:
Take an EBS snapshot via the EBS APIs, using either the command-line AWS tools or a scripting interface like the wonderful boto Python library.
Once the snapshot completes, use AWS API commands to create a volume from it and attach the volume to your instance, or preferably to a separate instance, and then mount it.
On the EBS snapshot you will find a read-only copy of your database from the point in time you took the snapshot, as if your server crashed at that moment. PostgreSQL is crash-safe, so that's fine (unless you did something really stupid like set fsync=off in postgresql.conf). Copy the entire database structure to your final backup, e.g. archive it to S3 or whatever.
Unmount, detach, and destroy the volume containing the snapshot.
This is a terribly inefficient way to do what you want, but it will work.
It is vitally important that you regularly test your backups by restoring them to a temporary server and making sure they're accessible and contain the expected information. Automate this, then check manually anyway.
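The snapshot steps listed above can be sketched roughly as follows. This is a sketch under assumptions, not a tested backup tool: it assumes `ec2` is a boto3 EC2 client (boto3 being the successor to the boto library mentioned above), the function names and the `/dev/sdf` device are hypothetical, and `RecordingEC2` is only a stand-in so the flow can be exercised without AWS credentials. Mounting the volume and copying the data directory still happen on the instance itself.

```python
def restore_snapshot_for_backup(ec2, volume_id, backup_instance_id, zone,
                                device="/dev/sdf"):
    """Snapshot a volume, then expose the snapshot as a new attached volume.

    `ec2` is assumed to be a boto3 EC2 client, i.e. boto3.client("ec2").
    Returns (snapshot_id, new_volume_id).
    """
    snap = ec2.create_snapshot(
        VolumeId=volume_id,
        Description="PostgreSQL crash-consistent backup",
    )
    snapshot_id = snap["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    vol = ec2.create_volume(SnapshotId=snapshot_id, AvailabilityZone=zone)
    new_volume_id = vol["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])

    ec2.attach_volume(VolumeId=new_volume_id,
                      InstanceId=backup_instance_id, Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[new_volume_id])
    return snapshot_id, new_volume_id


def discard_backup_volume(ec2, new_volume_id):
    """Detach and delete the temporary volume once the copy is finished."""
    ec2.detach_volume(VolumeId=new_volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])
    ec2.delete_volume(VolumeId=new_volume_id)


class RecordingEC2:
    """Stand-in client that only records calls, so the flow runs offline."""

    def __init__(self):
        self.calls = []

    def create_snapshot(self, **kwargs):
        self.calls.append("create_snapshot")
        return {"SnapshotId": "snap-demo"}

    def create_volume(self, **kwargs):
        self.calls.append("create_volume")
        return {"VolumeId": "vol-demo"}

    def get_waiter(self, name):
        self.calls.append("wait:" + name)
        return self

    def wait(self, **kwargs):
        pass

    def attach_volume(self, **kwargs):
        self.calls.append("attach_volume")

    def detach_volume(self, **kwargs):
        self.calls.append("detach_volume")

    def delete_volume(self, **kwargs):
        self.calls.append("delete_volume")


# Placeholder IDs; a real run would pass boto3.client("ec2") and real IDs.
fake = RecordingEC2()
snapshot_id, temp_volume_id = restore_snapshot_for_backup(
    fake, "vol-source", "i-backup", "us-east-1a")
discard_backup_volume(fake, temp_volume_id)
```

The snapshot itself is kept in this sketch; only the temporary volume is destroyed, which matches the last step of the list.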
Can't use EBS snapshots?
If your volume is mapped via LVM, you can do the same thing at the LVM level in your Linux system. This works for the lvm-on-md-on-striped-ebs configuration. You use lvm snapshots instead of EBS, and can only do it on the main machine, but it's otherwise the same.
You can only do this if your entire DB is on one file system.
No LVM, can't use EBS?
You're going to have to restart the database. You do not need to restart it to change pg_hba.conf; a simple reload (pg_ctl reload, or SIGHUP the postmaster) is sufficient. But you do indeed have to restart to change the archive mode.
This is one of the many reasons why backups are not an optional extra; they're part of the setup you should be doing before you go live.
If you don't change the archive mode, you can't use PITR, pg_basebackup, WAL archiving, pgbarman, etc. You can use database dumps, and only database dumps.
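A dump-based backup of that kind might be scripted along these lines. This is a minimal sketch, assuming pg_dump is on the PATH and authentication is already handled (for example via a .pgpass file); the function names, the `appdb` database, and the `/var/backups` directory are hypothetical.

```python
import subprocess
from datetime import datetime, timezone


def build_dump_command(dbname, out_dir, when=None, host=None, user=None):
    """Build a pg_dump command line that writes a timestamped custom-format dump."""
    when = when or datetime.now(timezone.utc)
    outfile = f"{out_dir}/{dbname}-{when:%Y%m%dT%H%M%SZ}.dump"
    cmd = ["pg_dump", "--format=custom", f"--file={outfile}"]
    if host:
        cmd.append(f"--host={host}")
    if user:
        cmd.append(f"--username={user}")
    cmd.append(dbname)
    return cmd, outfile


def run_dump(dbname, out_dir, **kwargs):
    """Run pg_dump against the live server; raises if pg_dump exits non-zero."""
    cmd, outfile = build_dump_command(dbname, out_dir, **kwargs)
    subprocess.run(cmd, check=True)
    return outfile


# Only the command is built here; run_dump() would actually execute it.
cmd, outfile = build_dump_command(
    "appdb", "/var/backups",
    when=datetime(2014, 1, 5, 13, 0, tzinfo=timezone.utc),
)
```

pg_dump works against a running server and needs no archive-mode change, but each dump is a full copy, so hourly dumps of a large database may be expensive.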
So you've got to find a time to restart. Sorry. If your client applications aren't entirely stupid (i.e. they can handle waiting on a blocked tcp/ip connection), here's how I'd try to do it after doing lots of testing on a replica of my production setup:
Set up a PgBouncer instance
Start directing new connections to the PgBouncer instead of the main server
Once all connections are via pgbouncer, change postgresql.conf to set the desired archive mode. Make any other desired restart-only changes at the same time; see the configuration documentation for restart-only parameters.
Wait until there are no active connections
SIGSTOP pgbouncer, so it doesn't respond to new connection attempts
Check again and make sure nobody made a connection in the interim. If they did, SIGCONT pgbouncer, wait for it to finish, and repeat.
Restart PostgreSQL
Make sure I can connect manually with psql
SIGCONT pgbouncer
I'd rather explicitly set pgbouncer to a "hold all connections" mode, but I'm not sure it has one, and don't have time to look into it right now. I'm not at all certain that SIGSTOPing pgbouncer will achieve the desired effect, either; you must experiment on a replica of your production setup to ensure that this is the case.
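The stop, check, restart, and resume sequence above could be orchestrated roughly like this. It is a sketch only, and it inherits the uncertainty already stated: whether SIGSTOP on pgbouncer really holds clients the way you need must be verified on a replica. The function name and the three callables (`count_active`, `restart_postgres`, `can_connect`) are hypothetical placeholders for site-specific checks and commands, and the demonstration uses a sleeping child process in place of pgbouncer. It relies on POSIX signals, so it is Unix-only.

```python
import os
import signal
import subprocess
import sys
import time


def restart_behind_pooler(pooler_pid, count_active, restart_postgres,
                          can_connect, poll=1.0, max_polls=600):
    """Freeze the pooler while the server is idle, restart, then resume it."""
    for _ in range(max_polls):
        if count_active() == 0:
            os.kill(pooler_pid, signal.SIGSTOP)   # freeze the pooler
            if count_active() == 0:               # nobody slipped in meanwhile
                break
            os.kill(pooler_pid, signal.SIGCONT)   # let the latecomer finish
        time.sleep(poll)
    else:
        raise TimeoutError("server never became idle")

    restart_postgres()
    if not can_connect():
        # The pooler is deliberately left stopped so clients keep waiting.
        raise RuntimeError("PostgreSQL did not come back; pooler still stopped")
    os.kill(pooler_pid, signal.SIGCONT)           # resume client traffic


# Demonstration: a sleeping child process stands in for pgbouncer.
events = []
stand_in = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"])
restart_behind_pooler(
    stand_in.pid,
    count_active=lambda: 0,
    restart_postgres=lambda: events.append("restart"),
    can_connect=lambda: events.append("connect") or True,
)
still_running = stand_in.poll() is None
stand_in.terminate()
stand_in.wait()
```

PgBouncer's admin console also has PAUSE and RESUME commands, which may be the "hold all connections" mode wished for above; check the documentation for the version you run before relying on either approach.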
Once you've restarted
Use WAL archiving and PITR, plus periodic pg_dump backups for extra assurance.
See:
WAL-E
PgBarman
... and of course, the backup chapter of the user manual, which explains your options in detail. Pay particular attention to the "SQL Dump" and "Continuous Archiving and Point-in-Time Recovery (PITR)" chapters.
PgBarman automates the PITR option for you, including scheduling, and supports hooks for storing WAL and base backups in S3 instead of local storage. Alternatively, WAL-E is a bit less automated but comes pre-integrated with S3. You can implement your retention policies with S3 or via barman.
(Remember that you can use retention policies in S3 to shove old backups into Glacier, too).
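For reference, the retention rules from the question (keep every weekly backup, daily backups from the last week, hourly backups from the last day) come down to a small filter. The sketch below is a plain-script illustration of that rule, not S3 lifecycle or barman configuration; the `backups_to_keep` function and the (kind, timestamp) tuple layout are hypothetical.

```python
from datetime import datetime, timedelta


def backups_to_keep(backups, now):
    """Return the backups that the retention rules say to keep.

    `backups` is a list of (kind, taken_at) tuples, where kind is
    "hourly", "daily" or "weekly" and taken_at is a datetime.
    """
    keep = []
    for kind, taken_at in backups:
        age = now - taken_at
        if kind == "weekly":
            keep.append((kind, taken_at))            # weekly backups are kept forever
        elif kind == "daily" and age <= timedelta(days=7):
            keep.append((kind, taken_at))            # dailies: last week only
        elif kind == "hourly" and age <= timedelta(days=1):
            keep.append((kind, taken_at))            # hourlies: last day only
    return keep


# Made-up example data.
now = datetime(2014, 1, 12, 0, 0)
backups = [
    ("weekly", datetime(2013, 11, 3)),
    ("daily", datetime(2014, 1, 10)),
    ("daily", datetime(2014, 1, 2)),
    ("hourly", datetime(2014, 1, 11, 23)),
    ("hourly", datetime(2014, 1, 10, 23)),
]
kept = backups_to_keep(backups, now)
```

Anything not in the returned list is a candidate for deletion or for moving to cheaper storage.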
Reducing future pain
Outages happen.
Outages of single-machine setups on something as unreliable as Amazon EC2 happen a lot.
You must get failover and replication in place. This means that you must restart the server. If you do not do this, you will eventually have a major outage, and it will happen at the worst possible time. Get your HA setup sorted out now, not later; it's only going to get harder.
You should also ensure that your client applications can buffer writes without losing them. Relying on a remote database on an Internet host to be available all the time is stupid, and again, it will bite you unless you fix it.

Which single file embedded DB for a network project?

I am looking for an embedded database for a VB 2010 application working over the network. The database file is on a shared network folder on a NAS server (NTFS). For this reason I cannot use any server database like MySQL, SQL Server, etc.
There are nearly 20 PCs accessing the shared folder on the network.
Each PC can open up to 3 connections to the database, so we could have up to 60 connections. Mostly they just read the database; a write happens every 5-6 minutes, rarely at the same time as another, but it can happen.
In the past I successfully used Access+Jet for such applications and never had problems, though with fewer network users.
I would still use Access+Jet (so I do not need to convert the whole database and code), but I would like to use something newer.
I have seen that SQLite is not right for a network/shared environment.
SQL Compact is also not right for a shared folder.
VistaDB is too expensive.
Firebird could be an option, but I have no experience with it: it would be used in a production system and I do not know if I could trust it.
Any suggestions? Or shall I stay with Access?
Thanks for replying.
Go with Firebird. It is stable, lightweight, free, and very fast both as a network database and as an embedded one. I am using it everywhere.
However the database cannot reside on a shared network folder. It must reside on a hard drive that is physically connected to the host machine.
VistaDB is good as an embedded database, but has awful performance as a network database because it is not true client-server.
