Snowflake Strategy for PITR and backup archival?

I'm curious about others' Snowflake backup archival/PITR strategies: how do you keep Snowflake database backups past the maximum 97-day window (assuming Enterprise edition: 90 days of Time Travel plus 7 extra days of Fail-safe) for a database's point-in-time recovery (PITR)?
97 days is a lot of time for PITR, but in my experience problems are not always caught until it's too late, hence the need for backups past the 97-day maximum.
Example Scenario
Say the business's RPO is 12 months. My initial train of thought to meet this RPO is:
1. Set up a task to create a clone of the database on day 97 (the maximum restore window Snowflake supports); a sketch of such a task follows this list.
2. The clone captures a snapshot of the previous 97 days, preserving PITR capability for the database (whose Time Travel history is now being overwritten).
3. After another 97 days pass, the previous clone is backed up "physically" to an external storage location (AWS/Azure/GCP) as an archive. Once the archive is written, the previous clone is dropped and a new clone is created to cover the latest 97 days (as in step 2).
4. Repeat the process to ensure PITR and business as usual in case of DR.
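For illustration, here is a minimal sketch of the scheduled clone task from step 1, in Snowflake SQL. All names (prod_db, prod_db_backup, backup_wh) are placeholders, and task schedules cannot express an exact 97-day cadence, so the CRON below fires quarterly as an approximation; an external scheduler could hit the exact interval. I believe a task body can be a single DDL statement like this; if not, wrap it in a stored procedure.

    -- Periodically re-clone PROD_DB. Note that CREATE OR REPLACE drops the
    -- prior clone, so archive it first (step 3) before this fires.
    CREATE OR REPLACE TASK clone_prod_db
      WAREHOUSE = backup_wh
      SCHEDULE = 'USING CRON 0 2 1 1,4,7,10 * UTC'  -- 02:00 UTC, quarterly
    AS
      CREATE OR REPLACE DATABASE prod_db_backup CLONE prod_db;

    -- Tasks are created suspended; start this one explicitly.
    ALTER TASK clone_prod_db RESUME;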
I might not 100% understand how Time Travel/Fail-safe "resets" when it reaches its maximum age for a database, but assuming I understand the reset behavior correctly, I am curious about others' thoughts and about different methods by which this could be achieved.
Also, if there is a current Snowflake best practice for this type of backup archival/PITR strategy, I'd be more than happy to hear it.

If you need to keep 12 months of historical data and need to be able to restore data using the point-in-time recovery (PITR) approach, then I see two alternatives, which can actually be implemented in parallel without affecting each other:
take a snapshot of the ORIGINAL object using Zero-Copy Clone every day (365 days a year; similar to taking a daily snapshot, as files in S3, in Redshift). There might be a limit on how many snapshots of the ORIGINAL object can be taken over the time period (TBD in Question #2 above). In this case we would have 365 daily clones + 90 days of Time Travel (customer controlled) + 7 days of disaster recovery via Fail-safe (Snowflake admin controlled);
back up Snowflake data to an S3 bucket daily, using the COPY INTO command. (A sketch of both alternatives follows below.)
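As a rough sketch of both alternatives (all database, integration, and bucket names below are placeholders):

    -- Alternative 1: a daily zero-copy clone. The date suffix is written
    -- literally here; generating it dynamically needs a stored procedure
    -- or an external script.
    CREATE DATABASE prod_db_20240101 CLONE prod_db;

    -- Alternative 2: unload a table to S3 with COPY INTO (one statement
    -- per table).
    COPY INTO 's3://my-backup-bucket/prod_db/orders/'
      FROM prod_db.public.orders
      STORAGE_INTEGRATION = my_s3_int
      FILE_FORMAT = (TYPE = PARQUET)
      HEADER = TRUE       -- keep column names in the unloaded files
      OVERWRITE = TRUE;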
I've confirmed with Snowflake that you can back up the original source as many times as you want using Zero-Copy Clone. There is a soft limit on the number of databases that can be created in an account, which is in the thousands.
Also note, Time Travel cannot be used on the CLONED object to restore data as it was on the original object before the clone was created. You can, however, use Time Travel on the CLONED object to track changes made after its creation.
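For example (table name and timestamp are hypothetical; the timestamp must fall after the clone was created):

    -- Time Travel on the clone only reaches back to the moment of cloning.
    SELECT *
    FROM prod_db_backup.public.orders
      AT (TIMESTAMP => '2024-01-05 00:00:00'::timestamp_tz);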

Related

Query about DocumentDB manual snapshot

Regarding DocumentDB manual snapshots, the official AWS documentation says: "Full backups — When a manual snapshot is taken, a full backup of your cluster's data is created and stored."
So, as an example: I have created 5 snapshots of a cluster (1 snapshot per day), named ss_day_1, ss_day_2, ss_day_3, ss_day_4, ss_day_5. If I then delete ss_day_1 through ss_day_4, will I still be able to restore the whole database (which contains data from day 1 to day 5) from the ss_day_5 snapshot?
Am I right, or do I have to keep the previous snapshots to restore the database?
It would be very kind of you to clear up my doubt. Thanks in advance.
When you restore from a manual snapshot, you restore to the moment the manual snapshot was taken; you get the image of the database as it was at that time. In your case, if you keep just ss_day_5, it will not contain the image of the data as it was at ss_day_1.
If you need to keep 5 days of backups, it is better to use the automatic backup feature and choose a retention of 5 days. Automatic backups are continuous and incremental: there is a full snapshot at the beginning of the retention period, and then only changes are backed up. This uses less backup space and lowers the cost for your cluster. And because those changes are streamed continuously, you can also do a point-in-time restore (PITR) to the exact second in the past (within the retention period), which you cannot do with manual snapshots.
Check the docs for a comparison of the two backup methods.

Creating database snapshots in Oracle 12

I have a lengthy daily process on an Oracle database that takes place every evening. I would like to:
1. Take a snapshot of the database at a certain point in the middle of the daily process, without interrupting it for long.
2. Query the snapshot to update a data warehouse database.
3. Drop the snapshot after pulling the necessary data.
I found the link below on the Oracle website; it describes exactly what I need and calls it a copy-on-write snapshot.
https://www.oracle.com/technetwork/database/features/availability/rman-fra-snapshot-322251.html
The problem is that I could not find any help on creating such snapshots, as all search results for "snapshots" relate to materialized views, which were apparently called snapshots in previous releases.
Is it possible to create a point-in-time version of a database in a short period of time (not backup/restore) in order to use it for data warehousing?
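One Oracle technique that resembles this workflow is a Data Guard snapshot standby; this is a suggestion on my part rather than necessarily the copy-on-write approach the linked article describes, and it assumes an existing physical standby with flashback logging enabled:

    -- On the standby: stop redo apply, convert, and open it read-write.
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;
    ALTER DATABASE OPEN;

    -- Query the frozen image for the warehouse load, then revert
    -- (SHUTDOWN/STARTUP are SQL*Plus commands, not SQL statements).
    SHUTDOWN IMMEDIATE;
    STARTUP MOUNT;
    ALTER DATABASE CONVERT TO PHYSICAL STANDBY;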

MSSQL backup strategy for Disaster Recovery

Our database is MSSQL, and we are currently using an Availability Group with failover across a multi-node cluster, so the idea of redundancy and backup is already there.
All of our current servers are in the same location; imagine a scenario where an earthquake takes out the entire hosting facility: then we are sitting ducks.
I'm exploring a disaster recovery (DR) strategy to keep another backup at a different location, so that when this happens I can bring back the entire database from the DR backup set with minimum downtime, with the data guaranteed to be up to the minute if possible.
I've read around the Microsoft docs but I don't really see one that talks about this in detail.
I need a true backup that is up to the minute. Do I need to do a full backup (once every day) along with transaction log backups (one every minute) and then save them to the other geographical location? Can you point me to a guide or best-practice documentation on how to achieve this?
I'm exploring a disaster recovery (DR) strategy to keep another backup at a different location, so that when this happens I can bring back the entire database from the DR backup set with minimum downtime, with the data guaranteed to be up to the minute if possible
The following are your options, since you already have Availability Groups in place.
Multi-subnet WSFC: you add an additional node to the WSFC that acts as a DR replica in the Availability Group. Consider it another copy of the secondary you already have in the same location, except placed in a different geographical location. Since it is a multi-subnet WSFC, it requires careful quorum configuration.
Log shipping: a simpler solution than a multi-subnet WSFC, and easy to administer. It takes log backups on a schedule from the primary replica in your current Availability Group and restores them to a secondary replica. You can have multiple secondary replicas, each in a different geographical location, depending on network bandwidth. (A sketch of the underlying commands follows.)
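A minimal sketch of what log shipping automates (database name and UNC path are placeholders):

    -- On the primary: take a scheduled log backup to a share at the DR site.
    BACKUP LOG [SalesDB]
      TO DISK = N'\\dr-site\logship\SalesDB_202401011200.trn'
      WITH COMPRESSION, CHECKSUM;

    -- On the DR secondary (database previously restored WITH NORECOVERY):
    RESTORE LOG [SalesDB]
      FROM DISK = N'\\dr-site\logship\SalesDB_202401011200.trn'
      WITH NORECOVERY;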
I need a true backup that is up to the minute. Do I need to do a full backup (once every day) along with transaction log backups (one every minute) and then save them to the other geographical location? Can you point me to a guide or best-practice documentation on how to achieve this?
This post at DBA.SE should help you.

Copy database (all objects) to other database every 5 seconds

I have two databases - a CRM database (Microsoft Dynamics crm) and a company database.
These two databases are different.
How can I copy the company database (all objects) into the CRM database every 5 seconds?
Thanks
The cheapest way to do this (and one of the easiest) is a method called log shipping. It can, on a schedule of even every 5 minutes or so, copy the log backup file to another machine and restore it to the target database. Ignore anyone who claims it can be done every minute: it takes a little while to close the log backup file, move it, and reapply it, but a 5-10 minute window is achievable.
You can also use mirroring, transactional replication, and other high-availability solutions, but there is no easy way to keep two machines in sync.
Do you need to duplicate the data? Can't you query the source system directly if they're on the same server?
Else this might point you in the right direction: Keep two databases synchronized with timestamp / rowversion

SQL Server backup only those stored procedures whose object definition is modified

I have a scenario where I need to create a backup of a database that contains huge amounts of data (GBs). Once the full backup is done, I am trying to optimize things using a partial backup, or by backing up only those stored procedures whose object definition has been modified.
One way I can think of is comparing by object definition date, say over the past 7 days; a query along these lines is sketched below.
Can you please let me know a better way to achieve this?
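As an illustration of the "compare by modification date" idea, a query along these lines pulls the definitions of recently changed procedures (sys.objects and sys.sql_modules are standard catalog views; the 7-day window is from the question):

    -- Stored procedures whose definition changed in the last 7 days.
    SELECT o.name,
           o.modify_date,
           m.definition
    FROM sys.objects AS o
    JOIN sys.sql_modules AS m
      ON m.object_id = o.object_id
    WHERE o.type = 'P'                               -- procedures only
      AND o.modify_date >= DATEADD(DAY, -7, GETDATE());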
You do not back up databases that way. You back up the data in the database first and foremost; all objects are backed up, and you can't choose to skip a table either. You do a full backup on a schedule (such as once a week), then differential backups nightly, then transaction log backups roughly every 15 minutes (see the sketch below). Frankly, the fact that you are asking this question tells me your company needs to hire a DBA to protect its data.
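A minimal sketch of that schedule (database name and paths are placeholders; in practice each statement runs as a SQL Server Agent job):

    -- Weekly full backup.
    BACKUP DATABASE [AppDb]
      TO DISK = N'D:\Backups\AppDb_full.bak'
      WITH INIT, COMPRESSION, CHECKSUM;

    -- Nightly differential backup.
    BACKUP DATABASE [AppDb]
      TO DISK = N'D:\Backups\AppDb_diff.bak'
      WITH DIFFERENTIAL, COMPRESSION, CHECKSUM;

    -- Transaction log backup, roughly every 15 minutes.
    BACKUP LOG [AppDb]
      TO DISK = N'D:\Backups\AppDb_log.trn'
      WITH COMPRESSION, CHECKSUM;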
Next, stored procedures should be in source control like any other code. You can tell what the current version is the same way you tell the current version of any code, and if you need to restore only one, you can do it from the source control repository. This does require processes that prevent developers from pushing code to servers beyond dev, with only the build team or managers who have the rights pushing, and only from the source-controlled version.
Before optimizing anything in your backups, you should really know your Recovery Point Objective and Recovery Time Objective, meaning basically how long your system can be down and how much data you can lose. That is what you should use to plan your backups.
