I am working with a data warehouse with SQL Server 2012 and was wondering what would be the most optimized, automated procedure for a backup/restore strategy.
Current observations and limitations:
1) Cannot use transaction logs as it would affect my load performance - datasets are potentially huge with large transactions
2) Current plan is to do full backup every week and differential backup every day
I am not sure when DML operations will happen as it depends on my application's usage, but is there a way to just track the NUMBER of changes to a database that would trigger a differential backup? A way that would not affect performance? I do not want to be taking unnecessary differential backups.
Would Change tracking be a good solution for my scenario? Or would there be overhead involved? I do not need to know the actual data that was changed, just the fact that it was changed by a certain amount.
Thanks in advance!
Well, there's this ( http://www.sqlskills.com/blogs/paul/new-script-how-much-of-the-database-has-changed-since-the-last-full-backup/ ). I'm just trying to figure out what problem you're trying to solve. That is, if you find that the size is below some threshold, it will be (by definition) cheap to do.
It all depends on your DWH configuration.
1. Is your DWH database partitioned? If yes, It would be easier to do the daily db backup(diff backup) for the current partition ONLY. It's much more smaller set of data to be backed up.
If not, Current plan is to do full backup every week and differential backup every day is the only way since you cannot use transaction log file.
You could also try 3rd party disk (block) level backup software (i.e. Doubletake)....
Hope it helps.
You seem to have a mistaken notion of what a differential backup is. Don't worry; it's common.
When you say things like "track the number of changes to a database that would trigger a differential backup", it implies that you think that a differential backup gets all of the changes since the latest full or differential.
However, a differential backup gets all of the data that has changed since the last full backup only. So, you'd expect the size of subsequent differential backups to get larger and larger. For example, let's say you take a full backup on Sunday and a differential backup every other day. You'd get something like:
Monday: All of the data changed since Sunday's backup.
Tuesday: All of the data changed since Sunday's backup (including Monday's data)
Wednesday: All of the data changed since Sunday's backup (including Tuesday's data)
etc
Additionally, you'd only ever restore at most one differential backup if/when you need to restore your database. For instance, if your database crashed on right before Thursday's backup, you'd restore your last full backup (from Sunday in my example), then Wednesday's differential, and you're done.
As for when to schedule it, that's typically dictated by the rhythm of your business. For instance, you might decide to take a backup just before you kick off your ETL or just after. Doing it during doesn't make much sense as you'd have an inconsistent (with respect to your ETL process) database if you ever need to restore it.
Related
I've recently taken on the database administration of a few SQL servers varying from SQL Server 2005 to 2014, where many of the DB's are in Full recovery mode, however no good ongoing backup maintenance plans were ever setup.
It seems to me that the previous DBA would only deal with transaction log files when they got out of control and fill up the hard drive. So i'd like to change this and fix the issue once and for all. I've been doing some reading and think I have a decent understanding of what need to be done, so i'd like to validate my understanding as well as ask a few question to clarify a few points that I still don't fully get.
So based on my understanding to date i would need to create a maintenance plan which starts with a Full Backup. I still need to talk to management to figure out things like RTO, acceptable data loss, etc so let's assume for this example that we'll do Full Backup's on Sunday.
Next I would add to this maintenance plan Differential backups every night... so Monday to Saturday. I realize that this could also be Full backup's or run the differentials more frequently, but again this is just as an example to make sure i'm understanding things correctly.
Now as for the transaction log backups. I get that i would need to back these up and truncate the log file to prevent it from continually growing and getting out of control. I don't know if there are any specific recommendation for how often to back this up, but i've seen 15 minutes suggested. I guess this would more so fall under the acceptable data loss window. Is that correct?
So the other thing that i've discovered is that when you backup the transaction log file with truncation, that if the log file has already grown out of control that it doesn't shrink the file. I've also read that it isn't good to shrink these files, at least on regular basis because once you shrink it, it would need to grow again and this will cause fragmentation and performance issues.
Now since I am currently in a situation that the files have already grown out of control, i'd assume that i should in fact shrink the log files this one time once i've but my maintenance in place. Is that assumption correct?
Also once i Shrink the transaction log file this one time, are there any maintenance task that i should be running to avoid performance issue due to shrinking the log files?
One other question that i was wondering about would be with respect to point in time recovery. So let's say I take a full backup at 5:00 AM and i also take a transaction log backup every 15 minutes. I get alerted that at 6:18 AM something has gone wrong (let's say a table was deleted). So i know i can restore by Full backup that happened at 5:00 AM and leave it in NO RECOVERY mode and restore all of the Transaction log backup from 5:15 AM to 6:15 AM, but here is what i'm interested in...since i have my DB in full recovery mode, is it possible to somehow use my existing transaction log file (not the backups) to roll forward all transaction between 6:15 and 6:17 just before the table was deleted? If so how would you do this? I guess this obviously wouldn't work in the case of you loosing the hard drive with your transaction log files, or your server exploding...but in a case like i've outlined is it do able?
Thanks
I would recommend doing a full backup after everyone stopped working, e. g. at 10 p.m. (if that is the case), not in the morning shortly before people start working. Just in order to give it enough time to run.
Personally, I prefer doing daily full backups instead of incremental backups if the database is not too big to save backups for, say, 14 days. I feel better to rely on less files. If database and full backups are too big, incremental backups might be the better choice.
As you said: How many transaction log backups you create during the day depends on the acceptable data loss window. In an environment > 5 people working on the system (just as a gut feeling) I would configure them to run all 15 minutes, on very big systems maybe even more.
After the first transaction log backup, you might want to shrink the LOG file ONCE.
I think it's not necessary to run any optimizations after a log shrink.
As far as I know it's not possible to restore transactions between 06:15 and 06:17.
When activating the transaction log backups, keep in mind that the first transaction log backup will be quite big (around the size of the current, large log). Ensure to have enough space on disk until you shrink the log file and delete the first transaction log (usually done automatically within the maintenance plan, e. g. after 14 days.).
I have been building an application to Backup SQL Databases using VDI using SMO.
I have a doubt whether i should be capturing the last backup time of a database along with the LSN or simply, just capture the LSN.
More precisely, i wanted to take Last Backup Time to know that if end-user is using another software to backup their SQL DB's using Copy-only method since copy-only method doesn't update LSN or truncate anything but does update the Last backup time.
So, if i am ignoring the last backup time information of any DB and simply capturing the LSN, is it going to cause any problems for me? in terms of recovery/backup if there are other 3rd party softwares into play?
:::: Edit :::::
I am a backup application developer and i would like to know, should i be capturing the Last Backup time of the database that my end-user would like to backup ? or just simply capture the LSN to maintain the Log chain?
If you're writing your backup software for the purposes of ensuring recoverability (that is you're not taking copy-only backups yourself), don't worry about any copy-only backups that have been taken by others. As you've said, copy-only backups don't affect the backup chain.
The only thing that would mess you up is if someone else is taking a non-copy-only backup interleaved with yours. In that case, you could get into a situation where one of their backups relies on one of yours (or vice versa) and the restore scenario is complicated at a point where it really shouldn't be (i.e. the database is down and you need to recover it).
I have a nightly job that does a bunch of inserts. Since I have a full recovery model, this increases my transaction log size.
Currently I have my log file big enough to accommodate these transactions, but the issue is that the transaction log is mostly empty throughout the day.
Is it an issue (besides disk space) to have a huge (mostly empty) transaction log?
I'm thinking about switching the database to simple recovery before the job, running the job and then switching it back to full recovery. I can have the transaction logs just not be backed up until our nightly differential backup comes around and then i can start the transaction log backup again.
Suggestions?
I'd do nothing. Or have SIMPLE permanently.
Changing the recovery model back to full will require a full backup anyway to preserve integrity later on. You'll have a gap in your LSN chains otherwise.
You've mentioned differential backup, so I assume your full is not each night.
So, putting this together means you'll use more disk space for your full backup than for the LDF file.
I'd keep it FULL and, if push comes to shove, force log backups during the nightly job to keep the log size small. Also, is OK to shrink the log if you really have to, ie. it grew once due to a one time operation and it won't grow back. The fragmentation concerns are more valid for data files, log files have a totally different structure and allocation pattern. Just don't get into the habit of constant shrinking.
I'm saying this because if you do have a backup restore strategy in place already it seems silly to me to increase the window of data loss from 'last log backup' to 'last differential backup'. We're talking a change from 10-30 minutes (typical log backup frequency) to 3-24 hours (typical differential frequency). You won't be able to do differentials as often as log backups because differentials are growing in size (starting with second differential after full each differential is at least as big as previous differential). Log backups only backup the log since last backup, so they stay relative constant in size. Also with a SIMPLE mode you won't be able to do a log tail backup attempt and recover all data in case of crash.
It just seems you're trading off a lot more that you gain with reducing the log file size.
I have a SQL Server 2005 database that is backed up nightly. There backup consists of:
FULL backup of the database.
backup of the transaction log.
These are currently two separate jobs.
The log is huge and I'd like to set things up so that:
the database is backed up in full nightly
the log is set such that I can recover the database from any point between one backup and the next.
How can I set this up so that the log files are manageable? I suspect that the log has never been shrunk, as the log is huge.
You are currently implementing the FULL Recovery Model from the sound of things. This will allow you to restore to a point in time provided that you have a transaction log backup that covers the desired point in time (post full backup).
In order to reduce the size of your required transaction log file, you should look to increase the frequency of your transaction log backups. I would suggest hourly. Once you have gauged the actual usage of your log file, you can then look to shrink it to a more suitable size. The key point to note here is that once a transaction log backup has been completed, the inactive portion of the log file becomes available for use once again. The reason why a transaction log file grows continuously is if the transaction log backups are either, not being taken at all or their frequency is not sufficient.
I would also suggest that you consider performing a mix of DIFFERENTIAL and FULL Backups in order to reduce the collective size of your backed up data. An example schedule would be a weekly FULL Backup, say every Sunday, with daily DIFFERENTIAL backups.
I hope what I have detailed makes sense. Please feel free to contact me directly and I will happily assist you in deploying an appropriate backup strategy for your environment.
Essential References:
How to stop the transaction log
file from growing enexpectedly
Backup and Restoring Databases in
SQL Server
One of the things I find with backups is that people typically don't run them frequently enough - especially when it comes to log file backups. And it sounds like you're correct, that the log file isn't being truncated regularly (which means you're likely wasting premium disk space [1]). More importantly though, that's leaving you completely exposed from a recoverability standpoint.)
Happily though, getting things up and running as you need them isn't so hard. In fact, I'd recommend the following three videos as they should give you the background info you need, and then the step-by-step instructions you'll want to follow to get everything working correctly:
http://www.sqlservervideos.com/video/logging-essentials
http://www.sqlservervideos.com/video/sql2528-log-files
http://www.sqlservervideos.com/video/sqlbackup-best-practices
1 Maximize Storage Performance: http://www.sqlmag.com/Article/ArticleID/100893/sql_server_100893.html
What you are doing is effectively a SIMPLE mode backup with bonus disadvantage of not shrinking the log. There is no point to back up both at the same time. If you're doing a full backup, you can just truncate the log.
If you're going to be able to restore to any point of time, you will have to do a full backup once a day (say) and back up the log few times during the day. See http://msdn.microsoft.com/en-us/library/ms191429(SQL.90).aspx
I want to implement an automated backup system for my site's SQL Server 2005 DB that will backup nightly to Amazon's S3 service. But since S3 charges both for space and bandwidth used, I would like to minimize the size of the files that I transfer in. What is the best way to achieve this?
I should clarify that I'm not really talking so much about compression, which is pretty straightforward, but concerning backup strategies like whether to do differential backups all the time, whether I need to copy transaction logs, etc.
Differential backups will be smaller than full backups, of course. However, you should consider the restoration side as well. You'll need your last full backup as well as your differentials to perform the restore which can add up to a lot of bandwidth/transfer time for a restore. One option is to perform a full backup weekly and do differentials daily (or a similar type of schedule).
As for transaction logs, it depends on what granularity you're looking for in restoring your data. If restoring to the last full or differential backup is sufficient, then you don't need to worry about taking transaction log backups. If that's not the case, then transaction log backups will be necessary.
Either use a commercial product do compress the backups like Red Gate Backup Pro or just zip-compress it after you're done.
Write a .batch script or powershell script that will find the file/s created in the past day and zip them up. Then FTP or whatever you have to do.
A powershell example that I just came across.