I want to implement an automated backup system for my site's SQL Server 2005 database that will back up nightly to Amazon's S3 service. But since S3 charges for both storage and bandwidth, I would like to minimize the size of the files that I transfer. What is the best way to achieve this?
I should clarify that I'm not really talking about compression, which is pretty straightforward, but about backup strategy: whether to do differential backups all the time, whether I need to copy transaction logs, and so on.
Differential backups will be smaller than full backups, of course. However, you should consider the restoration side as well: you'll need your last full backup plus your most recent differential to perform the restore, which can add up to a lot of bandwidth/transfer time. One option is to perform a full backup weekly and differentials daily (or a similar schedule).
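A minimal T-SQL sketch of such a schedule (the database name and paths are placeholders; note that SQL Server 2005 has no native backup compression, so the zip step others mention still applies):

    -- Weekly (e.g. Sunday): full backup
    BACKUP DATABASE MyDatabase
    TO DISK = N'D:\Backups\MyDatabase_full.bak'
    WITH INIT, CHECKSUM;

    -- Daily: differential backup (everything changed since the last full backup)
    BACKUP DATABASE MyDatabase
    TO DISK = N'D:\Backups\MyDatabase_diff.bak'
    WITH DIFFERENTIAL, INIT, CHECKSUM;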
As for transaction logs, it depends on what granularity you're looking for in restoring your data. If restoring to the last full or differential backup is sufficient, then you don't need to worry about taking transaction log backups. If that's not the case, then transaction log backups will be necessary.
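If you do need log backups, the command is similar; a sketch with the same placeholder names (this requires the FULL or BULK_LOGGED recovery model):

    -- Periodically between differentials: transaction log backup
    BACKUP LOG MyDatabase
    TO DISK = N'D:\Backups\MyDatabase_log.trn'
    WITH INIT, CHECKSUM;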
Either use a commercial product to compress the backups, like Red Gate SQL Backup Pro, or just zip-compress them after you're done.
Write a batch or PowerShell script that finds the file(s) created in the past day and zips them up. Then FTP, or do whatever else you need to do.
Here's a PowerShell example I just came across.
I'm using Microsoft SQL Server and storing a lot of data daily. I'm taking a full backup daily, which takes more than 5 hours to complete. Is there any way to complete my backup process within an hour? Or an alternative approach?
Daily differential backups (only changes are backed up) plus CloudBerry's software should do the trick :)
Definitely consider taking differential backups. Depending on how much data is changing you may be able to take differential backups every hour or so, giving you a quick way to restore the hourly backup if needed. Perhaps complete a full backup once a week or every few days.
There are several considerations, and I suggest you research them well to see whether differential backups will meet your specific needs. For example (source):
If your databases are small or compress effectively enough so that your full and transaction log backups fall within storage and SLA limits, differential backups are unnecessary.
If your databases change a lot between backups, you might as well perform full backups.
If the changes to your database are few and the transaction log backups would take longer to restore than the differential backups, using differential backups might make sense and is worth investigating.
Other things to consider:
Are you currently taking transaction log backups? Should you be? What recovery model are you using? (A quick way to check is sketched after this list.)
If you decide to do differential backups, be aware of things such as the "Copy only backup" option in SQL Server Management Studio: a copy-only full backup doesn't reset the differential base, which has implications for restoring data in a disaster recovery situation.
In essence, you should educate yourself on SQL Server backup and restore before you make any changes.
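Since the recovery model question comes up first, here's a minimal query to check it (sys.databases is the standard catalog view in SQL Server 2005 and later):

    -- Shows each database's recovery model (FULL, SIMPLE, or BULK_LOGGED)
    SELECT name, recovery_model_desc
    FROM sys.databases;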
To improve the situation and speed up the backup, you may consider differential and transaction log backups. You can take them through Management Studio, but to schedule them you will need SQL Server Agent or a third-party tool like CloudBerry.
The alternative to taking full backups daily is taking incremental, progressive incremental, or reverse incremental backups.
An incremental backup copies only the changes since the last successful backup.
Progressive incremental keeps only one full backup copy per server and takes incremental backups for the rest of the cycle.
Reverse incremental, supported by Veeam and DPM, has its own mechanism: it transfers only incremental changes from the server but stores each copy as a synthetic full, which allows quick restoration in case of disaster.
To configure these kinds of backups you can go for any enterprise-level backup tool, but if you focus on cost effectiveness I would recommend CloudBerry Backup.
I tried CloudBerry Cross-Platform Cloud Backup, which provides a simple GUI to manage backups and restores, and a cloud storage account comes bundled with the software. You can also use CloudBerry Explorer to view, move, and manage your data on the cloud storage account.
It also provides enterprise backup features like job scheduling, a CLI, compression, and encryption.
I have SQL Server 2012 and a separate backup drive, but it is really full. I want to move all the backups to a network drive after they are done. My concern is with the differential and log backups, as I don't know whether they need access to the full backup file in order to run, or how this works in SQL Server.
Is it safe to move the full backup to another drive and then execute the daily differential backups?
Let us first remember the golden rule for restoring:
‘After restoring the full database backup, restore the latest differential database backup and all the transaction log backups after that to get the database to its current state.’
So if you have only one full backup and then keep taking differential backups, the size of each differential backup will keep growing every day. In case of disaster you may have to find the old full backup and then restore the latest differential backup.
So in theory, taking one full backup, moving it somewhere else, and then continuing to take differential backups does work.
However, there is a chance that your full backup gets corrupted or that a differential backup has issues. In that case you will wish you had another full backup.
It is always a good idea to take full backups at regular intervals. I take a full backup of my databases every day when they are relatively small, and every other day when they are very large.
Here is the blog post where I have written about the backup timeline:
https://blog.sqlauthority.com/2009/07/14/sql-server-backup-timeline-and-understanding-of-database-restore-process-in-full-recovery-model/
To answer the question you asked: SQL Server doesn't access the full backup itself when it makes differential and log backups. Instead, it keeps track of what needs to be backed up in the database itself.
The short version is that for differential backups, the notion of what's changed (and therefore needs to be backed up) is tracked in what are called differential change map pages; for log backups, it's tracked in the transaction log itself.
The longer (and IMO more interesting) version is this excellent article by Paul Randal.
To recover, you need to restore the full backup before the differential. The normal recovery sequence is: restore the latest full backup, then the latest differential backup taken after that full, then the log backups taken since the differential, in order.
The backup files obviously need to be available in order to restore regardless of where you store them. If you were to lose the local full backup, you could not restore at all. You might consider copying the full backup to the network share as an extra layer of safety.
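To make that sequence concrete, here is a hedged T-SQL sketch, with placeholder file names and a UNC path standing in for the network share:

    -- 1. Latest full backup
    RESTORE DATABASE MyDatabase
    FROM DISK = N'\\backupserver\share\MyDatabase_full.bak'
    WITH NORECOVERY;

    -- 2. Latest differential taken after that full backup
    RESTORE DATABASE MyDatabase
    FROM DISK = N'\\backupserver\share\MyDatabase_diff.bak'
    WITH NORECOVERY;

    -- 3. Each log backup taken since the differential, in order
    RESTORE LOG MyDatabase
    FROM DISK = N'\\backupserver\share\MyDatabase_log.trn'
    WITH NORECOVERY;

    -- 4. Bring the database online
    RESTORE DATABASE MyDatabase WITH RECOVERY;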
I have been building an application to back up SQL Server databases using VDI and SMO.
I am unsure whether I should capture the last backup time of a database along with the LSN, or simply capture the LSN.
More precisely, I wanted to capture the last backup time to detect whether the end user is using other software to back up their SQL databases with the copy-only method, since a copy-only backup doesn't update the LSN chain or truncate anything, but does update the last backup time.
So, if I ignore the last backup time of a database and simply capture the LSN, is that going to cause any problems for me, in terms of recovery/backup, if other third-party software is in play?
Edit: I am a backup application developer, and I would like to know: should I capture the last backup time of the database my end user wants to back up, or simply capture the LSN to maintain the log chain?
If you're writing your backup software for the purposes of ensuring recoverability (that is, you're not taking copy-only backups yourself), don't worry about any copy-only backups taken by others. As you've said, copy-only backups don't affect the backup chain.
The only thing that would mess you up is if someone else is taking a non-copy-only backup interleaved with yours. In that case, you could get into a situation where one of their backups relies on one of yours (or vice versa) and the restore scenario is complicated at a point where it really shouldn't be (i.e. the database is down and you need to recover it).
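If it helps, one way to watch for interleaved backups is to read the backup history in msdb; a sketch with a placeholder database name (backupset and its is_copy_only column exist from SQL Server 2005 onwards):

    -- Recent backups for one database, including whether each was copy-only
    SELECT b.type,               -- D = full, I = differential, L = log
           b.is_copy_only,
           b.backup_finish_date,
           b.first_lsn,
           b.last_lsn
    FROM msdb.dbo.backupset AS b
    WHERE b.database_name = N'MyDatabase'
    ORDER BY b.backup_finish_date DESC;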
I have a SQL Server 2005 database that is backed up nightly. The backup consists of:
A FULL backup of the database.
A backup of the transaction log.
These are currently two separate jobs.
The log is huge and I'd like to set things up so that:
the database is backed up in full nightly
the log is set such that I can recover the database from any point between one backup and the next.
How can I set this up so that the log files are manageable? I suspect that the log has never been shrunk, as the log is huge.
You are currently implementing the FULL recovery model, from the sound of things. This will allow you to restore to a point in time, provided that you have a transaction log backup that covers the desired point in time (after the full backup).
In order to reduce the size of your transaction log file, you should increase the frequency of your transaction log backups; I would suggest hourly. Once you have gauged the actual usage of your log file, you can then shrink it to a more suitable size. The key point to note here is that once a transaction log backup has completed, the inactive portion of the log file becomes available for reuse. The reason a transaction log file grows continuously is that transaction log backups are either not being taken at all, or not taken frequently enough. See the sketch below.
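A hedged sketch of that housekeeping (the logical log file name and the 4096 MB target are placeholders you'd adjust to your gauged usage):

    -- See how full each database's log file actually is
    DBCC SQLPERF(LOGSPACE);

    -- Once regular log backups are in place, shrink the oversized log a single time
    USE MyDatabase;
    DBCC SHRINKFILE (MyDatabase_log, 4096);  -- target size in MB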
I would also suggest that you consider performing a mix of DIFFERENTIAL and FULL Backups in order to reduce the collective size of your backed up data. An example schedule would be a weekly FULL Backup, say every Sunday, with daily DIFFERENTIAL backups.
I hope what I have detailed makes sense. Please feel free to contact me directly and I will happily assist you in deploying an appropriate backup strategy for your environment.
Essential References:
How to Stop the Transaction Log File from Growing Unexpectedly
Backing Up and Restoring Databases in SQL Server
One of the things I find with backups is that people typically don't run them frequently enough, especially when it comes to log file backups. And it sounds like you're correct that the log file isn't being truncated regularly (which means you're likely wasting premium disk space [1]). More importantly, that's leaving you completely exposed from a recoverability standpoint.
Happily though, getting things up and running as you need them isn't so hard. In fact, I'd recommend the following three videos as they should give you the background info you need, and then the step-by-step instructions you'll want to follow to get everything working correctly:
http://www.sqlservervideos.com/video/logging-essentials
http://www.sqlservervideos.com/video/sql2528-log-files
http://www.sqlservervideos.com/video/sqlbackup-best-practices
[1] Maximize Storage Performance: http://www.sqlmag.com/Article/ArticleID/100893/sql_server_100893.html
What you are doing is effectively a SIMPLE recovery model backup, with the added disadvantage of not shrinking the log. There is no point in backing up both at the same time. If you're doing a full backup, you can just truncate the log.
If you want to be able to restore to any point in time, you will have to do a full backup once a day (say) and back up the log several times during the day. See http://msdn.microsoft.com/en-us/library/ms191429(SQL.90).aspx
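For the point-in-time part specifically, the log restore accepts a STOPAT option; a minimal sketch, with placeholder names and timestamp:

    RESTORE DATABASE MyDatabase
    FROM DISK = N'D:\Backups\MyDatabase_full.bak'
    WITH NORECOVERY;

    -- Replay the log only up to the chosen moment
    RESTORE LOG MyDatabase
    FROM DISK = N'D:\Backups\MyDatabase_log.trn'
    WITH STOPAT = '2009-06-01T14:30:00', RECOVERY;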
I currently have a 200+ GB database that uses the DB2 built-in backup for a daily backup (and hopefully not restore, lol). Since that backup now takes more than 2.5 hours to complete, I am looking into a third-party backup and restore utility. The version is 8.2 FP14, but I will be moving soon to 9.1, and I also have some 9.5 databases to back up and restore. What are the best tools that you have used for this purpose?
Thanks!
One thing that will help is going to DB2 version 9 and turning on compression. The size of the backup will then decrease (by up to 70-80% at the table level), which should shorten the backup time. Of course, if your database is continuously growing you'll soon run into problems again, but then data archiving might be the thing for you.
Before looking at third-party tools, which I doubt would help much, I would consider a few optimizations:
1) Have you used REORG on your tables and indexes? This would compact the information and minimize the number of pages used;
2) If you can, backup on multiple disks at the same time. This can easily be achieved by running db2 backup db mydb /mnt/disk1 /mnt/disk2 /mnt/disk3 ...
3) DB2 should do a good job of fine-tuning itself, but you can always experiment with the WITH num_buffers BUFFERS, BUFFER buffer-size, and PARALLELISM n options. But again, usually DB2 does a better job on its own;
4) Consider performing daily incremental backups and a full backup once a week, on Saturday or Sunday (a sketch of the commands follows this list);
5) UTIL_IMPACT_PRIORITY and UTIL_IMPACT_LIM let you throttle the backup process so that it doesn't affect your regular workload too much. This is useful if your main concern is not the time per se, but rather the performance of your data server while you back up;
6) DB2 9's data compression can truly do wonders when it comes to reducing the volume of data that needs to be backed up. I have seen very impressive results and would highly recommend it if you can migrate to version 9.1 or, even better, 9.5.
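For 4), a minimal sketch of the commands involved (database name and path are placeholders; incremental backups require the TRACKMOD configuration parameter to be on, followed by one full backup before the first incremental will work):

    # Enable change tracking, then take the baseline full backup
    db2 update db cfg for mydb using TRACKMOD ON
    db2 backup db mydb to /mnt/backup

    # Daily cumulative incremental (changes since the last full backup)
    db2 backup db mydb incremental to /mnt/backup

    # Or a delta backup (changes since the last backup of any kind)
    db2 backup db mydb incremental delta to /mnt/backup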
There really are only two ways to make backup, and more importantly recovery, run faster:
1. back up less data, and/or
2. have a bigger pipe to the backup media
I think you've already gotten a lot of suggestions on how to reduce the amount of data that you back up. Basically, you should be creating a backup strategy that relies on relatively infrequent full backups and much more frequent backups of the data changed since the last full backup. I encourage you to take a look at the "Configure Automatic Maintenance" wizard in the DB2 Control Center. It will help you with creating automatic backups and with other utilities, like the REORG that Antonio suggested. Things like compression obviously can help, as the amount of data is much lower. However, not all DB2 editions offer compression; for example, DB2 Express-C does not. Frankly, doing compression on a 200 GB database may not be worthwhile anyway, and that is precisely why free DBMSs like DB2 Express-C don't offer compression.
As far as opening a bigger pipe for your backup, you first have to decide whether you are going to back up to disk or to tape. There is a big difference in speed (obviously disk is a lot faster). Second, DB2 can parallelize backups: if you have multiple devices to back up to, it will back up to all of them at the same time, i.e. your elapsed time will be a lot less, depending on how many devices you can throw at the problem. Again, the DB2 Control Center can help you set this up.
Try High Performance Unload (HPU). This was a standalone product from Infotel and is now available as part of Optim Data Studio; there's a posting about it here: https://www.ibm.com/developerworks/mydeveloperworks/blogs/idm/date/200910?lang=en
It's not a "third-party" product, but everyone I have ever seen using DB2 uses Tivoli Storage Manager to store their database backups.
Most shops will set up archive logging to TSM so you only have to take the "big" backup every week or so.
Since it's also an IBM product you won't have to worry about it working with all the different flavors of DB2 that you have.
The downside is it's an IBM product. :) Not sure if that ($) makes a difference to you.
I doubt that you can speed things up using another backup tool. As Mike mentions, you can add TSM to the stack, but that will hardly make the backup run any faster.
If I were you, I'd look into where the backup files are stored: are they on the same disk spindles as the database itself? If so, see if you can store the backup files on a storage area that isn't contended for access during your backup window.
And consider using incremental backups for daily backups, and then a long full backup on Saturdays.
(I assume that you are already running online backups, so that your data aren't unavailable during backup.)
A third party backup package probably won't help your speed much. Making sure that you are not doing full backups every 2 hours is probably the first step.
After that, look at where you are writing your backup to. Is it a local drive, instead of a network drive? Are the spindles used for anything else? Backups don't involve a lot of seek activity, but do involve a lot of big writes, so you probably want to avoid RAID 5 and go for large stripe sizes to help maximize throughput.
Naturally, you have to do full backups sooner or later, but hopefully you can find a window when load is light and you can live with a longer time period between backups. Do your full backup during a 4-6 hour period when the normal incrementals are off and then do incrementals based off of that the rest of the time.
Also, until you get your backup copied to a completely separate system, you really aren't backed up. You'll have to experiment to figure out whether you're better off compressing it before, during, or after sending.