Backup strategy for database and file system

My system depends on both a database and a file system, and the file system is more than 100 GB. When I take a backup of both at the same time, the database backup completes within minutes while the file system backup takes hours. In the meantime real users are working on the system, adding and deleting files, so the file system backup ends up inconsistent with the database backup.
Could anyone tell me how to take a backup of both the database and the file system so that both end up in a mutually consistent state?

Related

Backup Transaction Log since last full backup (Striped)

I am attempting to do a poor man's log shipping, manually. I need to move a rather large database from one host to a VM in Azure. The database is currently 16 GB, and we need to do the switch-over within an hour.
Note that replication, mirroring, and log shipping are not options due to issues we have run into; this manual process is what I need to achieve.
What I am attempting is a full backup of the database a few weeks before, striping the .bak files (so that if a copy fails, it is not the full 16 GB we need to re-copy), and then copying the files of that full backup over a period of a few days.
For my full backup, I am trying to span the backup file over 15 files:
BACKUP DATABASE MyProductionDatabase
TO
DISK='E:\BackupTrial\V5_DEV_FULL_01.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_02.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_03.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_04.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_05.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_06.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_07.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_08.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_09.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_10.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_11.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_12.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_13.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_14.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_15.bak'
WITH FORMAT,
MEDIANAME = 'V5_DEV_FullBackup',
MEDIADESCRIPTION = 'Striped media set for V5_DEV database';
GO
Then I restore that database on the new VM in Azure (a virtual machine with SQL Server on it, as I am not sure I can go directly from a .bak file to Azure SQL).
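For reference, the restore of that striped set might look like this minimal sketch (the logical file names in the MOVE clauses are assumptions; check them first with RESTORE FILELISTONLY):
RESTORE DATABASE MyProductionDatabase
FROM
DISK='E:\BackupTrial\V5_DEV_FULL_01.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_02.bak',
-- ...all 15 stripes must be listed; the restore fails if any stripe is missing
DISK='E:\BackupTrial\V5_DEV_FULL_15.bak'
WITH NORECOVERY,  -- stay in RESTORING state so later log backups can be applied
MOVE 'V5_DEV_Data' TO 'F:\Data\V5_DEV.mdf',
MOVE 'V5_DEV_Log' TO 'F:\Log\V5_DEV.ldf';
GO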
And then daily or weekly (the database has low usage), take log backups, copy them to the new host, and restore them there. So basically, manually keeping the databases in sync.
On go-live day, take a final log backup on the current host and copy that final file to the new host; then both DBs should be in sync, with only small daily copies along the way.
Then switch the site to use the new DB.
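The go-live step could look like this minimal sketch (the file name is an assumption; backing up the tail WITH NORECOVERY deliberately leaves the source database in a restoring state so no further changes slip in):
-- On the old host: back up the tail of the log
BACKUP LOG MyProductionDatabase
TO DISK = 'E:\BackupTrial\V5_DEV_TAIL.trn'
WITH NORECOVERY;  -- source goes into RESTORING state, by design
-- On the Azure VM, after all earlier log backups were restored WITH NORECOVERY:
RESTORE LOG MyProductionDatabase
FROM DISK = 'E:\BackupTrial\V5_DEV_TAIL.trn'
WITH RECOVERY;  -- brings the new copy online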
But I am stuck:
The MS example is:
BACKUP LOG AdventureWorks2012
TO MyAdvWorks_FullRM_log1;
GO
And it says:
This example creates a transaction log backup for the AdventureWorks2012 database to the previously created named backup device, MyAdvWorks_FullRM_log1.
I'm unsure what they mean by "previously created backup device". I have numerous filenames, because of the striping. How do I do a log file backup, of my full backup?
There may be other backup processes running on this database. Will this cause me issues with the log backups (might another full backup of the database clear my transaction log before I get my backup done)? How should my daily log backup look?
I'm unsure what they mean by "previously created backup device".
This means that in this example they previously created a logical backup device.
So instead of the syntax BACKUP ... TO DISK = ..., they use
BACKUP ... TO <device_name>
You can completely ignore this and back up the log to different files, as everyone does:
BACKUP LOG AdventureWorks2012
TO DISK = 'V:\backups\log\AdventureWorks2012_20190918.trn';
I have numerous filenames, because of the striping. How do I do a log file backup, of my full backup?
A log backup is not taken "of" a full backup.
First you create a "base" by taking a full backup (it doesn't matter whether you stripe it into many files or not); then you take log backups. Log backups are generally pretty small, so you don't need to stripe them.
There may be other backup processes running on this database. Will this cause me issues with the log backups (might another full backup of the database clear my transaction log before I get my backup done)?
Full backups are not a problem for you, as they just create additional "bases" from which you can start your restore chain. You will have problems if others take log backups, as that would break your log backup chain (it's not the full backup that clears the log; log truncation is done only by log backups).
You should ask whoever takes those log backups whether they can take them with COPY_ONLY: Copy-Only Backups (SQL Server)
or whether they can leave their backup files for you on some disk.
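For reference, a copy-only log backup is just the normal command plus one option (the path is an assumption); it does not truncate the log and does not break the existing log backup chain:
BACKUP LOG AdventureWorks2012
TO DISK = 'V:\backups\log\AdventureWorks2012_copyonly.trn'
WITH COPY_ONLY;  -- preserves the regular log backup chain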

SQL recovery - partitioned table (filegroups), each on a separate disk (and one of the disks crashes)

For huge tables I am thinking of using partitioning with filegroups, with each filegroup on a separate disk. My questions are:
If one of the filegroup disks crashes, should this incident be treated as a database crash? Will it cause the database to stop working?
Will the restore operation (assuming a full backup was taken) automatically recreate the filegroups as they were configured before the crash?
If one of the filegroup disks crashes, should this incident be treated as a database crash? Will it cause the database to stop working?
If all the data from that disk is in memory, you won't even notice the crash.
Until a checkpoint attempts to write to that disk, or you need to read a new portion of data from it (data that is not already in memory), you'll be able to work without any error.
Will the restore operation (assuming a full backup was taken) automatically recreate the filegroups as they were configured before the crash?
Your question is not clear.
You can restore individual filegroups from a full backup, but if those filegroups are not read-only you will not be able to reconcile them with the rest of your database. That is only possible if your database is in the full recovery model and you take and restore a tail-of-the-log backup after your full backup (and possibly other log backups in between).
You can read more details here: Piecemeal Restores (SQL Server)
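A minimal sketch of that sequence for a single damaged read-write filegroup under the full recovery model (database, filegroup, and file names are all assumptions):
-- 1. Back up the tail of the log while the damaged filegroup is offline
BACKUP LOG MyPartitionedDb TO DISK = 'E:\Backup\MyPartitionedDb_tail.trn' WITH NORECOVERY, NO_TRUNCATE;
-- 2. Restore just the damaged filegroup from the last full backup
RESTORE DATABASE MyPartitionedDb FILEGROUP = 'FG_2017' FROM DISK = 'E:\Backup\MyPartitionedDb_full.bak' WITH NORECOVERY;
-- 3. Roll forward: every log backup taken since that full backup, then the tail
RESTORE LOG MyPartitionedDb FROM DISK = 'E:\Backup\MyPartitionedDb_log1.trn' WITH NORECOVERY;
RESTORE LOG MyPartitionedDb FROM DISK = 'E:\Backup\MyPartitionedDb_tail.trn' WITH RECOVERY;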

Unable to restore SQL Server bak file from S3, says file too large

I'm trying to run a restore query from a .bak file stored in an S3 bucket to an RDS SQL Server Web edition instance, and I keep getting this error:
[2017-09-13 20:30:22.227] Aborted the task because of a task failure or a concurrent RESTORE_DB request. [2017-09-13 20:30:22.287] There is not enough space on the disk to perform restore database operaton.
The .bak file is 77 GB and the DB has 2 TB; how come this is still not enough?
This is the query from AWS docs:
exec msdb.dbo.rds_restore_database
@restore_db_name='database_name',
@s3_arn_to_restore_from='arn:aws:s3:::bucket_name/file_name_and_extension';
Source: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/SQLServer.Procedural.Importing.html#SQLServer.Procedural.Importing.Native.Using
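The same AWS documentation page describes a companion procedure for monitoring the restore task; something like this should show its progress (the parameter value is a placeholder):
exec msdb.dbo.rds_task_status @db_name='database_name';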
There is not enough space on the disk to perform restore database operaton.
...
The bak file is 77 GB and the DB has 2TB, how come this is still not enough?
You need 2 TB of space to be able to restore this backup.
The restore operation will reconstruct your original database, which is 2 TB.
A backup contains only data, not empty space. If your backup is only 77 GB and is not compressed, this means your original database holds only about 77 GB of data (or even less, because a backup also contains a certain amount of log).
Any database consists of data file(s) and log file(s), and if your DB is about 2 TB with only 77 GB of data, it has an enormous log file. I suspect it is in the full recovery model and nobody takes regular log backups (or perhaps no log backup has ever been taken at all!).
So you should take a look at your original DB: change the recovery model to simple if you don't need point-in-time recovery and take no log backups, or, if you really need the full recovery model, back up the log more frequently.
Taking regular log backups or changing the recovery model to simple will let you shrink the log to a reasonable size; from that moment on you will no longer need 2 TB of space to restore it.
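A minimal sketch of the simple-recovery route (the database and logical log file names are assumptions; check the logical name in sys.database_files first):
USE master;
ALTER DATABASE MyBigDb SET RECOVERY SIMPLE;  -- log is truncated at checkpoints; no point-in-time recovery
USE MyBigDb;
DBCC SHRINKFILE (MyBigDb_log, 1024);  -- shrink the log file to about 1 GB (target size in MB)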

Database backup size conundrum

I have a small SQL Server 2005 database. I take a daily (automated) backup, and the size of the .bak file typically comes out at about 400 MB, growing by 5 MB every day (which is in line with its usage).
Last night the size of the backup file jumped to 1 GB. Suspecting that someone was trying to fill the database with garbage data, I ran a report (Reports -> Standard Reports -> Disk Usage by Top Tables) and the total size came out to be around 400 MB.
Then, thinking maybe something was wrong with the automated backup process, I immediately took a backup again, but the .bak file still came out at over 1 GB. Before yesterday's automated backup, an automated task that defragments indexes also ran. However, all these months a backup taken after this index optimization used to actually reduce the size of the .bak file.
I am trying to find an explanation for this big jump in size overnight, and also why the .bak file is more than double the actual database disk usage.
UPDATE: I ran
DBCC SHRINKDATABASE(mydb)
to remove transaction logs. Then took a backup again. The size of the .bak file came out even bigger than last time.
This is the query I ran:
DBCC SHRINKFILE(mydb_log, 1)
BACKUP LOG mydb WITH TRUNCATE_ONLY
DBCC SHRINKFILE(mydb_log, 1)
1 - How are you backing up the data? Maintenance plans?
2 - If you are appending to the same database backup file, the backup will grow!
Check out the contents of the file.
RESTORE FILELISTONLY FROM AdventureWorksBackups WITH FILE=1;
http://technet.microsoft.com/en-us/library/ms173778.aspx
This assumes you have a dump device named AdventureWorksBackups. You can also change it to DISK='AdventureWorks.bak'.
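If appending is the culprit, a sketch like this will show it and stop it (the file name is an assumption):
-- Lists every backup set stored inside the file; several rows mean backups are being appended
RESTORE HEADERONLY FROM DISK = 'AdventureWorks.bak';
-- WITH INIT overwrites the existing backup sets instead of appending to them
BACKUP DATABASE mydb TO DISK = 'AdventureWorks.bak' WITH INIT;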
3 - Also, the maintenance plans do not do a good job with determining when to re-organize/update stats versus rebuild an index.
4 - Check out the Ola Hallengren scripts. They are way better!
http://ola.hallengren.com/
First, they create a directory structure for you.
c:\backup\<server name>\<database name>\full
c:\backup\<server name>\<database name>\diff
c:\backup\<server name>\<database name>\log
Second, each backup has a date time stamp. No appending to backup files.
Third, they clean up after themselves by passing the number of hours to keep on-line.
Fourth, they handle index fragmentation better: 5-30% fragmentation = reorganize, 30%+ = rebuild.
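For example, a nightly full backup with his DatabaseBackup procedure looks roughly like this (the parameter values here are assumptions; see the documentation on his site):
EXECUTE dbo.DatabaseBackup
@Databases = 'USER_DATABASES',
@Directory = 'C:\Backup',
@BackupType = 'FULL',
@Verify = 'Y',        -- run RESTORE VERIFYONLY on the finished backup
@CleanupTime = 48;    -- delete backup files older than 48 hours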
I usually set up the following for my databases.
1 - system databases - full backup every night, log backups hourly.
2 - user databases - full backup 1 x week, diff backup x 6 days, log backups hourly
Last but not least, SQL Server 2005 does not have native support for compressed backups. This does not mean you cannot run a batch file to zip them up afterwards.
Third-party tools like Quest (Dell) and Red Gate used to ship their own backup utilities; the main reason was to fill this gap. Since SQL Server 2008, compressed backups have been available natively, so I think many of the vendors are retiring those utilities now that the feature is standard.

SQL Server 2005 Backup strategy

I manage a web application for a client with the following specs:
ASP.net 3.5 running on a Virtual Windows 2003 Web Server
SQL Server Standard hosting the database
Database current size of 6 GB, with a 1 GB/month growth rate
One single table is responsible for 98% of the size, holds the most critical data for the client
No log is kept for this big table; only SELECTs are run against it
50 GB FTP space available for backup
Considering this scenario, what would be the best strategy for a SQL Backup and what tool would be best suited for this task (commercial applications included, client can pay for the license fee)?
Here is the strategy we use for CodePlex.com:
All SQL servers run with a peer server using SQL mirroring
Weekly full backup (stored on separate drive from databases)
Daily differential backup (stored on separate drive from databases)
Transaction log backup every 5 minutes (stored on separate drive from databases)
Daily tape backup
Tape backups taken offsite weekly
Also, very important: test your backups! Untested backup procedures very often turn out to be flawed. Here is our backup testing strategy:
Every 30 minutes verify the full backup file exists (using scheduled task)
Every 30 minutes verify the differential backup file exists (using scheduled task)
Every 30 minutes verify the transaction log backup file exists (using scheduled task)
Every 30 minutes verify the database mirroring is configured (using scheduled task)
Every day, do a test restore of the full+differential backup and report the table row counts (using scheduled task)
Once a month do a test restore of the most recent tape backup and verify the data
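A quick automated check along those lines might look like this sketch (names and paths are assumptions; an actual restore to a scratch server remains the stronger test):
-- Confirm the backup set is readable and complete without restoring it
RESTORE VERIFYONLY FROM DISK = 'E:\Backup\MyDb_full.bak';
-- Restore under a throwaway name for the daily row-count report
RESTORE DATABASE MyDb_restoretest
FROM DISK = 'E:\Backup\MyDb_full.bak'
WITH MOVE 'MyDb' TO 'E:\Scratch\MyDb_test.mdf',
MOVE 'MyDb_log' TO 'E:\Scratch\MyDb_test.ldf',
REPLACE;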
It depends on how critical the data is. Here, however, is how I'd do it.
1. Run a full backup every day.
2. Run a differential backup every 4 hours.
3. Run a transaction log backup every 15 minutes
4. Keep a copy on site and move a copy off site as soon as each backup is done.
The database is not too big, and this is easily doable.
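In T-SQL those three jobs boil down to commands like these (the database name and paths are assumptions; the schedule itself would live in SQL Server Agent jobs):
BACKUP DATABASE MyDb TO DISK = 'E:\Backup\MyDb_full.bak' WITH INIT;          -- daily
BACKUP DATABASE MyDb TO DISK = 'E:\Backup\MyDb_diff.bak' WITH DIFFERENTIAL;  -- every 4 hours
BACKUP LOG MyDb TO DISK = 'E:\Backup\MyDb_log.trn';                          -- every 15 minutes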
Use a third-party tool like Red Gate SQL Backup and it will automatically compress and encrypt the database backup for you. I have used it extensively and am a big fan.
Additionally, if you have another site available and the data is very critical, you might want to think about setting up log shipping as well.
This is a VPC? Can you install apps?
http://www.jungledisk.com/
That's what we use: make a SQL job that pushes out a backup every day, then use that service to push a copy up to Amazon's S3 service. If not, maybe you could have a local app that pulls the backup to a machine and then pushes it with the S3 web service, or still using Jungledisk.
This is important! If your app goes down, it hurts! Also make sure you back up your deployed app and the resources stored with it, i.e. uploaded content in your app's storage directory.
I was going to type up my own answer to your question, but I realized there are far better resources out there, like this article on SQLServerCentral.com. You can also find lots of "best practices on backup" articles like this one.
You might also want to take into consideration how much data you can afford to lose and how long it will take you to restore the database. Your client may decide that they never want to lose more than 15 minutes of data, or they may decide that losing up to a day's worth of data is okay with them.
