What content do -shm and -wal files contain after running checkpoint?

I am taking a backup of a SQLite DB using the cp command after running wal_checkpoint(FULL). The DB is used in WAL mode, so there are other files like -shm and -wal in my folder.
When I run wal_checkpoint(FULL), the changes in the WAL file get committed to the database. I was wondering whether the -wal and -shm files get deleted after running a checkpoint. If not, what do they contain?
I know my backup process is not good, since I am not using the SQLite backup APIs. This is a bug in my code.

After searching through numerous sources, I believe the following to be true:
The -shm file contains an index into the -wal file. It improves performance when reading the -wal file.
If the -shm file gets deleted, it gets created again on the next database access.
Once a checkpoint has run, the -wal file can safely be deleted.
To perform safe backups:
1. It is recommended that you use the SQLite backup functions for making backups. The SQLite library can even back up a database that is online.
2. If you don't want to use (1), then the best way is to close all database handles. This ensures a clean and consistent state of the database file, and deletes the -shm and -wal files. A backup can then be made using cp, scp, etc.
3. If the SQLite database file is intended to be transmitted over a network, then the VACUUM command should be run after the checkpoint. This removes fragmentation in the database file, reducing its size, so you transfer less data over the network (see the sketch below).
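A minimal sketch of (3), which also doubles as a simple alternative to (1): on SQLite 3.27 or later, VACUUM INTO writes a vacuumed, transactionally consistent copy of the live database to a new file. The output path below is a placeholder.

-- Run against the live database; produces a compact, consistent copy
-- that needs no -wal or -shm companion files.
VACUUM INTO '/backups/app-backup.db';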

The -shm file does not contain any permanent data.
When the last connection is closed, the database is automatically checkpointed, and the -wal file is then deleted.
This implies that, after a checkpoint, if no other connections exist, the -wal file does not contain any important data.
If possible, you should close the connection before taking the backup.
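If you cannot close the connection before copying, one alternative (available since SQLite 3.8.8) is a TRUNCATE checkpoint, which commits all WAL content into the main database file and truncates the -wal file to zero bytes:

-- After this succeeds, the main database file alone holds all
-- committed data; the -wal file is left at zero length.
PRAGMA wal_checkpoint(TRUNCATE);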


Backup Transaction Log since last full backup (Striped)

I am attempting to do a poor man's log shipping, manually. I need to move a rather large database from one host to a VM in Azure. The database is currently 16 GB, and we need to do the switchover within an hour.
Note that replication, mirroring, and log shipping are not options due to issues we have run into, and this manual process is what I need to try and achieve.
What I am attempting is to do a full backup of the database a few weeks before, striping the .bak files (so that if a copy fails, it's not the full 16 GB whose copy we need to restart), and copy those files of the full backup over a period of a few days.
For my full backup, I am trying to span the backup file over 15 files:
BACKUP DATABASE MyProductionDatabase
TO
DISK='E:\BackupTrial\V5_DEV_FULL_01.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_02.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_03.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_04.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_05.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_06.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_07.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_08.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_09.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_10.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_11.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_12.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_13.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_14.bak',
DISK='E:\BackupTrial\V5_DEV_FULL_15.bak'
WITH FORMAT,
MEDIANAME = 'V5_DEV_FullBackup',
MEDIADESCRIPTION = 'Striped media set for V5_DEV database';
GO
Then restore that database on the new VM in Azure (a virtual machine with SQL Server on it, as I am not sure I can go directly from a BAK file to Azure SQL).
And then daily or weekly (the database has low usage), do log backups, copy those backups to the new host, and restore them. So basically, manually trying to keep the databases in sync.
On go-live day, do a final log backup on the current host and copy that final file to the new host; then both DBs should be in sync, with only small daily copies.
Then switch the site to use the new DB.
But I am stuck:
The MS example is:
BACKUP LOG AdventureWorks2012
TO MyAdvWorks_FullRM_log1;
GO
And it says:
This example creates a transaction log backup for the AdventureWorks2012 database to the previously created named backup device, MyAdvWorks_FullRM_log1.
I'm unsure what they mean by "previously created backup device". I have numerous filenames because of the striping. How do I do a log backup of my full backup?
There may be other backup processes that run against this database. Will this cause me issues with the log backups? (Another full backup of the database might clear my transaction log before I get my backup done?) How should my daily log backup look?
I'm unsure what they mean by "previously created backup device".
This means that in this example they previously created a logical backup device (e.g., with sp_addumpdevice).
So instead of using the syntax BACKUP ... TO DISK = '...', they use
BACKUP ... TO <device_name>
You can completely ignore this and back up the log to different files, as everyone else does:
BACKUP LOG AdventureWorks2012
TO DISK = 'V:\backups\log\AdventureWorks2012_20190918.trn';
I have numerous filenames because of the striping. How do I do a log backup of my full backup?
A log backup is not done "of a full backup".
First you create a "base" by taking a full backup (it doesn't matter whether you stripe it into many files or not); then you take log backups (log backups are generally pretty small, and you don't need to stripe them). A sketch of a daily log backup follows below.
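As an illustration, a minimal sketch of a date-stamped daily log backup; the database name and path are taken from the question and the example above, and the @file variable is just a placeholder name:

-- Build a yyyymmdd-stamped file name and back the log up to it.
DECLARE @file nvarchar(260) =
    N'V:\backups\log\MyProductionDatabase_'
    + CONVERT(nvarchar(8), GETDATE(), 112) + N'.trn';
BACKUP LOG MyProductionDatabase TO DISK = @file;
GO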
There may be other backup processes that run against this database. Will this cause me issues with the log backups? (Another full backup of the database might clear my transaction log before I get my backup done?)
Full backups are not a problem for you: they just create other "bases" from which you can start your restore chain. You will have problems if others take log backups, as that would break your log backup chain (it's not the full backup that clears the log; log truncation is done only by log backups).
You should ask those who take these log backups whether they can take them with COPY_ONLY (see Copy-Only Backups (SQL Server)),
or whether they can leave those backup files for you on some disk.
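For reference, a copy-only log backup looks like an ordinary log backup with one extra option; it neither truncates the log nor breaks anyone's log backup chain (the path here is a placeholder):

BACKUP LOG MyProductionDatabase
TO DISK = 'V:\backups\log\MyProductionDatabase_copyonly.trn'
WITH COPY_ONLY;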

SQL recovery - partitioned table (filegroups), each on a separate disk (and one of the disks crashes)

For huge tables I am thinking of partitioning using filegroups, with each filegroup on a separate disk. My questions are:
If one of the filegroup disks crashes, should this incident be treated as a database crash? Will it cause the database to stop working?
Will the restore operation (assuming a full backup was taken) automatically create the filegroups as configured before the crash?
If one of the filegroup disks crashes, should this incident be treated as a database crash? Will it cause the database to stop working?
If all your data from that disk is in memory, you won't even notice the crash.
Until a checkpoint attempts to write to that disk, or you need to read a new portion of data from it (because it's not in memory), you'll be able to work without any error.
Will the restore operation (assuming a full backup was taken) automatically create the filegroups as configured before the crash?
Your question is not clear.
You can restore certain filegroups from a full backup, but if these filegroups are not read-only, you will not be able to reconcile them with the rest of your database. That is only possible if your DB is in the full recovery model and you take and restore a tail-of-the-log backup after your full backup (and maybe other log backups in between).
You can read about this in more detail in Piecemeal Restores (SQL Server).
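A rough sketch of such a piecemeal restore under the full recovery model; the database, filegroup, and file names are all placeholders:

-- Start with the primary filegroup, marking the restore PARTIAL.
RESTORE DATABASE MyDb FILEGROUP = 'PRIMARY'
    FROM DISK = 'E:\backups\MyDb_full.bak'
    WITH PARTIAL, NORECOVERY;
-- Roll forward with the tail-of-the-log backup and bring it online.
RESTORE LOG MyDb FROM DISK = 'E:\backups\MyDb_tail.trn' WITH RECOVERY;
-- The damaged filegroup can then be restored and recovered later.
RESTORE DATABASE MyDb FILEGROUP = 'FG_Data2'
    FROM DISK = 'E:\backups\MyDb_full.bak'
    WITH NORECOVERY;
RESTORE LOG MyDb FROM DISK = 'E:\backups\MyDb_tail.trn' WITH RECOVERY;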

Backup Strategy for database and file system

My system depends on both a database and a file system, and my file system is more than 100 GB. When taking a backup of both at the same time, the database backup completes within minutes while the file-system backup takes hours. In the meantime, real users are working on it, adding and deleting files, so my file-system backup becomes inconsistent with respect to the database backup.
Could anyone tell me how we can take a backup of both the database and the file system so that both are in a compatible, consistent state?

Restore SQL Server DB without transaction log

Given a SQL Server 2008 .bak file, is there a way to restore the data file only from the .bak file, without the transaction log?
The reason I'm asking is that the transaction log file of this database is huge - exceeding the disk space I have readily available.
I have no interest in the transaction log, and no interest in any uncompleted transactions. Normally I would simply shrink the log to zero once I've restored the database. But that doesn't help when I have insufficient disk space to create the log in the first place.
What I need is a way to tell SQL Server to restore only the data from the .bak file, not the transaction log. Is there any way to do that?
Note that I have no control over the generation of the .bak file - it's from an external source. Shrinking the transaction log before generating the .bak file is not an option.
The transaction log is an integral part of the backup. You can't tell SQL Server to ignore the transaction log, because there is no way to, say, restore and shrink the transaction log file at the same time. However, you can take a look at this DBA post on hacking the process, although it is not recommended at all.
Alternatively, you could try some third-party tools for restoring, particularly a virtual restore process that can save a lot of space and time. Check out ApexSQL Restore, Red Gate Virtual Restore, and Idera Virtual Database.
Disclaimer: I work for ApexSQL as a support engineer.
No, the transaction log is required.
Option 1:
One option may be to restore it to a machine that DOES have enough space. Then, on the restored copy, change the recovery model to bulk-logged or simple, shrink the log, take another backup of this new copy, and then use that to restore to the target machine with a now much smaller transaction log.
Option 2:
Alternatively, perhaps the contact at the external source could shrink the transaction log before sending it to you (this may not work if the log is large due to a lot of big transactions).
Docs on DBCC SHRINKFILE, the command to shrink the log file, are available here.
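A sketch of Option 1 on the intermediate machine, assuming the log file's logical name is MyDb_log (check sys.database_files for the real one); the database name and path are placeholders:

ALTER DATABASE MyDb SET RECOVERY SIMPLE;
-- Shrink the log file to roughly 1 MB.
DBCC SHRINKFILE (MyDb_log, 1);
-- Back up this copy; restoring it recreates the now-small log.
BACKUP DATABASE MyDb TO DISK = 'D:\backups\MyDb_small.bak' WITH INIT;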
This is really a question for the ServerFault or DBA sites, but the short answer is no, you can only restore the full .bak file (leaving aside 'exotic' scenarios such as filegroup or piecemeal restores). You don't say what "huge" means, but disk space is cheap; if adding more really isn't an option, then you need to find an alternative way of getting the data from your external source.
This may not work since you have no control over the generation of the .bak file, but if you could convince your source to detach the database and then send you a copy of the .mdf file directly, you could then attach the .mdf and your server would automatically create a new empty transaction log file.
See sp_detach_db and sp_attach_db (or CREATE DATABASE database_name FOR ATTACH, depending on your SQL Server version).
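If the log file is absent at attach time, there is also a variant that rebuilds a fresh, empty log; a minimal sketch with placeholder names:

-- Attach the data file only; SQL Server rebuilds a new, empty log.
CREATE DATABASE MyDb
    ON (FILENAME = 'D:\data\MyDb.mdf')
    FOR ATTACH_REBUILD_LOG;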
I know this is an old thread now, but I stumbled across it while I was having transaction log corruption issues. Here is how I got around it without any data loss (I did have downtime, though!).
Here is what I did:
Stop the SQL Server instance service.
Make a copy of the affected database's .mdf and .ldf files (if you have an .ndf file, copy that as well!) - just to be safe; you can always put these back if it doesn't work for you.
Restart the service.
Log into SQL Server Management Studio and change the database recovery model to simple, then take a full backup.
Change the recovery model back again and once again take a full backup, then take a transaction log backup (see the T-SQL sketch below).
Detach the database.
Right-click on Databases and click Restore; select the database name from the drop-down list, select the later full database backup (not the one taken in simple mode), and also select the transaction log backup.
Click Restore and it should put it all back without any corruption in the log files.
This worked for me with no errors, my backups all worked correctly afterwards, and there were no more transaction log errors.
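The recovery-model and backup steps above might look like this in T-SQL; the database name and paths are placeholders:

ALTER DATABASE MyDb SET RECOVERY SIMPLE;
BACKUP DATABASE MyDb TO DISK = 'D:\bak\MyDb_simple.bak' WITH INIT;
ALTER DATABASE MyDb SET RECOVERY FULL;
BACKUP DATABASE MyDb TO DISK = 'D:\bak\MyDb_full.bak' WITH INIT;
BACKUP LOG MyDb TO DISK = 'D:\bak\MyDb_log.trn' WITH INIT;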

Export a large DB with terabytes of data

What's the best way to dump a large (terabytes) DB? Are there faster or more efficient ways besides mysqldump? The dump is intended to be zipped, unzipped, and then reimported into another MySQL DB on another server.
If it's possible for you to stop the database server, the best way is probably for you to:
Stop the database
Do a file copy of the files (including appropriate transaction logs, etc.) to a new file system.
Restart the database.
Then move the copied files to the new server and bring up the database on top of the files. It's a bit complicated to do this, but it's by far the fastest way.
I used to be a DBA for a terabyte+ MySQL database, and this is one of the ways we'd do nightly backups. mysqldump would never have worked for data that large. We'd stop the database each night and file-copy the underlying files.
Since your intent seems to be having two copies of the DB, why not set up replication to do this?
That will ensure that both copies of the DB remain in an identical state (in terms of data anyway).
And, if you want a snapshot to be exported, you can (see the sketch below):
Wait for a quiet time.
Disable replication.
Back up the slave copy.
Re-enable replication.
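Steps 2-4 on the replica might look like this; the commands below assume classic MySQL replication (on MySQL 8.0.22+ they are spelled STOP REPLICA / START REPLICA):

-- Pause replication so the replica's data files stop changing.
STOP SLAVE;
-- ... take the file copy or dump of the replica here ...
-- Resume replication; the replica catches up from the primary.
START SLAVE;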
