SQL Server: why are sizes of different backups the same? - sql-server

I am running a script every day (only 2 days so far) to back up my database:
sqlcmd -E -S server-hl7\timeclockplus -i timeclockplus.sql
move "C:\Program Files\Microsoft SQL Server\MSSQL.2\MSSQL\Backup\*.*" w:\
Why do backups from two different dates have the exact same size in bytes? I know for a fact that the database was changed in between!

The database files (*.mdf, *.ldf) are allocated in chunks: they use a given number of megabytes until that space fills up, and then another chunk (several megabytes) is allocated.
It would be really bad for performance to allocate every single byte you ever add to the database.
Due to this chunk-based allocation, it's absolutely normal for the files to keep the same size for a certain period of time, even while your database is being used and data is being added and deleted.
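To see this for yourself, you can compare each file's allocated size with the space actually used inside it; a minimal sketch (run it in the database in question; sizes are reported in 8 KB pages):
-- Allocated size vs. space actually used, per database file
SELECT name                                        AS file_name,
       size * 8 / 1024                             AS allocated_mb,
       FILEPROPERTY(name, 'SpaceUsed') * 8 / 1024  AS used_mb
FROM sys.database_files;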

A SQL Server backup only contains pages of data, and a page is 8 KB. If your changes from day to day do not add or remove pages (e.g. by adding or deleting data), then the number of pages to back up stays constant.
Try a CRC check on the backup files to see what actually changes...
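Alternatively, if the backups were taken on this instance, SQL Server's own backup history records the logical size of each backup; a sketch (the database name below is a placeholder, adjust it to yours):
SELECT database_name,
       backup_finish_date,
       backup_size          -- logical backup size, in bytes
FROM msdb.dbo.backupset
WHERE database_name = 'TimeClockPlus'   -- placeholder name
ORDER BY backup_finish_date DESC;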

Related

SQLite3 using another hard drive for creating index

I have two hard disks: C (40GB capacity left) and D (1TB capacity left).
My sqlite folder (the SQLite3 Windows download files from the tutorial) is on disk D.
I created a database called myDatabase.db in the sqlite folder and have created a table in it and populated the table from a CSV file. This was done successfully as I ran a few queries and they worked.
The database is quite large (50 GB) and I want to create an index on my table. I run the CREATE INDEX command and it starts; it creates a myDatabase.db-journal file in the folder next to the .db file.
However, from the "This PC" view of the hard drives I can see that disk C is getting drained (from 40 GB free, going 39, 38, etc.), while myDatabase.db on drive D is not getting any bigger.
I don't want SQLite to use C when it doesn't make sense to, since both sqlite and the .db file are on disk D.
Any suggestions as to why this is happening?
Thanks in advance for your time.
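The behaviour described is consistent with SQLite writing its temporary index-build data to the default Windows temp directory on C:. A hedged sketch of pointing temporary storage at drive D instead (the path below is a placeholder, and temp_store_directory is deprecated but still honoured by common builds):
PRAGMA temp_store_directory = 'D:\sqlite_tmp';  -- placeholder path; deprecated pragma
CREATE INDEX idx_mytable_col ON mytable(col);   -- hypothetical table and column names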

Date in NLog file name and limit the number of log files

I'd like to achieve the following behaviour with NLog for rolling files:
1. prevent renaming or moving the file when starting a new file, and
2. limit the total number or size of old log files to avoid capacity issues over time
The first requirement can be achieved e.g. by adding a timestamp like ${shortdate} to the file name. Example:
logs\trace2017-10-27.log <-- today's log file to write
logs\trace2017-10-26.log
logs\trace2017-10-25.log
logs\trace2017-10-24.log <-- keep only the last 2 files, so delete this one
However, according to other posts, it is not possible to use a date in the file name together with archive parameters like maxArchiveFiles. If I use maxArchiveFiles, I have to keep the log file name constant:
logs\trace.log <-- today's log file to write
logs\archive\trace2017-10-26.log
logs\archive\trace2017-10-25.log
logs\archive\trace2017-10-24.log <-- keep only the last 2 files, so delete this one
But in this case, every day on the first write it moves yesterday's trace to the archive and starts a new file.
The reason I'd like to prevent moving the trace file is that we use a Splunk log monitor that watches the files in the log folder for updates, reads the new lines, and feeds them to Splunk.
My concern is that if an event is written at 23:59:59.567, the next event at 00:00:00.002 clears the previous content before the log monitor is able to read it in that fraction of a second.
To be honest I haven't tested this scenario, as it would be complicated to set up (my team doesn't own Splunk, etc.), so please correct me if this cannot happen.
Note also that I know it is possible to feed Splunk directly in other ways, e.g. via a network connection, but the current Splunk setup at our company reads from log files, so it would be easier that way.
Any idea how to solve this with NLog?
When using NLog 4.4 (or older), you have to go into Halloween mode and do some trickery.
This example makes hourly log files in the same folder and ensures that archive cleanup is performed after 840 hours (35 days):
fileName="${logDirectory}/Log.${date:format=yyyy-MM-dd-HH}.log"
archiveFileName="${logDirectory}/Log.{#}.log"
archiveDateFormat="yyyy-MM-dd-HH"
archiveNumbering="Date"
archiveEvery="Year"
maxArchiveFiles="840"
archiveFileName - Using {#} allows the archive cleanup to generate a proper file wildcard.
archiveDateFormat - Must match the ${date:format=} of the fileName (so remember to adjust both date formats if a change is needed).
archiveNumbering=Date - Configures the archive cleanup to support parsing filenames as dates.
archiveEvery=Year - Activates the archive cleanup, but also the archive file operation. Because the configured fileName already rolls to a new file on its own, we don't want any additional archive operations (e.g. to avoid generating extra empty files at midnight).
maxArchiveFiles - How many archive files to keep around.
With NLog 4.5 (still in beta), it will be a lot easier, as one just has to specify MaxArchiveFiles. See also https://github.com/NLog/NLog/pull/1993
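For reference, a sketch of what that could look like with 4.5, reusing the file-name pattern from the question (untested, since 4.5 was still in beta at the time):
fileName="${logDirectory}/trace${shortdate}.log"
maxArchiveFiles="2"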

When a database has multiple log files, which one will SQL Server pick?

One of my databases has multiple log files:
log.ldf - 40 GB (on D: drive)
log2.ldf - 70 GB (on S: drive)
log3.ldf - 100 GB (on L: drive)
Which log file will SQL Server pick first? Does SQL Server follow any order when picking the log file? Can we control this?
I believe you can't control which file the LOG info will be written to.
You should not concentrate only on the BIGGEST file, but on the FASTEST.
General advice would be to have ONLY two files:
- First file, as big as possible, on the FASTEST drive (on an SSD). Set MAXSIZE to the file size so it won't grow anymore (see the sketch below).
- Second file, as small as possible, on a big drive where it can grow in case the first file is full.
Your task would then be to monitor the size of the second file; if it starts to grow, take log backups more often and shrink it back.
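A minimal sketch of capping the first log file as suggested above (the database name, logical file name, and size cap are placeholders):
ALTER DATABASE MyDb                              -- placeholder database name
MODIFY FILE (NAME = MyDb_log, MAXSIZE = 40GB);   -- placeholder logical file name and cap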
If you want to see how your log files are used, you can use the following DBCC command:
DBCC LOGINFO ();

Get the actual DB size from the backup file without restore

Is there any way to get the size of the DB from a backup file without restoring it?
For example: I have a backup file of 10 GB and I want to know the size of the DB after that backup file is restored. Most of the time the DB is much larger than its backup file because of free space inside the DB. So is there any way to know the DB size from the backup file alone, without restoring it?
Yes, you can use RESTORE FILELISTONLY to get the size, like below:
RESTORE FILELISTONLY FROM DISK = N'D:\backup_filename.bak'
It doesn't actually restore anything; rather, it returns a result set containing a list of the database and log files in the backup set. The result includes a Size column, which gives the size in bytes:
Size    numeric(20,0)    Current size in bytes.

Replay a file-based data stream

I have a live stream of data based on files in different formats. Data comes over the network and is written to files in certain subdirectories of a directory hierarchy. From there it is picked up and processed further. I would like to replay e.g. one day of this data stream for testing and simulation purposes. I could duplicate the data stream for one day to a second machine and "record" it this way, by just letting the files pile up without processing or moving them.
I need something simple like a Perl script which takes a base directory, looks at all contained files in subdirectories and their creation time and then copies the files at the same time of the day to a different base directory.
Simple example: I have files a/file.1 2012-03-28 15:00, b/file.2 2012-03-28 09:00, c/file.3 2012-03-28 12:00. If I run the script/program on 2012-03-29 at 08:00 it should sleep until 09:00, copy b/file.2 to ../target_dir/b/file.2, then sleep until 12:00, copy c/file.3 to ../target_dir/c/file.3, then sleep until 15:00 and copy a/file.1 to ../target_dir/a/file.1.
Does a tool like this already exist? It seems I’m missing the right search keywords to find it.
The environment is Linux, command line preferred. For one day it would be thousands of files with a few GB in total. The timing does not have to be ultra-precise. Second resolution would be good, minute resolution would be sufficient.
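The logic described is small enough to script directly; a rough sketch in Python (standing in for the Perl script mentioned above; the base directories are placeholders, and file modification time is used as the recorded time of day):
import os, shutil, time
from datetime import datetime, timedelta

SRC = "/data/recorded"       # placeholder: base directory holding the recorded day
DST = "/data/replay_target"  # placeholder: target base directory

# Collect (seconds since midnight, relative path) for every file under SRC.
entries = []
for root, _dirs, files in os.walk(SRC):
    for name in files:
        path = os.path.join(root, name)
        mtime = datetime.fromtimestamp(os.path.getmtime(path))
        entries.append((mtime.hour * 3600 + mtime.minute * 60 + mtime.second,
                        os.path.relpath(path, SRC)))
entries.sort()

# Replay: wait until the same time of day, then copy, keeping the subdirectory layout.
midnight = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
for secs, rel in entries:
    wait = (midnight + timedelta(seconds=secs) - datetime.now()).total_seconds()
    if wait > 0:
        time.sleep(wait)  # second resolution, as the question allows
    target = os.path.join(DST, rel)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copy2(os.path.join(SRC, rel), target)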
