SQL Server - Tempdb vs. Database Log usage

This may be a very basic question, but how can you determine beforehand whether a large operation will end up using database log or tempdb space?
For instance, one large insert / update operation I did used the database log to a point where we needed to employ SSIS & bulk operations just so the space wouldn't run out, because all the changes in the script had to be deployed at one time.
So now I'm working with a massive delete operation that would fill the log 10 times over. So I created a script to check the space used by the database log file and delete the rows in smaller batches, with the idea that once the log file was large enough, the script would abort and continue from that point the next day (allowing normal usage to continue until the next backup, without risk of the log running out of space).
Now, instead of filling the log, the latter query started filling up tempdb. Tempdb data file, not log file, to be specific. So I'm thinking there's a huge hole where my understanding of these two should be. :)
Thanks for any advice!
Edit:
To clarify, the question here is: why does the first example use the database log, while the latter uses the tempdb data file, to store the changes? And in general, by what logic are DML operations stored in either tempdb or the log? Normally the log should store all DB changes, while tempdb is only used to store processed data during an operation when explicitly requested (i.e., temp objects) or when the server runs out of RAM, right?

There is actually quite a bit that goes on behind the scenes when deleting records from a table. This MSDN Blog link may help shed some light on why tempdb is filling up when you try to delete. Either way, the delete will fill up the transaction log as well; it just sounds like tempdb is filling up before it gets to the step of logging the transaction(s).
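As a rough illustration of the batched approach the asker describes, here is a minimal sketch; the table name, filter, and batch size are illustrative assumptions, and the CHECKPOINT only helps log space get reused under the simple recovery model (under full recovery you would rely on regular log backups instead):

DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    -- Delete a small batch so each transaction (and its log usage) stays small
    DELETE TOP (10000)
    FROM dbo.BigTable
    WHERE CreatedDate < '20100101';

    SET @rows = @@ROWCOUNT;

    -- In SIMPLE recovery, a checkpoint lets the log space used by the batch be reused
    CHECKPOINT;
END;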
I'm not entirely sure what your requirements are, but the following links could be somewhat enlightening on your transaction logging issues. These are all set for SQL Server 2008 R2, but you can switch to whatever version you are running.
Recovery Model Overview
Considerations for Switching from the Simple Recovery Model
Considerations for Switching from the Full or Bulk-Logged Recovery Model
You also have the option of truncating the table, but that depends on a few things. If you don't need the operation to be logged and you're deleting all the records from the table, you can truncate. If you are doing some sort of conditional delete, but you're deleting more than you're keeping, you could always insert all of the records you want to keep into another "staging" table, truncate the original, and then re-insert the records from the staging table back into the original. However, that really only works when you have no foreign key relationships on that table.
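A minimal sketch of that staging-table pattern, assuming an illustrative table dbo.BigTable with no foreign keys referencing it and no IDENTITY column (otherwise you would need an explicit column list and SET IDENTITY_INSERT):

-- Copy the rows you want to keep (SELECT INTO is minimally logged under simple/bulk-logged recovery)
SELECT *
INTO dbo.BigTable_Keep
FROM dbo.BigTable
WHERE CreatedDate >= '20100101';

-- Deallocate the original table's pages; only the page deallocations are logged
TRUNCATE TABLE dbo.BigTable;

-- Put the kept rows back
INSERT INTO dbo.BigTable WITH (TABLOCK)
SELECT * FROM dbo.BigTable_Keep;

DROP TABLE dbo.BigTable_Keep;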

Related

Disable transactions on SQL Server

I need some light here. I am working with SQL Server 2008.
I have a database for my application. Each table has a trigger that stores all changes in another database (on the same server), in one single table, 'tbSysMasterLog'. Yes, the application's log is stored in another database.
The problem is that before any insert/update/delete command on the application database, a transaction is started, and therefore the table in the log database is locked until the transaction is committed or rolled back. So anyone else who tries to write to any other table of the application will be blocked.
So...is there any way possible to disable transactions on a particular database or on a particular table?
You cannot turn off the log. Everything gets logged. You can set the recovery model to "Simple", which limits how much log data is kept after the records are committed.
" the table of the log database is locked": why that?
Normally you log changes by inserting records. The insert of records should not lock the complete table, normally there should not be any contention in insertion.
If you do more than inserts, perhaps you should consider changing that. Perhaps you should look at the indices defined on log, perhaps you can avoid some of them.
It sounds from the question like you start a transaction at the beginning of your triggers, and that you are logging to the other database prior to the commit.
Normally you do not need explicit transactions in SQL Server.
If you do need explicit transactions, you could put the data to be logged into variables, commit the transaction, and then insert it into your log table.
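A minimal sketch of that idea, with illustrative table and column names (dbo.Foo and the columns of tbSysMasterLog are assumptions); an OUTPUT clause captures the values to log into a table variable, and the audit insert runs only after the commit:

DECLARE @FooId int = 42,
        @NewValue nvarchar(100) = N'new value';
DECLARE @audit TABLE (FooId int, OldValue nvarchar(100), NewValue nvarchar(100));

BEGIN TRANSACTION;

UPDATE f
SET f.Value = @NewValue
OUTPUT inserted.Id, deleted.Value, inserted.Value
    INTO @audit (FooId, OldValue, NewValue)
FROM dbo.Foo AS f
WHERE f.Id = @FooId;

COMMIT TRANSACTION;

-- The shared log table is touched outside the transaction, so it is locked only briefly
INSERT INTO LogDb.dbo.tbSysMasterLog (FooId, OldValue, NewValue, LoggedAt)
SELECT FooId, OldValue, NewValue, SYSDATETIME()
FROM @audit;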
Normally inserts are fast and can happen in parallel without locking. There are certain things, like identity columns, that require ordering, but these are very lightweight structures; they can be avoided by generating GUIDs so that inserts are non-blocking. For something like your log table, though, a primary key identity column gives you a clear sequence that is probably helpful in working out the order of events.
Obviously if you log after the transaction, this may not be in the same order as the transactions occurred due to the different times that transactions take to commit.
We normally log into individual tables with a similar name to the master table e.g. FooHistory or AuditFoo
There are other options. A very lightweight method is to use a trace; this is what is used for performance tuning and will give you a copy of every statement run on the database (including triggers), and you can log this to a different database server. It is a good idea to log to a different server if you are tracing a heavily used server, since the volume of data is massive if you trace, say, 1,000 simultaneous sessions.
https://learn.microsoft.com/en-us/sql/tools/sql-server-profiler/save-trace-results-to-a-table-sql-server-profiler?view=sql-server-ver15
You can also trace to a file and then load it into a table (better performance), and script up starting, stopping, and loading traces.
The load on the server that is getting the trace log is minimal and I have never had a locking problem on the server receiving the trace, so I am pretty sure that you are doing something to cause the locks.

Transaction Size Limit in SQL Server

I'm loading large amounts of data from a text file into SQL Server. Currently each record is inserted (or updated) in a separate transaction, but this leaves the DB in a bad state if a record fails.
I'd like to put it all in one big transaction. In my case, I'm looking at ~250,000 inserts or updates and maybe ~1,000,000 queries. The text file is roughly 60MB.
Is it unreasonable to put the entire operation into one transaction? What's the limiting factor?
It's not only not unreasonable to do so, it's a must if you want to preserve integrity when any record fails, so you get an "all or nothing" import as you note. 250,000 inserts or updates will be no problem for SQL Server to handle, but I would take a look at what those million queries are. If they're not needed to perform the data modification, I would take them out of the transaction, so they don't slow down the whole process.
You have to consider that when you have an open transaction (regardless of size), locks will be held on the tables it touches, and lengthy transactions like yours might block other users that are trying to read them at the same time. If you expect the import to be big and time-consuming and the system will be under load, consider doing the whole process overnight (or during any non-peak hours) to mitigate the effect.
About the size: there is no specific transaction size limit in SQL Server; transactions can theoretically modify any amount of data without problems. The practical limit is really the size of the transaction log file of the target database. The DB engine stores all the temporary and modified data in this file while the transaction is in progress (so it can use it to roll back if needed), so this file will grow in size. It must have enough free space in the DB properties, and enough disk space for the file to grow. Also, the row or table locks that the engine puts on the affected tables consume memory, so the server must have enough free memory for all this plumbing too. Anyway, 60MB is generally too little to worry about, and 250,000 rows is considerable but not that much either, so any decent-sized server will be able to handle it.
SQL Server can handle those size transactions. We use a single transaction to bulk load several million records.
The most expensive part of a database operation is usually the client-server connection and traffic. For inserts/updates, indexing and logging are also expensive, but you can mitigate those costs by using the correct loading techniques (see below). You really want to limit the number of connections and the amount of data transferred between client and server.
To that end, you should consider bulk loading the data using SSIS or C# with SqlBulkCopy. Once you bulk load everything then you can use set based operations ON THE SERVER to update or verify your data.
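As a rough sketch of that pattern in T-SQL (file path, table, and column names are illustrative assumptions): bulk load into a staging table, then apply the changes set-based on the server.

CREATE TABLE dbo.Staging_Import (Id int NOT NULL PRIMARY KEY, Amount decimal(18, 2) NOT NULL);

-- Bulk load the text file into the staging table
BULK INSERT dbo.Staging_Import
FROM 'C:\import\data.txt'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', TABLOCK);

BEGIN TRANSACTION;

-- Update existing rows from the staging data
UPDATE t
SET t.Amount = s.Amount
FROM dbo.TargetTable AS t
JOIN dbo.Staging_Import AS s ON s.Id = t.Id;

-- Insert rows that do not exist yet
INSERT INTO dbo.TargetTable (Id, Amount)
SELECT s.Id, s.Amount
FROM dbo.Staging_Import AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.TargetTable AS t WHERE t.Id = s.Id);

COMMIT TRANSACTION;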
Take a look at this question for more suggestions about optimizing data loads. The question is related to C#, but a lot of the information is useful for SSIS or other loading methods: What's the fastest way to bulk insert a lot of data in SQL Server (C# client).
There is no issue with doing an all or nothing bulk operation, unless a complete rollback is problematic for your business. In fact, a single transaction is the default behavior for a lot of bulk insert utilities.
I would strongly advise against a single operation per row. If you want to weed out bad data, you can load the data into a staging table first and programmatically determine "bad data" and skip those rows.
Well personally, I don't load imported data directly to my prod tables ever and I weed out all the records which won't pass muster long before I ever get to the point of loading. Some kinds of errors kill the import completely and others might just send the record to an exception table to be sent back to the provider and fixed for the next load. Typically I have logic that determines if there are too many exceptions and kills the package as well.
For instance, suppose the city is a required field in your database and in the file of 1,000,000 records, you have ten that have no city. It is probably best to send them to an exception table and load the rest. But suppose you have 357,894 records with no city. Then you might need to have a conversation with the data provider to get the data fixed before loading. It will certainly affect prod less if you can determine that the file is unusable before you ever try to touch production tables.
Also, why are you doing this one record at a time? You can often go much faster with set-based processing, especially if you have already managed to clean the data beforehand. You may still need to do it in batches, but one record at a time can be very slow.
If you really want to roll back the whole thing if any part errors, yes you need to use transactions. If you do this in SSIS, then you can put transactions on just the part of the package where you affect prod tables and not worry about them in the staging of the data and the clean up parts.

Is there a congruent command to truncate for re-filling a table? [duplicate]

I have an INSERT statement that is eating a hell of a lot of log space, so much so that the hard drive is actually filling up before the statement completes.
The thing is, I really don't need this to be logged as it is only an intermediate data upload step.
For argument's sake, let's say I have:
Table A: Initial upload table (populated using bcp, so no logging problems)
Table B: Populated using INSERT INTO B from A
Is there a way that I can copy between A and B without anything being written to the log?
P.S. I'm using SQL Server 2008 with simple recovery model.
From Louis Davidson, Microsoft MVP:
There is no way to insert without logging at all. SELECT INTO is the best way to minimize logging in T-SQL; using SSIS you can do the same sort of light logging using Bulk Insert.
From your requirements, I would probably use SSIS: drop all constraints, especially unique and primary key ones, load the data in, and add the constraints back. I load about 100GB in just over an hour like this, with fairly minimal overhead. I am using the BULK LOGGED recovery model, which just logs the existence of new extents during the load, and then you can remove them later.
The key is to start with barebones tables, and it just screams. Building the index once leaves you with no indexes to maintain, just the one index build per index.
If you don't want to use SSIS, the point still applies: drop all of your constraints and use the BULK LOGGED recovery model. This greatly reduces the logging done on INSERT INTO statements and thus should solve your issue.
http://msdn.microsoft.com/en-us/library/ms191244.aspx
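A hedged sketch of that approach, with illustrative database, table, and constraint names; note that the minimal logging only kicks in for bulk operations (SELECT INTO, BULK INSERT, or INSERT ... SELECT with TABLOCK into a heap) under the bulk-logged or simple recovery models:

ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;

-- Drop constraints and indexes so the load goes into a bare heap
ALTER TABLE dbo.B DROP CONSTRAINT PK_B;

-- With TABLOCK on a heap, this INSERT ... SELECT can be minimally logged (SQL Server 2008+)
INSERT INTO dbo.B WITH (TABLOCK) (Col1, Col2)
SELECT Col1, Col2
FROM dbo.A;

-- Rebuild the constraints/indexes afterwards
ALTER TABLE dbo.B ADD CONSTRAINT PK_B PRIMARY KEY (Col1);

-- Switch back (the asker is on simple recovery, where bulk operations are already minimally logged)
ALTER DATABASE MyDb SET RECOVERY SIMPLE;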
Upload the data into tempdb instead of your database, and do all the intermediate transformations in tempdb. Then copy only the final data into the destination database. Use batches to minimize individual transaction size. If you still have problems, look into deploying trace flag 610, see The Data Loading Performance Guide and Prerequisites for Minimal Logging in Bulk Import:
Trace Flag 610
SQL Server 2008 introduces trace flag 610, which controls minimally logged inserts into indexed tables.
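A rough sketch of that workflow (table names are illustrative assumptions; the trace flag is optional and applies to SQL Server 2008):

-- Stage the raw data in tempdb; SELECT INTO is minimally logged
SELECT *
INTO tempdb.dbo.Staging_A
FROM SourceDb.dbo.A;

-- ... do the intermediate transformations against tempdb.dbo.Staging_A here ...

-- Optional: minimally logged inserts into indexed tables (SQL Server 2008)
DBCC TRACEON (610);

INSERT INTO TargetDb.dbo.B WITH (TABLOCK)
SELECT *
FROM tempdb.dbo.Staging_A;

DBCC TRACEOFF (610);

DROP TABLE tempdb.dbo.Staging_A;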

Major performance difference between two Oracle database instances

I am working with two instances of an Oracle database, call them one and two. two is running on better hardware (hard disk, memory, CPU) than one, and two is one minor version behind one in terms of Oracle version (both are 11g). Both have the exact same table table_name with exactly the same indexes defined. I load 500,000 identical rows into table_name on both instances. I then run, on both instances:
delete from table_name;
This command takes 30 seconds to complete on one and 40 minutes to complete on two. Doing INSERTs and UPDATEs on the two tables has similar performance differences. Does anyone have any suggestions on what could have such a drastic impact on performance between the two databases?
I'd first compare the instance configurations - SELECT NAME, VALUE from V$PARAMETER ORDER BY NAME and spool the results into text files for both instances and use some file comparison tool to highlight differences. Anything other than differences due to database name and file locations should be investigated. An extreme case might be no archive logging on one database and 5 archive destinations defined on the other.
If you don't have access to the filesystem on the database host find someone who does and have them obtain the trace files and tkprof results from when you start a session, ALTER SESSION SET sql_trace=true, and then do your deletes. This will expose any recursive SQL due to triggers on the table (that you may not own), auditing, etc.
If you can monitor the wait_class and event columns in v$session for the deleting session, you'll get a clue as to the cause of the delay. Generally I'd expect a full table DELETE to be disk bound (a wait class indicating I/O or maybe Configuration). It has to read the data from the table (so it knows what to delete), then update the data blocks and index blocks to remove the entries, which generates a lot of entries for the UNDO tablespace and the redo log.
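For example, something along these lines from another session (:deleting_sid is a placeholder for the SID of the session doing the delete):

SELECT sid, wait_class, event, seconds_in_wait
FROM v$session
WHERE sid = :deleting_sid;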
In a production environment, the underlying files may be spread over multiple disks (even SSDs). Dev/test environments may have them all stuck on one device, with a lot of head movement on the disk slowing things down. I could see that slowing a statement down maybe tenfold. Yours is worse than that.
If there is concurrent activity on the table [wait_class of 'Concurrency'] (eg other sessions inserting) you may get locking contention or the sessions are both trying to hammer the index.
Something is obviously wrong in instance two. I suggest you take a look at these SO questions and their answers:
Oracle: delete suddenly taking a long time
oracle delete query taking too much time
In particular:
Do you have unindexed foreign key references? (Reason #1 for a delete taking a looong time; look at this script from AskTom, or see the simplified sketch after this list.)
Do you have any ON DELETE triggers on the table?
Do you have any activity on instance two? (If this table is continuously updated, you may be blocked by other sessions.)
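A simplified sketch of that unindexed foreign key check (single-column keys only; the AskTom script handles composite keys properly):

SELECT c.table_name, c.constraint_name, cc.column_name
FROM user_constraints c
JOIN user_cons_columns cc ON cc.constraint_name = c.constraint_name
WHERE c.constraint_type = 'R'
  AND NOT EXISTS (
        SELECT 1
        FROM user_ind_columns ic
        WHERE ic.table_name = cc.table_name
          AND ic.column_name = cc.column_name
          AND ic.column_position = 1
      );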
Please note: I am not a DBA...
I have the following written on my office window:
In case of emergency, ask the on-call DBA to:
Check Plan
Run Stats
Flush Shared Buffer Pool
Number 2 and/or 3 normally fix queries which work in one database but not the other or which worked yesterday but not today....
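For reference, items 2 and 3 roughly correspond to the following (schema and table names are placeholders; run as a privileged user):

EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'APP_SCHEMA', tabname => 'TABLE_NAME');
ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;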

How do I shrink my SQL Server Database?

I have a database nearly 1.9 GB in size, and MSDE 2000 does not allow DBs that exceed 2.0 GB.
I need to shrink this DB (and many others like this at various client locations).
I have found and deleted many hundreds of thousands of records which are considered unneeded:
these records account for a large percentage of some of the main (largest) tables in the Database. Therefore it's reasonable to assume much space should now be retrievable.
So now I need to shrink the DB to account for the missing records.
I execute DBCC ShrinkDatabase('MyDB')...... No effect.
I have tried the various shrink facilities provided in MSSMS.... Still no effect.
I have backed up the database and restored it... Still no effect.
Still 1.9Gb
Why?
Whatever procedure I eventually find needs to be replayable on a client machine with access to nothing other than OSql or similar.
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE   -- switch to simple recovery so the log can be truncated
GO
DBCC SHRINKFILE (MyDatabase_Log, 5)             -- shrink the log file down to 5 MB
GO
ALTER DATABASE MyDatabase SET RECOVERY FULL     -- switch back; take a full backup afterwards to restart the log chain
GO
This may seem bizarre, but it's worked for me and I have written a C# program to automate this.
Step 1: Truncate the transaction log (Back up only the transaction log, turning on the option to remove inactive transactions)
Step 2: Run a database shrink, moving all the pages to the start of the files
Step 3: Truncate the transaction log again, as step 2 adds log entries
Step 4: Run a database shrink again.
My stripped down code, which uses the SQL DMO library, is as follows:
SQLDatabase.TransactionLog.Truncate();
SQLDatabase.Shrink(5, SQLDMO.SQLDMO_SHRINK_TYPE.SQLDMOShrink_NoTruncate);
SQLDatabase.TransactionLog.Truncate();
SQLDatabase.Shrink(5, SQLDMO.SQLDMO_SHRINK_TYPE.SQLDMOShrink_Default);
This is an old question, but I just happened upon it.
The really short and correct answer is already given and has the most votes. That is how you shrink a transaction log, and that was probably the OP's problem. And when the transaction log has grown out of control, it often needs to be shrunk back, but care should be taken to prevent future situations of a log from growing out of control. This question on dba.se explains that. Basically - Don't let it get that large in the first place through proper recovery model, transaction log maintenance, transaction management, etc.
But the bigger question in my mind when reading this question about shrinking the data file (or even the log file) is: why? And what bad things can happen when you do? It appears as though shrink operations were done. Now in this case it makes some sense, because MSDE/Express editions are capped at a maximum DB size. But the right answer may be to look at the right edition for your needs. And if you stumble upon this question looking to shrink your production database and that isn't your reason, you should ask yourself that "why?" question.
I don't want someone searching the web for "how to shrink a database" coming across this and thinking it is a cool or acceptable thing to do.
Shrinking Data Files is a special task that should be reserved for special occasions. Consider that when you shrink a database, you are effectively fragmenting your indexes. Consider that when you shrink a database you are taking away the free space that a database may someday grow right back into - effectively wasting your time and incurring the performance hit of a shrink operation only to see the DB grow again.
I wrote about this concept in several blog posts about shrinking databases. The one called "Don't touch that shrink button" comes to mind first. I talk about the concepts outlined here, but also the concept of "right-sizing" your database. It is far better to decide what your database size needs to be, plan for future growth, and allocate it to that amount. With Instant File Initialization available in SQL Server 2005 and beyond for data files, the cost of growth is lower, but I still prefer a proper initial allocation, and I'm far less scared of white space in a database than I am of shrinking in general with no thought first. :)
DBCC SHRINKDATABASE works for me, but this is its full syntax:
DBCC SHRINKDATABASE ( database_name, [target_percent], [truncate] )
where target_percent is the desired percentage of free space left in the database file after the database has been shrunk.
And truncate parameter can be:
NOTRUNCATE
Causes the freed file space to be retained in the database files. If not specified, the freed file space is released to the operating system.
TRUNCATEONLY
Causes any unused space in the data files to be released to the operating system and shrinks the file to the last allocated extent, reducing the file size without moving any data. No attempt is made to relocate rows to unallocated pages. target_percent is ignored when TRUNCATEONLY is used.
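For example (the database name is a placeholder):

-- Shrink MyDB, leaving 10 percent free space in each file
DBCC SHRINKDATABASE (MyDB, 10);

-- Only release unused space at the end of the files, without moving any pages
DBCC SHRINKDATABASE (MyDB, TRUNCATEONLY);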
...and yes, no_one is right: shrinking a database is not very good practice, because, for example:
a shrink on data files is an excellent way to introduce significant logical fragmentation, because it moves pages from the end of the allocated range of a database file to somewhere at the front of the file...
shrinking a database can have a lot of consequences for the database and the server... think a lot about it before you do it!
On the web there are a lot of blogs and articles about it.
Late answer, but it might be useful for someone else.
If neither DBCC ShrinkDatabase/ShrinkFile nor SSMS (Tasks/Shrink/Database) helps, there are tools from Quest and ApexSQL that can get the job done, and even schedule periodic shrinking if you need it.
I've used the latter one in a free trial to do this some time ago, by following the short description at the end of this article:
https://solutioncenter.apexsql.com/sql-server-database-shrink-how-and-when-to-schedule-and-perform-shrinking-of-database-files/
All you need to do is install ApexSQL Backup, click "Shrink database" button in the main ribbon, select database in the window that will pop-up, and click "Finish".
You will also need to shrink the individual data files.
It is, however, not a good idea to shrink databases. For example, see here.
You should use:
dbcc shrinkdatabase (MyDB)
It will shrink the log file (keep a windows explorer open and see it happening).
Here's another solution: Use the Database Publishing Wizard to export your schema, security and data to sql scripts. You can then take your current DB offline and re-create it with the scripts.
Sounds kind of foolish, but there are a couple advantages. First, there's no chance of losing data. Your original db (as long as you don't delete your DB when dropping it!) is safe, the new DB will be roughly as small as it can be, and you'll have two different snapshots of your current database - one ready to roll, one minified - you can choose from to back up.
"Therefore it's reasonable to assume much space should now be retrievable."
Apologies if I misunderstood the question, but are you sure it's the database and not the log files that are using up the space? Check to see what recovery model the database is in. Chances are it's in Full, which means the log file is never truncated. If you don't need a complete record of every transaction, you should be able to change to Simple, which will truncate the logs. You can shrink the database during the process. Assuming things go right, the process looks like this (a scripted sketch follows the list):
Backup the database!
Change to Simple Recovery
Shrink db (right-click db, choose all tasks > shrink db -> set to 10% free space)
Verify that the space has been reclaimed, if not you might have to do a full backup
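A scripted sketch of those steps (the database name and backup path are placeholders):

BACKUP DATABASE MyDB TO DISK = 'C:\Backup\MyDB.bak';

ALTER DATABASE MyDB SET RECOVERY SIMPLE;

DBCC SHRINKDATABASE (MyDB, 10);   -- aim for roughly 10 percent free space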
If that doesn't work (or you get a message saying "log file is full" when you try to switch recovery modes), try this:
Backup
Kill all connections to the db
Detach db (right-click > Detach or right-click > All Tasks > Detach)
Delete the log (ldf) file
Reattach the db
Change the recovery mode
etc.
I came across this post even though I needed to SHRINKFILE on MSSQL 2012, which is a little trickier than in the 2000 or 2005 versions. After reading up on all the risks and issues involved, I ended up testing. Long story short, the best results I got were from using MS SQL Server Management Studio.
Right-Click the DB -> TASKS -> SHRINK -> FILES -> select the LOG file
You also have to modify the minimum size of the data and log files. DBCC SHRINKDATABASE will shrink the data inside the files you already have allocated. To shrink a file to a size smaller than its minimum size, use DBCC SHRINKFILE and specify the new size.
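For example (the logical file name is a placeholder; check sys.database_files for the real one):

DBCC SHRINKFILE (MyDB_Data, 100);   -- target size in MB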
Delete data, make sure the recovery model is simple, then shrink (either shrink database or shrink files works). If the data file is still too big, AND you use heaps to store data -- that is, no clustered index on large tables -- then you might have this problem regarding deleting data from heaps: http://support.microsoft.com/kb/913399
I recently did this. I was trying to make a compact version of my database for testing on the road, but I just couldn't get it to shrink, no matter how many rows I deleted. Eventually, after trying many of the other commands in this thread, I found that my clustered indexes were not getting rebuilt after deleting rows. Rebuilding my indexes made it so I could shrink properly.
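Something along these lines (table and database names are placeholders):

ALTER INDEX ALL ON dbo.BigTable REBUILD;
DBCC SHRINKDATABASE (MyDB, 10);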
Not sure how practical this would be, and depending on the size of the database, number of tables and other complexities, but I:
defrag the physical drive
create a new database according to my requirements, space, percentage growth, etc
use the simple ssms task to import all tables from the old db to the new db
script out the indexes for all tables on the old database, and then recreate the indexes on the new database. expand as needed for foreign keys etc.
rename databases as needed, confirm successful, delete old
I think you can remove all your log by switching from full to simple recovery. Right-click your database, select Properties, then Options, and change:
Recovery mode to Simple
Containment type to None
When you've set the recovery model to Simple (and enabled auto-shrink), it is still possible that SQL Server can not shrink the log. It has to do with checkpoints in the log (or lack thereof).
So first run
DBCC CHECKDB
on your database. After that the shrink operation should work like a charm.
Usually I use the Tasks>Shrink>Files menu and choose the logfile with the option to reorganise pages.

Resources