read MSSQL tables in disk order - sql-server

How to take advantage of Disk IO queueing
I need to do exactly this but on Microsoft SQL server tables.
I have a database with 100+ tables. I need to read every record of every table.
Any suggestions? Is it worth writing code, benchmarking, and debugging if it only saves a few seconds?
It would be really nice if I could tell where each table resides on disk.
And because somebody will ask: yes, this is a bottleneck in my program.

That other answer is irrelevant for SQL Server; SQL Server does IO its own way.
Some pointers though:
Ensure every table has a clustered index
Ensure you have regular index maintenance
Ensure you have good disks underneath (RAID etc)
Use Enterprise edition for read ahead functionality if it is that critical
Ensure you have plenty of RAM
What does this mean?
If you want to use SQL Server Express on a single workstation disk, then don't bother; you can't optimise this.
Having clustered indexes and index maintenance ensures that data is mostly contiguous on disk (subject to subsequent data changes)
A proxy for how long this will take would be to run DBCC CHECKDB on all tables, or ALTER INDEX ... REBUILD. Both require all data to be read from disk for all tables.
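For example (a rough sketch; 'MyDb' and 'dbo.MyTable' are placeholder names, not from the question):

-- Reads every allocated page of every table while checking consistency:
DBCC CHECKDB ('MyDb') WITH NO_INFOMSGS;

-- Or rebuild all indexes, one table at a time:
ALTER INDEX ALL ON dbo.MyTable REBUILD;

Either command forces a full read of the data from disk, so its runtime is a reasonable upper bound for a full scan of the database.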

Related

SQL Server 2005 TempDB Size

We are using SQL Server 2005. Recently, SQL Server 2005 crashed in our production environment due to a large tempdb size.
1) What could be the reason for a large tempdb size?
2) Is there any way to look at what data is in tempdb?
2) Is there any way to look at what data is in tempdb?
No, because it is not kept there. Tempdb gets very special treatment, like being dropped and recreated on every server restart.
1) What could be the reason for a large tempdb size?
Inefficient SQL, maintenance jobs, or just the data at hand. Obviously an 800GB or 6000GB database may require more tempdb space than a 4GB online CRM attempt. You don't really specify ANY size in absolute terms. What IS large? I have tempdb databases hardcoded at 64GB on my smaller servers.
Typical SQL that goes into tempdb includes:
Sorts that are not solvable as part of the query (you need to store keys SOMEWHERE)
DISTINCT. Needs all returned data in tempdb to find duplicates.
Certain operations, possibly during joins.
Explicit tempdb usage (temporary tables). I just mention them because I often keep a few hundred megabytes' worth of data in them during loads and scrubbing.
In general, you can find those queries by looking for huge IO stats in the query log, or by them simply being slow.
That said, maintenance plans also go in there, but with reason. In the end, your "large" is possibly my "not even worth mentioning". It really depends what you do. Use the query trace tool to find out what takes long.
Physically, tempdb gets very special treatment - SQL Server does NOT write to the file if it does not have to (i.e. it keeps things in memory). Writes to the disc are a sign of memory overflowing. This is different from normal db write behavior. Tempdb, IF it overflows, is best put onto a decently fast SSD... which won't normally be so expensive, because it will still be relatively small.
Use the query here to find other queries for tempdb - basically you are fishing in dirty water here; you need to try things until you find the culprit.
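If you don't have that query to hand, something along these lines (a sketch using the standard DMVs, available from SQL Server 2005 on) shows which sessions currently hold the most tempdb space:

-- sys.dm_db_session_space_usage tracks tempdb page allocations per session.
SELECT s.session_id,
       su.user_objects_alloc_page_count,
       su.internal_objects_alloc_page_count
FROM sys.dm_db_session_space_usage AS su
JOIN sys.dm_exec_sessions AS s ON s.session_id = su.session_id
ORDER BY su.user_objects_alloc_page_count
       + su.internal_objects_alloc_page_count DESC;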
The usual way to grow a SQL Server database - any database, not just tempdb - is to have its data and log files set to autogrow (especially the log files). SQL Server is perfectly happy to grow the log and data files until they consume all the disk space available to them.
Best practice, IMHO, is to allow limited autogrowth on the data files (put an upper bound on how big it can grow) and fix the size of the log files. You might need to do some analysis to figure out how big the log files need to be. For tempdb, especially, the recovery model should be set to simple, too.
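A sketch of that setup for a user database (database and logical file names are placeholders; check sys.database_files for the real logical names first):

-- Cap data file growth, fix the log size, and switch to simple recovery.
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Data, FILEGROWTH = 256MB, MAXSIZE = 50GB);
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Log, SIZE = 4GB, FILEGROWTH = 0);
ALTER DATABASE MyDb SET RECOVERY SIMPLE;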
OK, tempdb is a kind of special database. Any temporary objects you use in procedures etc. are created here. So if your application uses a lot of temp tables in queries, they will all reside here, but they should clean themselves up after the connection (spid) is reset.
The other thing that can grow tempdb is database maintenance tasks; however, those will take a larger toll on the database log files.
Tempdb is also cleared every time you restart the SQL service; it basically drops the database and re-creates it. I agree with @Nic about leaving tempdb as it is: don't muck around with it. Any issues with space in tempdb usually indicate another, larger problem somewhere else. More space will mask the problem, but only for so long. How much free space does the drive that tempdb is on have?
Another thing: if you haven't already, try to put tempdb on its own drive, and if possible also have the data and the log files on their own separate drives.
So, if you don't restart your SQL Server service, your drive could run out of space pretty soon.
USE tempdb;
SELECT (size * 8) AS FileSizeKB FROM sys.database_files;

Is it possible to create fast (in-memory, non-ACID, etc) tables/databases in SQL Server?

In SQLite, there's an option to create an in-memory database, and another to not wait for things to be written to the filesystem, and to put the journal in memory or disable it. Are there any settings like this for SQL Server?
My use case is storage for data that should persist for about a day in normal use, but it wouldn't be a big deal if it was lost. I would use something like memcached for it, but I want to be able to control the cache time, not just hope I have enough memory.
No.
tempdb has a bit less logging than regular databases, as it doesn't have to support the "D" (durability) in ACID or redo of transactions, but that's about it.
Yes as of MSSQL 2014.
There is a new feature in MSSQL 2014 named In-Memory OLTP.
For a detailed feature introduction:
http://technet.microsoft.com/en-us/library/dn133186(v=sql.120).aspx
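A minimal sketch of what it looks like (table and column names are made up, and the database needs a MEMORY_OPTIMIZED_DATA filegroup first). DURABILITY = SCHEMA_ONLY gives exactly the "in-memory, non-ACID" behaviour asked about: the table definition survives a restart, its data does not.

-- Assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup.
CREATE TABLE dbo.DayCache (
    Id INT IDENTITY(1,1) NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    Payload NVARCHAR(1000) NULL,
    ExpiresAt DATETIME2 NOT NULL
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);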
Not really. You can do something like this in SQL Server by implementing a custom solution through the SQLCLR. You can use temp tables or table variables too, but these will still write to disk. You can improve performance (by reducing blocking) but break consistency by using a different isolation level such as READ UNCOMMITTED.
In brief if you really want what you ask, SQLCLR is the solution.
You could store the table on a ramdisk. That way it would always be in memory.
However, I would first try a normal table. SQL Server does a pretty good job of caching tables in memory.
Table variables:
DECLARE @name TABLE (id int IDENTITY(1,1), ...);
Table variables are kept in memory and not logged. Under memory pressure, they can spill to tempdb. However, because they are restricted to the scope of a batch execution, it would be hard (but not impossible) to store data in them for 'about a day'. I would definitely not recommend an in-memory, non-ACID solution based on SQL Server table variables. But, as Martin already pointed out, real tables in tempdb are a viable alternative for improving latency. You can achieve similar results on durable DBs too, with proper transaction management (batch commit) and file placement (a dedicated high-throughput log disk).
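For illustration, a hypothetical cache table created directly in tempdb (all names here are made up). It survives across connections but vanishes when the service restarts, which roughly matches the "persist for about a day, OK to lose" requirement:

USE tempdb;
GO
-- A permanent (non-#) table in tempdb persists until the next service restart.
CREATE TABLE dbo.LookupCache (
    CacheKey VARCHAR(100) NOT NULL PRIMARY KEY,
    CacheVal VARCHAR(8000) NULL,
    CachedAt DATETIME NOT NULL DEFAULT GETDATE()
);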

SQL Server 2005 Memory Pressure and tempdb writes problem

We are having some issues with our production SQL Server.
Server: Dual Quad Core Xeon
8 GB RAM
Single RAID 10 Array
Windows 2003 Server 64-bit
SQL Server 2005 Standard 64-Bit
There is about 250MB of free RAM on the machine right now. SQL Server has around 6GB of RAM allocated, and our monitoring software says that only half of the RAM allocated to SQL Server is actually being used.
Our main database is approximately 20GB, with about 12GB being used with any frequency. Our tempdb is at 700MB. Both are located on the same physical disk array.
Additionally, using Filemon, I was able to see that the tempdb file had hundreds or thousands of writes of length 65536. Disk queue length was over 100 nearly 80% of the time.
So, here are my questions-
What would cause all those writes on the tempdb? I'm not sure if we have always had that much activity, but it seems excessive and these problems are recent.
Should I just add more memory to the server?
On high load servers, should tempdb and db files be located on separate arrays?
A high disk queue length does not necessarily mean you have an I/O bottleneck; if you have a SAN or NAS, you may want to look at other, additional counters. Check out the SQL Server Urban Legends article for more details.
1: The following operations heavily utilize tempdb
Repeated create and drop of temporary tables (local or global)
Table variables that use tempdb for storage purposes
Work tables associated with CURSORS
Work tables associated with an ORDER BY clause
Work tables associated with a GROUP BY clause
Work files associated with HASH PLANS
These SQL Server 2005 features also use tempdb heavily:
row level versioning (snapshot isolation)
online index re-building
As mentioned in other SO answers read this article on best practice for increasing tempdb performance.
2: Looking at the amount of free RAM on the server (i.e. the WMI counter Memory -> Available MBytes) doesn't help, as SQL Server will cache data pages in RAM; any db server that's been running long enough will have little free RAM.
The counters you should look at that are more meaningful in telling you if adding RAM to the server will help are:
SQL Server Instance:Buffer Manager->Page Life Expectancy (in seconds)
A value below 300-400 seconds means pages are not staying in memory very long and data is continually being read in from disk. Servers with a low page life expectancy will benefit from additional RAM.
and
SQL Server Instance:Buffer Manager->Buffer Cache hit Ratio
This tells you the percentage of page requests that were served from RAM without incurring a read from disk; a cache hit ratio lower than 85% means the server will benefit from additional RAM.
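Both counters can also be read straight from a DMV, for example:

SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'
  AND counter_name IN ('Page life expectancy',
                       'Buffer cache hit ratio',
                       'Buffer cache hit ratio base');
-- The actual hit ratio is the 'ratio' value divided by the 'base' value, times 100.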
3: Yes, you can't go wrong here. Having tempdb on a separate set of disks is recommended. Look at this KB article under the heading "Moving the tempdb database" for how to do this.
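The gist of the move (it takes effect after a service restart; 'tempdev' and 'templog' are the default logical names, and the drive paths are placeholders):

ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\tempdb\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'L:\tempdb\templog.ldf');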
Yes, the recommendation on high load servers is to put TempDB on a separate set of drives from the user databases:
SQL Server 2005 Books Online: Optimizing tempdb Performance
Not directly an answer to your question, but this might be a good tip: restarting your SQL Server instance will clear tempdb, which can be a good starting point when investigating what is being done in tempdb.
Excellent question, +1
tempdb is used far more heavily in SQL 2005+.
At least: snapshot isolation levels, online index rebuild, and reading INSERTED/DELETED in triggers (which used to read the log file!).
This is in addition to the usual ORDER BY clauses, temp tables, etc.
You'd probably be better off splitting your log and data files (also for recoverability).
More memory is always good, but see the 64-bit specific notes from Grumpy Old DBA below.
Finally, and maybe most importantly, you can have contention on space allocation in tempdb:
Explanations from Linchi Shea and SQL Server storage team
Late edit:
Paul Randall added an entry "Comprehensive tempdb blog post series" which offers good links
Writes to tempdb can be anything: internal hash tables, temp tables, table variables, stored procedure calls, etc.
If you only have 250 Megs of free RAM, then yes more RAM would be good.
It is always recommended that you split tempdb and user databases to different disks.
All writes to tempdb will be 64KB in size, as that's the size of a database extent (eight 8KB pages).

Best Way To Prepare A Read-Only Database

We're taking one of our production databases and creating a copy on another server for read-only purposes. The read-only database is on SQL Server 2008. Once the database is on the new server we'd like to optimize it for read-only use.
One problem is that there are large amounts of allocated space for some of the tables that are unused. Another problem I would anticipate would be fragmentation of the indexes. I'm not sure if table fragmentation is an issue.
What are the issues involved and what's the best way to go about this? Are there stored procedures included with SQL Server that will help? I've tried running DBCC SHRINKDATABASE, but that didn't deallocate the unused space.
EDIT: The exact command I used to shrink the database was
DBCC SHRINKDATABASE (dbname, 0)
GO
It ran for a couple hours. When I checked the table space using sp_spaceused, none of the unused space had been deallocated.
There are a couple of things you can do:
First -- don't worry about absolute allocated DB size unless you're running short on disk.
Second -- Idera has a lot of cool SQL Server tools, one of them defrags the DB.
http://www.idera.com/Content/Show27.aspx
Third -- dropping and re-creating the clustered index essentially defrags the tables, too -- and it re-creates all of the non-clustered indexes (defragging them as well). Note that this will probably EXPAND the allocated size of your database (again, don't worry about it) and take a long time (clustered index rebuilds are expensive).
One thing you may wish to consider is changing the recovery model of the database to simple. If you do not intend to perform any write activity against the database, you may as well benefit from automatic truncation of the transaction log and eliminate the administrative overhead of the other recovery models. You can always perform ad-hoc backups should you make any significant structural changes, i.e. to indexes.
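Something like this ('ReportCopy' is a placeholder database name); marking the copy READ_ONLY as well removes locking overhead for readers:

ALTER DATABASE ReportCopy SET RECOVERY SIMPLE;
-- Requires no active connections at the time it runs.
ALTER DATABASE ReportCopy SET READ_ONLY;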
You may also wish to place the tables that are unused in a separate Filegroup away from the data files that will be accessed. Perhaps consider placing the unused tables on lower grade disk storage to benefit from cost savings.
Some things to consider with DBCC SHRINKDATABASE: you cannot shrink beyond the minimum size of your database.
Try issuing the statement in the following form.
DBCC SHRINKDATABASE (DBName, TRUNCATEONLY);
Cheers, John
I think it will be OK to just recreate it from the backup.
Putting tables and indexes on separate physical disks always helps too. Indexes will be rebuilt from scratch when you recreate them on another filegroup, and therefore won't be fragmented.
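One way to do that in a single step (index, table, and filegroup names here are made up) is CREATE INDEX with DROP_EXISTING, which rebuilds the index onto the target filegroup:

CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId)
WITH (DROP_EXISTING = ON)
ON [FG_Indexes];  -- rebuild lands on this filegroup, fully defragmented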
There is a tool for shrinking or truncating a database in MSSQL Server. I think you select the properties of the database and you'll find it. This can be done before or after you copy the backup.
Certain forms of replication may do what you wish also.

Load readonly database tables into memory

In one of my applications I have a 1GB database table that is used for reference data. There is a huge amount of reads coming off that table, but there are no writes ever. I was wondering if there's any way that data could be loaded into RAM so that it doesn't have to be accessed from disk?
I'm using SQL Server 2005
If you have enough RAM, SQL will do an outstanding job determining what to load into RAM and what to seek on disk.
This question is asked a lot and it reminds me of people trying to manually set which "core" their process will run on -- let the OS (or in this case the DB) do what it was designed for.
If you want to verify that SQL is in fact reading your look-up data out of cache, then you can initiate a load test and use Sysinternals FileMon, Process Explorer, and Process Monitor to verify that the 1GB table is not being read from disk. For this reason, we sometimes put our "lookup" data onto a separate filegroup so that it is very easy to monitor when it is being accessed on disk.
Hope this helps.
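As a more direct, in-engine check, a query along these lines (standard DMVs, SQL Server 2005 and up) shows how many buffer pool pages each object in the current database occupies, i.e. how much of your 1GB table is actually cached:

SELECT OBJECT_NAME(p.object_id) AS table_name,
       COUNT(*) AS cached_pages,
       COUNT(*) * 8 / 1024 AS cached_mb   -- 8KB pages to MB
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au ON au.allocation_unit_id = bd.allocation_unit_id
JOIN sys.partitions AS p ON p.hobt_id = au.container_id AND au.type IN (1, 3)
WHERE bd.database_id = DB_ID()
GROUP BY p.object_id
ORDER BY cached_pages DESC;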
You're going to want to take a look at memcached. It's what a lot of huge (and well-scaled) sites used to handle problems just like this. If you have a few spare servers, you can easily set them up to keep most of your data in memory.
http://en.wikipedia.org/wiki/Memcached
http://www.danga.com/memcached/
http://www.socialtext.net/memcached/
Just to clarify the issue for SQL 2005 and up:
This functionality was introduced for performance in SQL Server version 6.5. DBCC PINTABLE has highly unwanted side-effects. These include the potential to damage the buffer pool. DBCC PINTABLE is not required and has been removed to prevent additional problems. The syntax for this command still works but does not affect the server.
DBCC PINTABLE would explicitly pin a table in memory to make sure it remained cached, but as quoted above it no longer has any effect in SQL 2005 and up.
