Can I have a primary key and a separate clustered index together? - database

Let's assume I already have a primary key, which ensures uniqueness. My primary key also serves as the ordering index for the records. However, I am curious about the primary key's role (if any) in the physical ordering of records on disk. And the actual question is: can I have a separate clustered index for these records?

This is an attempt at testing the size and performance characteristics of a covering secondary index on a clustered table, as per discussion with @Catcall.
All tests were done on MS SQL Server 2008 R2 Express (inside a fairly underpowered VM).
Size
First, I created a clustered table with a secondary index and filled it with some test data:
CREATE TABLE THE_TABLE (
    FIELD1 int,
    FIELD2 int NOT NULL,
    CONSTRAINT THE_TABLE_PK PRIMARY KEY (FIELD1)
);
CREATE INDEX THE_TABLE_IE1 ON THE_TABLE (FIELD2) INCLUDE (FIELD1);
DECLARE @COUNT int = 1;
WHILE @COUNT <= 1000000 BEGIN
    INSERT INTO THE_TABLE (FIELD1, FIELD2) VALUES (@COUNT, @COUNT);
    SET @COUNT = @COUNT + 1;
END;
EXEC sp_spaceused 'THE_TABLE';
The last line gave me the following result...
name       rows     reserved  data      index_size  unused
THE_TABLE  1000000  27856 KB  16808 KB  11008 KB    40 KB
So, the index's B-Tree (11008 KB) is actually smaller than the table's B-Tree (16808 KB).
Speed
I generated a random number within the range of the data in the table, and then used it as criteria for selecting a whole row from the table. This was repeated 10000 times and the total time measured:
DECLARE @I int = 1;
DECLARE @F1 int;
DECLARE @F2 int;
DECLARE @END_TIME DATETIME2;
DECLARE @START_TIME DATETIME2 = SYSDATETIME();
WHILE @I <= 10000 BEGIN
    SELECT @F1 = FIELD1, @F2 = FIELD2
    FROM THE_TABLE
    WHERE FIELD1 = (SELECT CEILING(RAND() * 1000000));
    SET @I = @I + 1;
END;
SET @END_TIME = SYSDATETIME();
SELECT DATEDIFF(millisecond, @START_TIME, @END_TIME);
The last line produces an average time (of 10 measurements) of 181.3 ms.
When I change the query condition to: WHERE FIELD2 = ..., so the secondary index is used, the average time is 195.2 ms.
Execution plans:
So the performance (of selecting on the PK versus on the covering secondary index) seems to be similar. For much larger amounts of data, I suspect the secondary index could possibly be slightly faster (since it seems more compact and therefore cache-friendly), but I didn't hit that yet in my testing.
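For anyone reproducing this, comparing logical reads is a quick way to see the two access paths side by side. A minimal sketch against the test table above (500000 is just an arbitrary probe value):
SET STATISTICS IO ON;

-- Should show a seek on the clustered PK
SELECT FIELD1, FIELD2 FROM THE_TABLE WHERE FIELD1 = 500000;

-- Should show a seek on the covering secondary index THE_TABLE_IE1
SELECT FIELD1, FIELD2 FROM THE_TABLE WHERE FIELD2 = 500000;

SET STATISTICS IO OFF;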
String Measurements
Using varchar(50) as type for FIELD1 and FIELD2 and inserting strings that vary in length between 22 and 28 characters gave similar results.
The sizes were:
name       rows     reserved   data       index_size  unused
THE_TABLE  1000000  208144 KB  112424 KB  95632 KB    88 KB
And the average timings were: 254.7 ms for searching on FIELD1 and 296.9 ms for FIELD2.
Conclusion
If a clustered table has a covering secondary index, that index will have space and time characteristics similar to the table itself (possibly slightly slower, but not by much). In effect, you'll have two B-Trees that sort their data differently, but are otherwise very similar, achieving your goal of having a "second cluster".

It depends on your dbms. Not all of them implement clustered indexes. Those that do are liable to implement them in different ways. As far as I know, every platform that implements clustered indexes also provides ways to choose which columns are in the clustered index, although often the primary key is the default.
In SQL Server, you can create a nonclustered primary key and a separate clustered index like this.
create table test (
test_id integer primary key nonclustered,
another_column char(5) not null unique clustered
);
I think that the closest thing to this in Oracle is an index organized table. I could be wrong. It's not quite the same as creating a table with a clustered index in SQL Server.
You can't have multiple clustered indexes on a single table in SQL Server. A table's rows can only be stored in one order at a time. Actually, I suppose you could store rows in multiple, distinct orders, but you'd have to essentially duplicate all or part of the table for each order. (Although I didn't know it at the time I wrote this answer, DB2 UDB supports multiple clustered indexes, and it's quite an old feature. Its design and implementation is quite different from SQL Server.)
A primary key's job is to guarantee uniqueness. Although that job is often done by creating a unique index on the primary key column(s), strictly speaking uniqueness and indexing are two different things with two different aims. Uniqueness aims for data integrity; indexing aims for speed.
A primary key declaration isn't intended to give you any information about the order of rows on disk. In practice, it usually gives you some information about the order of index entries on disk. (Because primary keys are usually implemented using a unique index.)
If you SELECT rows from a table that has a clustered index, you still can't be assured that the rows will be returned to the user in the same order that they're stored on disk. Loosely speaking, the clustered index helps the query optimizer find rows faster, but it doesn't control the order in which those rows are returned to the user. The only way to guarantee the order in which rows are returned to the user is with an explicit ORDER BY clause. (This seems to be a fairly frequent point of confusion. A lot of people seem surprised when a bare SELECT on a clustered index doesn't return rows in the order they expect.)
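To make the last point concrete, here is a minimal sketch against the test table defined above; only the second query has a guaranteed result order, even though another_column is the clustered key:
SELECT test_id, another_column FROM test;                          -- order not guaranteed
SELECT test_id, another_column FROM test ORDER BY another_column;  -- order guaranteed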

Related

Will a table with an int primary key outperform its uid equivalent?

I am working with a legacy SQL Server database which uses UNIQUEIDENTIFIER and am considering performance. If I have two tables, identical except for the identity column, something like this:
CREATE TABLE [uidExampleTable] (
[exampleUid] UNIQUEIDENTIFIER CONSTRAINT [DF_uidExampleTable_uid] DEFAULT (newid()) NOT NULL,
[name] VARCHAR (50) NOT NULL,
[createdDate] DATETIME NOT NULL,
CONSTRAINT [PK_uidExampleTable] PRIMARY KEY CLUSTERED ([exampleUid] ASC));
CREATE TABLE [intExampleTable] (
[exampleIntId] INT IDENTITY (1, 1) NOT NULL,
[name] VARCHAR (50) NOT NULL,
[createdDate] DATETIME NOT NULL,
CONSTRAINT [PK_intExampleTable] PRIMARY KEY CLUSTERED ([exampleIntId] ASC));
And I fill these tables with, say, ten million rows each, then perform a select on each:
Select top 20 * from uidExampleTable order by createdDate desc
Select top 20 * from intExampleTable order by createdDate desc
Would you expect the second query on intExampleTable to return results more quickly?
Both tables have an index. Whether or not there is an index on the table is determined by the PRIMARY KEY directive, rather than the type of the key field.
However, these indexes won't help those queries for either table.
There are still some performance differences, though. The UNIQUEIDENTIFIER (hereafter UID, because I'm lazy) adds an extra 12 bytes for each row. Assuming the average name length is 10 characters out of a possible 50, that should work out to 38 bytes per row* on average for the int table and 50 bytes per row on average for the UID table, which is more than a 30% increase in row size.
So yes, that can make a difference over 10 million records. Keep in mind, though, for many tables you'll have a lot more data in the table, and the relative difference starts to diminish as the width of the table increases.
The other place you'll have a performance difference is INSERT statements. With an IDENTITY column, an INSERT is naturally already in primary key order and new records are simply appended to the end of the last page (or the beginning of a new page, if the last page was full). A UID, though, is more random, where you usually need to insert into the middle of a page somewhere. You can offset this a bit by changing the FILL FACTOR for your index, but that comes at the cost of needing more pages. This is one reason we also have sequential UIDs.
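As a sketch of that last point: SQL Server generates sequential GUIDs with NEWSEQUENTIALID(), which can only be used in a column default. The table below is a hypothetical variant of uidExampleTable, not something from the question:
CREATE TABLE [seqUidExampleTable] (
[exampleUid] UNIQUEIDENTIFIER CONSTRAINT [DF_seqUidExampleTable_uid] DEFAULT (newsequentialid()) NOT NULL,
[name] VARCHAR (50) NOT NULL,
[createdDate] DATETIME NOT NULL,
CONSTRAINT [PK_seqUidExampleTable] PRIMARY KEY CLUSTERED ([exampleUid] ASC));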
Even so, these differences tend to be small compared to other factors. Sometimes they can be important, but you generally need to measure your system's performance first to know for sure.
For example, for this query, rather than worrying about UID vs INT for the key, you can really improve things by adding a descending index on the createdDate column. That said, if you know you could have more than 4 billion rows, or if it would be dangerous for people to be able to guess an ID and retrieve a valid record, don't let a little bit of performance outweigh those concerns.
* 14 bytes row overhead + 4 bytes int ID + 2 bytes varchar overhead + 10 bytes varchar data + 8 bytes datetime = 38 bytes total
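And to illustrate the createdDate suggestion above, a minimal sketch (the index name is mine):
CREATE NONCLUSTERED INDEX [IX_intExampleTable_createdDate]
ON [intExampleTable] ([createdDate] DESC);
With that index in place, the TOP 20 query can read just the first 20 index entries and look up the remaining columns, instead of sorting all ten million rows.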
Yes it will, except when you have thousands of inserts per second and your storage cannot handle it; then you will get contention on the writes.

Find distinct values in SQL table without scan

A SQL Server table with >200 million records is divided into ~100 partitions (not true SQL Server partitions - it's not running on a compatible edition of SQL Server) by adding a column PartitionID. PartitionID is the first half of the table's clustered index definition; the other half is a unique auto-incrementing integer ID. PartitionID is also a foreign key into the Partition table. No record from Example is ever accessed without knowing its PartitionID; records are usually accessed in ranges associated with a single PartitionID (or a small number of PartitionIDs).
CREATE TABLE Example (
ID BIGINT IDENTITY(1, 1) NOT NULL,
PartitionID DECIMAL(18, 0) NOT NULL,
-- Other columns omitted for brevity
CONSTRAINT PK_Example PRIMARY KEY NONCLUSTERED (ID),
CONSTRAINT FK_Example_Partition FOREIGN KEY (PartitionID) REFERENCES Partition (ID)
)
CREATE UNIQUE CLUSTERED INDEX IX_Example ON Example(PartitionID, ID)
Partition rows are kept indefinitely, but Example rows are frequently purged by issuing a DELETE statement against a range with the same PartitionID. Over time, this leads to Partition rows that are not referenced by any Example rows. This is not the problem; the problem is identifying the Partition rows that are still referenced.
Without resorting to user-level management techniques like adding and manually maintaining a ReferenceCount field in the Partition table, or adding and manually maintaining a list of in-use PartitionIDs, is there a system-level technique we could use to discover the set of PartitionIDs that are still in use - without scanning all the rows in table Example?
SELECT DISTINCT PartitionID FROM Example
The above query takes tens of seconds to return 100 values because it's scanning 100s of millions of rows in the clustered index. Adding another very narrow index on PartitionID alone might reduce the I/O and halve the time, but essentially SQL Server is still scanning that index too.
CREATE NONCLUSTERED INDEX IX_Example_PartitionID ON Example(PartitionID)
I should probably also avoid joining Partition with Example (performing a number of clustered index seeks instead of an index scan) because the number of seeks will increase (and decrease performance) over time.
SELECT p.ID FROM Partition p WHERE EXISTS (
SELECT TOP 1 1 FROM Example e WHERE p.ID = e.PartitionID
)
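A well-known workaround, sketched here under the question's schema rather than taken from it, is a recursive "skip scan" that performs one index seek per distinct value, so its cost scales with the number of PartitionIDs in use rather than with the row count:
WITH DistinctPartitions AS (
    -- Anchor: the smallest PartitionID in the table (one seek)
    SELECT MIN(PartitionID) AS PartitionID FROM Example
    UNION ALL
    -- Recursive step: the next PartitionID above the previous one (one seek each)
    SELECT (SELECT MIN(e.PartitionID)
            FROM Example e
            WHERE e.PartitionID > dp.PartitionID)
    FROM DistinctPartitions dp
    WHERE dp.PartitionID IS NOT NULL
)
SELECT PartitionID
FROM DistinctPartitions
WHERE PartitionID IS NOT NULL
OPTION (MAXRECURSION 0);
Each step becomes a single seek on the (PartitionID, ID) clustered index (or on the narrow IX_Example_PartitionID, if created), so ~100 distinct values cost roughly 100 seeks instead of a 200-million-row scan.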

Why does SQL Server use an Index Scan instead of a Seek + RID lookup?

I have a table with approx. 135M rows:
CREATE TABLE [LargeTable]
(
[ID] UNIQUEIDENTIFIER NOT NULL,
[ChildID] UNIQUEIDENTIFIER NOT NULL,
[ChildType] INT NOT NULL
)
It has a non-clustered index with no included columns:
CREATE NONCLUSTERED INDEX [LargeTable_ChildID_IX]
ON [LargeTable]
(
[ChildID] ASC
)
(It is clustered on ID).
I wish to join this against a temporary table which contains a few thousand rows:
CREATE TABLE #temp
(
ChildID UNIQUEIDENTIFIER PRIMARY KEY,
ChildType INT
)
...add #temp data...
SELECT lt.ChildID, lt.ChildType
FROM #temp t
INNER JOIN [LargeTable] lt
ON lt.[ChildID] = t.[ChildID]
However the query plan includes an index scan on the large table:
If I change the index to include extra columns:
CREATE NONCLUSTERED INDEX [LargeTable_ChildID_IX] ON [LargeTable]
(
[ChildID] ASC
)
INCLUDE ([ChildType])
Then the query plan changes to something more sensible:
So my question is: Why can't SQL Server still use an index seek in the first scenario, but with a RID lookup to get from the non-clustered index to the table data? Surely that would be more efficient than an index scan on such a large table?
The first query plan actually makes a lot of sense. Remember that SQL Server never reads records, it reads pages. In your table, a page contains many records, since those records are so small.
With the original index, if the second query plan were used: after finding all the RIDs in the index (reading index pages to do so), pages in the clustered index would need to be read to fetch the ChildType column. In the worst case, that is an entire page read for each record. As there are many records per page, that might boil down to reading a large percentage of the pages in the clustered index.
SQL Server estimated, based on statistics, that simply scanning the pages of the clustered index would require fewer page reads in total, because it then avoids reading the pages of the non-clustered index.
What matters here is the number of rows in the temp table compared to the number of pages in the large table. Assuming a random distribution of ChildID in the large table, as soon as the number of rows in the temp table approaches or exceeds the number of pages in the large table, SQL Server will have to read virtually every page in the large table anyway.
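If you want to test the optimizer's arithmetic rather than trust it, on SQL Server 2008 and later you can force the rejected plan shape with a table hint and compare the IO statistics yourself; a sketch using the tables above:
SELECT lt.ChildID, lt.ChildType
FROM #temp t
INNER JOIN [LargeTable] lt WITH (FORCESEEK)
ON lt.[ChildID] = t.[ChildID];
If the scan really was the cheaper choice, the forced seek-plus-lookup version will show substantially more reads.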
Because the column ChildType isn't covered in an index, it has to go back to the clustered index (with the mentioned Row IDentifier lookup) to get the values for ChildType.
When you INCLUDE this column in the nonclustered index it will be added to the leaf-level of the index where it is available for querying.
Colloquially, this is called "the index tipping point": the point at which the cost-based optimizer considers a scan more effective than a seek + lookup. It's usually around 20% of the table size, which in your case will be based on an estimate coming from the #temp table stats. YMMV.
You already have your answer: include the required column, make the index covering.

Querying minimum value in SQL Server is a lot longer than querying all the rows

I'm currently confronted with strange behaviour in my database when querying the minimum ID for a specific date in a table containing about a hundred million rows. The query is quite simple:
SELECT MIN(Id) FROM Connection WITH(NOLOCK) WHERE DateConnection = '2012-06-26'
This query never ends; at least, I let it run for hours. The DateConnection column is not indexed, nor is it included in any index. So I would understand if this query took quite a while. But I tried the following query, which runs in a few seconds:
SELECT Id FROM Connection WITH(NOLOCK) WHERE DateConnection = '2012-06-26'
It returns 300k rows.
My table is defined as this :
CREATE TABLE [dbo].[Connection](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[DateConnection] [datetime] NOT NULL,
[TimeConnection] [time](7) NOT NULL,
[Hour] AS (datepart(hour,[TimeConnection])) PERSISTED NOT NULL,
CONSTRAINT [PK_Connection] PRIMARY KEY CLUSTERED
(
[Hour] ASC,
[Id] ASC
)
)
And it has the following index :
CREATE UNIQUE NONCLUSTERED INDEX [IX_Connection_Id] ON [dbo].[Connection]
(
[Id] ASC
)ON [PRIMARY]
One solution I found, exploiting this strange behaviour, is the following code. But it seems quite heavyweight for such a simple query.
create table #TempId
(
[Id] bigint
)
go
insert into #TempId
select id from partitionned_connection with(nolock) where dateconnection = '2012-06-26'
declare @displayId bigint
select @displayId = min(Id) from #TempId
print @displayId
go
drop table #TempId
go
Has anybody been confronted with this behaviour, and what is its cause? Is the minimum aggregate scanning the entire table? And if that is the case, why does the simple select not?
The root cause of the problem is the non-aligned nonclustered index, combined with the statistical limitation Martin Smith points out (see his answer to another question for details).
Your table is partitioned on [Hour] along these lines:
CREATE PARTITION FUNCTION PF (integer)
AS RANGE RIGHT
FOR VALUES (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23);
CREATE PARTITION SCHEME PS
AS PARTITION PF ALL TO ([PRIMARY]);
-- Partitioned
CREATE TABLE dbo.Connection
(
Id bigint IDENTITY(1,1) NOT NULL,
DateConnection datetime NOT NULL,
TimeConnection time(7) NOT NULL,
[Hour] AS (DATEPART(HOUR, TimeConnection)) PERSISTED NOT NULL,
CONSTRAINT [PK_Connection]
PRIMARY KEY CLUSTERED
(
[Hour] ASC,
[Id] ASC
)
ON PS ([Hour])
);
-- Not partitioned
CREATE UNIQUE NONCLUSTERED INDEX [IX_Connection_Id]
ON dbo.Connection
(
Id ASC
)ON [PRIMARY];
-- Pretend there are lots of rows
UPDATE STATISTICS dbo.Connection WITH ROWCOUNT = 200000000, PAGECOUNT = 4000000;
The query and execution plan are:
SELECT
MinID = MIN(c.Id)
FROM dbo.Connection AS c WITH (READUNCOMMITTED)
WHERE
c.DateConnection = '2012-06-26';
The optimizer takes advantage of the index (ordered on Id) to transform the MIN aggregate to a TOP (1) - since the minimum value will by definition be the first value encountered in the ordered stream. (If the nonclustered index were also partitioned, the optimizer would not choose this strategy since the required ordering would be lost).
The slight complication is that we also need to apply the predicate in the WHERE clause, which requires a lookup to the base table to fetch the DateConnection value. The statistical limitation Martin mentions explains why the optimizer estimates it will only need to check 119 rows from the ordered index before finding one with a DateConnection value that will match the WHERE clause. The hidden correlation between DateConnection and Id values means this estimate is a very long way off.
In case you are interested, the Compute Scalar calculates which partition to perform the Key Lookup into. For each row from the nonclustered index, it computes an expression like [PtnId1000] = Scalar Operator(RangePartitionNew([dbo].[Connection].[Hour] as [c].[Hour],(1),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23))), and this is used as the leading key of the lookup seek. There is prefetching (read-ahead) on the nested loops join, but this needs to be an ordered prefetch to preserve the sorting required by the TOP (1) optimization.
Solution
We can avoid the statistical limitation (without using query hints) by finding the minimum Id for each Hour value, and then taking the minimum of the per-hour minimums:
-- Global minimum
SELECT
MinID = MIN(PerHour.MinId)
FROM
(
-- Local minimums (for each distinct hour value)
SELECT
MinID = MIN(c.Id)
FROM dbo.Connection AS c WITH(READUNCOMMITTED)
WHERE
c.DateConnection = '2012-06-26'
GROUP BY
c.[Hour]
) AS PerHour;
The execution plan is:
If parallelism is enabled, you will see a plan more like the following, which uses parallel index scan and multi-threaded stream aggregates to produce the result even faster:
Although it might be wise to fix the problem in a way that doesn't require index hints, a quick solution is this:
SELECT MIN(Id) FROM Connection WITH(NOLOCK, INDEX(PK_Connection)) WHERE DateConnection = '2012-06-26'
This forces a table scan.
Alternatively, try this although it probably produces the same problem:
select top 1 Id
from Connection
WHERE DateConnection = '2012-06-26'
order by Id
It makes sense that finding the minimum takes longer than going through all the records. Finding the minimum of an unsorted structure takes much longer than traversing it once (unsorted because MIN() doesn't take advantage of the identity column). What you could do, since you're using an identity column, is have a nested select, where you take the first record from the set of records with the specified date.
The NC index scan is the issue in your case. The plan uses a scan of the unique non-clustered index, and then for each row (a hundred million rows) it traverses the clustered index, which causes millions of IOs: say your index height is 4, it might cause 100 million * 4 IOs, plus the scan of the non-clustered index leaf pages. The optimizer must have chosen this index to avoid the stream aggregate needed to compute the minimum. There are three main techniques to find a minimum. The first uses an index on the column you want the MIN of (efficient when such an index exists, since no calculation is required: the first row found is returned). The second is a hash aggregate (which usually appears when you have GROUP BY). The third is a stream aggregate, which scans through all qualifying rows, always keeping the current minimum, and returns it once all rows are scanned.
However, the query without MIN used the clustered index scan, and is therefore fast: it has to read fewer pages, and thus incurs fewer IOs.
Now the question is why the optimizer picked the scan on the non-clustered index. I am sure it was to avoid the computation involved in the stream aggregate, but in this case not using the stream aggregate is much more costly. This depends on estimation, so I guess the stats on the table are not up to date.
So first of all, check whether your stats are up to date. When were the stats last updated?
Thus, to avoid the issue, do the following:
1. First, update the table stats; I am sure that should remove your issue.
2. If you cannot update stats, or updating stats doesn't change the plan and it still uses the NC index scan, then force the clustered index scan so that it uses fewer IOs, followed by a stream aggregate to get the min value.
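A minimal sketch of the first suggestion (table name taken from the question; FULLSCAN is optional, but gives the most accurate statistics):
UPDATE STATISTICS dbo.Connection WITH FULLSCAN;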

Big Table Advice (SQL Server)

I'm experiencing massive slowness when accessing one of my tables and I need some re-factoring advice. Sorry if this is not the correct area for this sort of thing.
I'm working on a project that aims to report on server performance statistics for our internal servers. I'm processing windows performance logs every night (12 servers, 10 performance counters and logging every 15 seconds). I'm storing the data in a table as follows:
CREATE TABLE [dbo].[log](
[id] [int] IDENTITY(1,1) NOT NULL,
[logfile_id] [int] NOT NULL,
[test_id] [int] NOT NULL,
[timestamp] [datetime] NOT NULL,
[value] [float] NOT NULL,
CONSTRAINT [PK_log] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH FILLFACTOR = 90 ON [PRIMARY]
) ON [PRIMARY]
There's currently 16,529,131 rows and it will keep on growing.
I access the data to produce reports and create graphs from coldfusion like so:
SET NOCOUNT ON
CREATE TABLE ##RowNumber ( RowNumber int IDENTITY (1, 1), log_id char(9) )
INSERT ##RowNumber (log_id)
SELECT l.id
FROM log l, logfile lf
WHERE lf.server_id = #arguments.server_id#
and l.test_id = #arguments.test_id#
and l.timestamp >= #arguments.report_from#
and l.timestamp < #arguments.report_to#
and l.logfile_id = lf.id
order by l.timestamp asc
select rn.RowNumber, l.value, l.timestamp
from log l, logfile lf, ##RowNumber rn
where lf.server_id = #arguments.server_id#
and l.test_id = #arguments.test_id#
and l.logfile_id = lf.id
and rn.log_id = l.id
and ((rn.rownumber % #modu# = 0) or (rn.rownumber = 1))
order by l.timestamp asc
DROP TABLE ##RowNumber
SET NOCOUNT OFF
(For non-CF devs: #value# inserts a value, and ## maps to #.)
I basically create a temporary table so that I can use the row number to select every x-th row. This way I'm only selecting the number of rows I can display. It helps, but it's still very slow.
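A side note: on SQL Server 2005 or later, the staging table could be avoided entirely with ROW_NUMBER(). A rough sketch against the same schema, with the ColdFusion parameters hard-coded for illustration:
SELECT RowNumber, value, [timestamp]
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY l.[timestamp] ASC) AS RowNumber,
        l.value, l.[timestamp]
    FROM log l
    INNER JOIN logfile lf ON l.logfile_id = lf.id
    WHERE lf.server_id = 1
      AND l.test_id = 2
      AND l.[timestamp] >= '2009-01-01' AND l.[timestamp] < '2009-02-01'
) numbered
WHERE RowNumber % 50 = 0 OR RowNumber = 1
ORDER BY [timestamp] ASC;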
SQL Server Management Studio tells me my indexes are as follows (I have pretty much no knowledge about using indexes properly):
IX_logfile_id (Non-Unique, Non-Clustered)
IX_test_id (Non-Unique, Non-Clustered)
IX_timestamp (Non-Unique, Non-Clustered)
PK_log (Clustered)
I would be very grateful to anyone who could give some advice that could help me speed things up a bit. I don't mind re-organising things and I have complete control of the project (perhaps not over the server hardware though).
Cheers (sorry for the long post)
Your problem is that you chose a bad clustered key. Nobody is ever interested in retrieving one particular log value by ID. If your system is like anything else I've seen, then all queries are going to ask for:
all counters for all servers over a range of dates
specific counter values over all servers for a range of dates
all counters for one server over a range of dates
specific counter for specific server over a range of dates
Given the size of the table, all your non-clustered indexes are useless. They are all going to hit the index tipping point, guaranteed, so they might just as well not exist. I assume all your non-clustered indexes are defined as a simple index over the field in their name, with no included fields.
I'm going to pretend I actually know your requirements. You must forget common sense about storage and actually duplicate all your data in every non-clustered index. Here is my advice:
Drop the clustered index on [id]; it is as useless as it gets.
Organize the table with a clustered index on (logfile_id, test_id, timestamp).
Non-clustered index on (test_id, logfile_id, timestamp) include (value)
NC index on (logfile_id, timestamp) include (value)
NC index on (test_id, timestamp) include (value)
NC index on (timestamp) include (value)
Add maintenance tasks to reorganize all indexes periodically as they are prone to fragmentation
The clustered index covers the query 'history of specific counter value at a specific machine'. The non-clustered indexes cover various other possible queries (all counters at a machine over time, a specific counter across all machines over time, etc.); a DDL sketch of this layout follows below.
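Purely as an illustration of the list above (the index names are mine, and the column lists match my pretend requirements, not necessarily yours):
ALTER TABLE [dbo].[log] DROP CONSTRAINT [PK_log];

CREATE CLUSTERED INDEX [IX_log_cluster]
    ON [dbo].[log] (logfile_id, test_id, [timestamp]);

CREATE NONCLUSTERED INDEX [IX_log_test_logfile]
    ON [dbo].[log] (test_id, logfile_id, [timestamp]) INCLUDE ([value]);

CREATE NONCLUSTERED INDEX [IX_log_logfile_time]
    ON [dbo].[log] (logfile_id, [timestamp]) INCLUDE ([value]);

CREATE NONCLUSTERED INDEX [IX_log_test_time]
    ON [dbo].[log] (test_id, [timestamp]) INCLUDE ([value]);

CREATE NONCLUSTERED INDEX [IX_log_time]
    ON [dbo].[log] ([timestamp]) INCLUDE ([value]);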
You notice I did not comment anything about your query script. That is because there isn't anything in the world you can do to make the queries run faster over the table structure you have.
Now one thing you shouldn't do is actually implement my advice. I said I'm going to pretend I know your requirements. But I actually don't. I just gave an example of a possible structure. What you really should do is study the topic and figure out the correct index structure for your requirements:
General Index Design Guidelines.
Index Design Basics
Index with Included Columns
Query Types and Indexes
Also a google on 'covering index' will bring up a lot of good articles.
And of course, at the end of the day storage is not free, so you'll have to balance the requirement to have a non-clustered index on every possible combination against the need to keep the size of the database in check. Luckily you have a very small and narrow table, so duplicating it over many non-clustered indexes is no big deal. Also, I wouldn't be concerned about insert performance: 120 counters at 15-second intervals means 8-9 inserts per second, which is nothing.
A couple things come to mind.
Do you need to keep that much data? If not, consider creating an archive table if you want to keep the old rows (but don't join it with the primary table every time you run a query).
I would avoid using a temp table with so much data. See this article on temp table performance and how to avoid using them.
http://www.sql-server-performance.com/articles/per/derived_temp_tables_p1.aspx
It looks like you are missing an index on the server_id field. I would consider creating a covering index using this field and others. Here is an article on that as well.
http://www.sql-server-performance.com/tips/covering_indexes_p1.aspx
Edit
With that many rows in the table over such a short time frame, I would also check the indexes for fragmentation which may be a cause for slowness. In SQL Server 2000 you can use the DBCC SHOWCONTIG command.
See this link for info http://technet.microsoft.com/en-us/library/cc966523.aspx
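For example, a quick check using the table name from the question:
DBCC SHOWCONTIG ('log');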
Once, when still working with SQL Server 2000, I needed to do some paging, and I came across a method of paging that really blew my mind. Have a look at this method.
DECLARE @Table TABLE(
    TimeVal DATETIME
)
DECLARE @StartVal INT
DECLARE @EndVal INT
SELECT @StartVal = 51, @EndVal = 100
SELECT *
FROM (
    SELECT TOP (@EndVal - @StartVal + 1)
        *
    FROM (
        --select up to end number
        SELECT TOP (@EndVal)
            *
        FROM @Table
        ORDER BY TimeVal ASC
    ) PageReversed
    ORDER BY TimeVal DESC
) PageVals
ORDER BY TimeVal ASC
As an example
SELECT *
FROM (
    SELECT TOP (@EndVal - @StartVal + 1)
        *
    FROM (
        SELECT TOP (@EndVal)
            l.id,
            l.timestamp
        FROM log l, logfile lf
        WHERE lf.server_id = #arguments.server_id#
        and l.test_id = #arguments.test_id#
        and l.timestamp >= #arguments.report_from#
        and l.timestamp < #arguments.report_to#
        and l.logfile_id = lf.id
        order by l.timestamp asc
    ) PageReversed ORDER BY timestamp DESC
) PageVals
ORDER BY timestamp ASC
