Need to tune the select query without adding index - sql-server

I have a question about indexes and execution plans. Below are the tables I am using.
Table 1 : TestReviewResult
Column_name |Type |Length
TestReviewResultId |int |4
TestNumber |int |4
ReviewerUserId |int |4
Current Index
index_name index_description index_keys
TestReviewResult_PK clustered, unique, primary key located on PRIMARY TestReviewResultId
Table 2: TestReviewFinding
Column_name Type Length
TestReviewFindingId int 4
TestReviewResultId int 4
ScreenCode varchar 100
ReviewComments varchar 8000
CurrentIndex
index_name index_description index_keys
TestReviewFinding_PK clustered, unique, primary key located on PRIMARY TestReviewFindingId
Table3: TestReviewResultComment
Column_name Type Length
TestReviewResultCommentId Int 4
TestReviewResultId Int 4
TestReviewComment varchar 8000
CurrentIndex
index_name index_description index_keys
TestReviewResultComment_CI clustered located on PRIMARY TestReviewResultId
TestReviewResultComment_PK nonclustered, unique, primary key located on PRIMARY TestReviewResultCommentId
Table 4: TestReviewFindingElement
Column_name Type Length
TestReviewFindingElementId Int 4
TestReviewFindingId Int 4
ElementCode varchar 25
CurrentIndex
index_name index_description index_keys
TestReviewFindingElementType_PK clustered, unique, primary key located on PRIMARY TestReviewFindingElementId
Below is the query that is being used
Select columnnames….
From TestReviewResult(NOLOCK) cr
left outer join TestReviewResultComment(NOLOCK) crc
on cr.TestReviewResultId = crc.TestReviewResultId
left outer join TestReviewFinding(NOLOCK) cf
on cf.TestReviewResultId = cr.TestReviewResultId
left outer join TestReviewFindingElement(NOLOCK) crf
on cf.TestReviewFindingId = crf.TestReviewFindingId
where cr.TestReviewResultId = @TestReviewNumber -- the test review number is passed in to the stored procedure
When I looked at the execution plan, it suggested adding the 2 indexes below:
CREATE NONCLUSTERED INDEX IX_TestReviewResultId
ON [TestReviewFinding] (TestReviewResultId)
CREATE NONCLUSTERED INDEX IX_TestReviewFindingId
ON [TestReviewFindingElement] (TestReviewFindingId)
Is there an alternative way to tune the above query without adding indexes? A lot of write operations are performed on these tables.
Below is the DDL that you can execute in SQL Server to check the query execution plan:
Create table TestReviewResult(
TestReviewResultId int NOT NULL PRIMARY KEY ,
TestNumber int ,
ReviewerUserId int )
insert into TestReviewResult values(1,1,1)
insert into TestReviewResult values(2,2,2)
insert into TestReviewResult values(3,3,3)
Create table TestReviewFinding(
TestReviewFindingId int not null primary key,
TestReviewResultId int ,
ScreenCode varchar(100))
insert into TestReviewFinding values(1,1,'A')
insert into TestReviewFinding values(2,2,'B')
insert into TestReviewFinding values(3,3,'C')
Create table TestReviewResultComment(
TestReviewResultCommentId int not null primary key nonclustered,
TestReviewResultId int ,
TestReviewComment varchar( 8000)
)
CREATE CLUSTERED INDEX [CI_TestReviewResultId]
ON TestReviewResultComment (TestReviewResultId);
insert into TestReviewResultComment values(1,1,'A')
insert into TestReviewResultComment values(2,2,'B')
insert into TestReviewResultComment values(3,3,'C')
Create table TestReviewFindingElement(
TestReviewFindingElementId int not null primary key,
TestReviewFindingId int ,
ElementCode varchar(25))
insert into TestReviewFindingElement values(1,1,'A')
insert into TestReviewFindingElement values(2,3,'B')
insert into TestReviewFindingElement values(3,3,'C')
When I run the query below, I get index scans on 2 tables,
TestReviewFindingElement and TestReviewFinding:
Select *
From TestReviewResult(NOLOCK) cr
left outer join TestReviewResultComment(NOLOCK) crc
on cr.TestReviewResultId = crc.TestReviewResultId
left outer join TestReviewFinding(NOLOCK) cf
on cf.TestReviewResultId = cr.TestReviewResultId
left outer join TestReviewFindingElement(NOLOCK) crf
on cf.TestReviewFindingId = crf.TestReviewFindingId
where cr.TestReviewResultId = @TestReviewNumber -- the test review number is passed in to the stored procedure; you can use 1
Is there any way to modify the select query to avoid the scans without adding an index?

Related

TSQL Reference Table with redundant keys

I'm currently working on a stored procedure on SQL Server 2016. In my database I have a table structure and need to add another table, which references the same table as an existing one.
Thus, I have two 1:1 relations to the same table.
The problem that occurs is that I reference the same keys from 2 different origin tables twice in the same target table.
Target table:
FK_Tables | Text
----------------
1 | Table One Text Id: 1
1 | Table Two Text Id: 1 // The error: Same FK_Tables 2 times
Table One:
ID | OtherField
---------
1 | 42
Table Two:
ID | CoolField
---------
1 | 22
Table One and Table Two both currently reference the reference table.
Do you know how I can solve this problem of the same ID appearing twice?
Thanks!!
You need to add a column for each table you're referencing, otherwise you wouldn't know where the ID is coming from if they were all inserted into the same field. Something like this:
/*
CREATE TEST TABLES
*/
DROP TABLE IF EXISTS tbOne;
CREATE TABLE tbOne ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, TXT VARCHAR(10)
);
DROP TABLE IF EXISTS tbTwo;
CREATE TABLE tbTwo ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, TXT VARCHAR(10)
);
DROP TABLE IF EXISTS Target;
CREATE TABLE Target ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, FKTB1 INT
, FKTB2 INT
, TXT VARCHAR(100)
);
-- 1st FK tbOne
ALTER TABLE Target ADD CONSTRAINT FK_One FOREIGN KEY (FKTB1) REFERENCES tbOne (ID);
--2nd FK tbTwo
ALTER TABLE Target ADD CONSTRAINT FK_Two FOREIGN KEY (FKTB2) REFERENCES tbTwo (ID);
-- Populate test tables
INSERT INTO tbOne (TXT)
SELECT TOP 100 LEFT(text, 10)
FROM SYS.messages
INSERT INTO tbTwo (TXT)
SELECT TOP 100 LEFT(text, 10)
FROM SYS.messages
INSERT INTO [Target] (FKTB1, FKTB2, TXT)
SELECT 1, 1, 'Test - constraint'
-- Check result set
SELECT *
FROM tbTwo
SELECT *
FROM tbOne
SELECT *
FROM [Target] T
INNER JOIN tbOne TB1
ON T.FKTB1 = TB1.ID
INNER JOIN tbTwo TB2
ON T.FKTB2 = TB2.ID

Outer SELECT from INSERT RETURNING and inner SELECT statement

I am trying to write a query that copies a row in a table to that same table, gives it a new sequential primary key, and associates it with a new foreign key. I need to associate the new primary key with another foreign key that's not inserted and exists in a different relational table (a lookup table).
I'd like to be able to do this as a single transaction, but I can't seem to find a way to associate the original row with the copied row as the unique id is new for the copied row. This is going to be a bit of a mouthful, but here's my specific question:
Can an outer SELECT clause enclose an inner INSERT with inner SELECT clause and RETURNING such that values from both the inner SELECT and the INSERT's RETURNING clause are selected and properly joined? Here's what I've attempted:
WITH batch_select AS (
SELECT id, owner_id, 1992 AS project_id
FROM batch
WHERE project_id = 1921
),
batch_insert AS (
INSERT INTO batch (owner_id, project_id)
SELECT bs.owner_id, bs.project_id
FROM batch_select bs
RETURNING id
)
SELECT bs.id AS origin_id, bi.id AS destination_id
FROM batch_select bs, batch_insert bi;
I need the origin_id to correspond to the destination_id. Obviously right now it's just a CROSS JOIN where everything is paired with everything and isn't very useful. I'd also be using the results of the last SELECT statement to run the INSERT into the lookup table, something like this (batch_join_select query could be implemented in the last insert, but has been left for clarity):
WITH batch_select AS (
SELECT id, owner_id, 1992 AS project_id
FROM batch
WHERE project_id = 1921
),
batch_insert AS (
INSERT INTO batch (owner_id, project_id)
SELECT bs.owner_id, bs.project_id
FROM batch_select bs
RETURNING id
),
batch_join_select AS (
SELECT bs.id AS origin_id, bi.id AS destination_id
FROM batch_select bs, batch_insert bi
)
INSERT INTO lookup_batch_container (batch_id, container_id)
SELECT bjs.destination_id, lbc.container_id
FROM batch_join_select bjs
INNER JOIN lookup_batch_container lbc ON lbc.batch_id = bjs.origin_id;
I found a similar question on the dba exchange, but the accepted answer doesn't correctly associate the two when there's more than one row.
Do I just have to do this with several transactions?
[EDIT] Adding some minimal schema:
Table lookup_batch_container
Column | Type | Modifiers
--------------+---------+-----------
batch_id | integer | not null
container_id | integer | not null
Indexes:
"lookup_batch_container_batch_id_container_id_key" UNIQUE CONSTRAINT, btree (batch_id, container_id)
Foreign-key constraints:
"lookup_batch_container_batch_id_fkey" FOREIGN KEY (batch_id) REFERENCES batch(id) ON DELETE CASCADE
"lookup_batch_container_container_id_fkey" FOREIGN KEY (container_id) REFERENCES container(id) ON DELETE CASCADE
Table batch
Column | Type | Modifiers
------------------+-----------------------------+------------------------------------------------------------------------------------
id | integer | not null default nextval('batch_id_seq'::regclass)
owner_id | integer | not null
project_id | integer | not null
Indexes:
"batch_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"batch_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id) ON DELETE CASCADE
"batch_owner_id_fkey" FOREIGN KEY (owner_id) REFERENCES owner(id) ON DELETE CASCADE
Referenced by:
TABLE "lookup_batch_container" CONSTRAINT "lookup_batch_container_batch_id_fkey" FOREIGN KEY (batch_id) REFERENCES batch(id) ON DELETE CASCADE
Table container
Column | Type | Modifiers
-----------------------+-----------------------------+------------------------------------------------------------------------------
id | integer | not null default nextval('stirplate_source_file_container_id_seq'::regclass)
owner_id | integer | not null
status | container_status_enum | not null default 'new'::container_status_enum
name | text | not null
Indexes:
"container_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"container_owner_id_fkey" FOREIGN KEY (owner_id) REFERENCES owner(id) ON DELETE CASCADE
Referenced by:
TABLE "lookup_batch_container" CONSTRAINT "lookup_batch_container_container_id_fkey" FOREIGN KEY (container_id) REFERENCES container(id) ON DELETE CASCADE
with batch_select as (
select id, owner_id, 1992 as project_id
from batch
where project_id = 1921
), batch_insert as (
insert into batch (owner_id, project_id)
select owner_id, project_id
from batch_select
order by id
returning *
)
select unnest(oid) as oid, unnest(did) as did
from (
select
array_agg(distinct bs.id order by bs.id) as oid,
array_agg(distinct bi.id order by bi.id) as did
from
batch_select bs
inner join
batch_insert bi using (owner_id, project_id)
) s
;
oid | did
-----+-----
1 | 4
2 | 5
3 | 6
The output above was produced with this batch table:
create table batch (
id serial primary key,
owner_id integer,
project_id integer
);
insert into batch (owner_id, project_id) values
(1,1921),(1,1921),(2,1921);
It could be simpler if the primary key were (id, owner_id, project_id), couldn't it?
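The mapping from the answer above can also feed the final insert into the lookup table directly. The following is only a sketch in PostgreSQL, not the answerer's method: it pairs old and new ids by position with row_number() instead of array_agg/unnest, and the names rn and batch_numbered are made up for illustration. It assumes the new ids are handed out in the same order as the ORDER BY of the inserting SELECT, which usually holds for a serial column but is not formally guaranteed.

```sql
-- Sketch: copy the batches of project 1921 to project 1992 and copy
-- their lookup rows in the same statement.
-- ASSUMPTION: new serial ids are assigned in the order of "ORDER BY id".
WITH batch_select AS (
    SELECT id, owner_id, 1992 AS project_id,
           row_number() OVER (ORDER BY id) AS rn   -- position of the original row
    FROM batch
    WHERE project_id = 1921
), batch_insert AS (
    INSERT INTO batch (owner_id, project_id)
    SELECT owner_id, project_id
    FROM batch_select
    ORDER BY id
    RETURNING id
), batch_numbered AS (
    SELECT id,
           row_number() OVER (ORDER BY id) AS rn   -- position of the copied row
    FROM batch_insert
)
INSERT INTO lookup_batch_container (batch_id, container_id)
SELECT bn.id, lbc.container_id
FROM batch_select bs
JOIN batch_numbered bn USING (rn)                  -- pair old and new by position
JOIN lookup_batch_container lbc ON lbc.batch_id = bs.id;
```

Because everything runs as one statement, it is also one transaction: either all batches and lookup rows are copied or none are.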

different estimated rows on same index operation?

Introduction and Background
I had to optimize a simple query (example below). After rewriting it several times I noticed that the estimated row count on one and the same index operation differs depending on the way the query is written.
Originally the query did a clustered index scan. Because the table in production contains a binary column, it is quite large (about 100 GB), and the full table scan takes too long to execute.
Question
Why is the estimated row count different on the same index operation (the example will show this)? What is the optimizer doing here?
The example database - I am using SQL Server 2008 R2
I tried to create a very simplified version of my production tables that shows the behaviour.
-- CREATE THE SAMPLE TABLES
----------------------------
CREATE TABLE dbo.MasterTable(
MasterId smallint NOT NULL,
Name varchar(5) NOT NULL,
CONSTRAINT PK_MasterTable PRIMARY KEY CLUSTERED (MasterId ASC)
) ON [PRIMARY]
GO
CREATE TABLE dbo.DetailTable(
DetailId bigint IDENTITY(1,1) NOT NULL,
MasterId smallint NOT NULL,
Name nvarchar(50) NOT NULL,
CreateDate datetime NOT NULL,
CONSTRAINT PK_DetailTable PRIMARY KEY CLUSTERED (DetailId ASC)
) ON [PRIMARY]
GO
ALTER TABLE dbo.DetailTable
ADD CONSTRAINT FK1
FOREIGN KEY(MasterId) REFERENCES dbo.MasterTable (MasterId)
GO
CREATE NONCLUSTERED INDEX IX_DetailTable
ON dbo.DetailTable( MasterId ASC, Name ASC )
GO
-- INSERT SOME SAMPLE DATA
----------------------------
SET NOCOUNT ON
GO
-- These are some Codes. In our system we always use these codes to search for "types" of data.
INSERT INTO dbo.MasterTable (MasterId, Name)
VALUES (1, 'N1'), (2, 'N2'), (3, 'N3'), (4, 'N4'), (5, 'N5'), (6, 'N6'), (7, 'N7'), (8, 'N8')
GO
-- ADD ROWS TO THE DETAIL TABLE
-- Takes about 1 minute to run
-- Don't care about the logic, it's just to get a distribution similar to production system
----------------------------
declare @x int = 1
DECLARE @MasterID INT
while (@x <= 400000)
begin
SET @MasterID = ABS(CHECKSUM(NEWID())) % 8 + 1
INSERT INTO dbo.DetailTable(MasterId,Name,CreateDate)
VALUES(
CASE
WHEN @MasterID IN (1, 3, 4) AND @x % 20 != 0 THEN 2
WHEN @MasterID IN (5, 6) AND @x % 20 != 0 THEN 7
WHEN @MasterID = 8 AND @x % 100 != 0 THEN 7
ELSE @MasterID
END,
NEWID(),
DATEADD(DAY, - ABS(CHECKSUM(NEWID())) % 1000, GETDATE())
)
SET @x = @x + 1
end
go
-- DO THE INDEX AND STATISTIC MAINTENANCE
----------------------------
alter index all on dbo.DetailTable reorganize
alter index all on dbo.MasterTable reorganize
update statistics dbo.DetailTable WITH FULLSCAN
update statistics dbo.MasterTable WITH FULLSCAN
go
Preparation is done, let's start with the query
Let's have a look at the statistics first, look at RANGE_HI_KEY=8, there are 489 EQ_ROWS
-- CHECK THE STATISTICS
----------------------------
dbcc show_statistics ('dbo.DetailTable', IX_DetailTable)
GO
Now we do the query. The first one is the original query I had to optimize.
Please enable the actual execution plan when executing.
Have a look at the operation "index seek (nonclustered) [DetailTable].[IX_DetailTable]"
-- ORIGINAL QUERY
----------------------------
SELECT d.DetailId
FROM dbo.DetailTable d
INNER JOIN dbo.MasterTable m ON d.MasterId = m.MasterId
WHERE m.Name = 'N8'
AND d.CreateDate > '20150312 11:00:00'
GO
-- FORCESEEK
----------------------------
SELECT d.DetailId
FROM dbo.DetailTable d WITH (FORCESEEK)
INNER JOIN dbo.MasterTable m ON d.MasterId = m.MasterId
WHERE m.Name = 'N8'
AND d.CreateDate > '20150312 11:00:00'
GO
-- Actual: 489, Estimated 50.000
-- TABLE VARIABLE
----------------------------
DECLARE @MasterId AS TABLE( MasterId SMALLINT )
INSERT INTO @MasterId (MasterId)
SELECT MasterID FROM dbo.MasterTable WHERE Name = 'N8'
SELECT d.DetailId
FROM dbo.DetailTable d WITH (FORCESEEK)
INNER JOIN @MasterId m ON d.MasterId = m.MasterId
WHERE d.CreateDate > '20150312 11:00:00'
GO
-- Actual: 489, Estimated 40.000
-- TEMP TABLE
----------------------------
CREATE TABLE #MasterId( MasterId SMALLINT )
INSERT INTO #MasterId (MasterId)
SELECT MasterID FROM dbo.MasterTable WHERE Name = 'N8'
SELECT d.DetailId
FROM dbo.DetailTable d --WITH (FORCESEEK)
INNER JOIN #MasterId m ON d.MasterId = m.MasterId
WHERE d.CreateDate > '20150312 11:00:00'
-- Actual 489, Estimated 489
DROP TABLE #MasterId
GO
Analyse and final question(s)
Please have a look at the operation "index seek (nonclustered) [DetailTable].[IX_DetailTable]"
The comments in the script above show you the values I got for estimated and actual row count.
In our production environment this table has 33 million rows, the estimated rows in the queries above differ from 3 million to 16 million.
To summarize:
when a join between the DetailTable and the MasterTable is made, the estimated rowcount is 12,5% (there are 8 values in the master table, it makes sense, kind of...)
when a join between the DetailTable and the table variable is made, the estimated rowcount is 10%
when a join between the DetailTable and the temp table is made, the estimated rowcount is exactly the same as the actual row count
The question is why do these values differ?
The statistics are up to date and making an estimation should really be easy.
I just would like to understand this.
As nobody has answered, I'll try to give an answer:
Please don't force the optimizer to follow you.
(1) Explanation of your original query:
SELECT d.DetailId
FROM dbo.DetailTable d
INNER JOIN dbo.MasterTable m ON d.MasterId = m.MasterId
WHERE m.Name = 'N8'
AND d.CreateDate > '20150312 11:00:00'
Why is this query slow?
This query is slow because your indexes do not cover it.
Both tables are read with an index scan and then joined with a hash join.
Why scan the entire Master table?
Because the index on the Master table is on the column MasterId, not on the column Name.
Why scan the entire Detail table? Because here as well the indexes are on
(DetailId) "CLUSTERED" and (MasterId ASC, Name ASC) "NONCLUSTERED",
not on the CreateDate column.
One NONCLUSTERED index on the columns (CreateDate, MasterId) would help this particular query.
If your Master table is huge as well, you can create a NONCLUSTERED index on the (Name) column.
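The two suggested indexes could look roughly like this. This is a sketch against the sample tables from the question; the index names are made up for illustration. DetailId does not need to be listed, because the clustered key is carried in every nonclustered index, so the first index covers the query.

```sql
-- Hypothetical index names; column order follows the suggestion above.
CREATE NONCLUSTERED INDEX IX_DetailTable_CreateDate_MasterId
    ON dbo.DetailTable (CreateDate ASC, MasterId ASC);

-- Only worthwhile if MasterTable is large.
CREATE NONCLUSTERED INDEX IX_MasterTable_Name
    ON dbo.MasterTable (Name ASC);
```

Whether the extra write cost is acceptable depends on the insert rate of the production table, so it is worth comparing the plans with and without the indexes first.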
(2) Explanation on FORCESEEK :
-- FORCESEEK
SELECT d.DetailId
FROM dbo.DetailTable d WITH (FORCESEEK)
INNER JOIN dbo.MasterTable m ON d.MasterId = m.MasterId
WHERE m.Name = 'N8'
AND d.CreateDate > '20150312 11:00:00'
GO
Why did the optimizer estimate 50,000 rows?
Here you are joining on d.MasterId = m.MasterId and FORCING the optimizer to choose a seek on the Detail table, so
the optimizer uses the index IX_DetailTable to join your Master table using a LOOP join.
Since the optimizer chooses a loop join to join all rows (actually ONE) of the Master table to the Detail table,
it takes one key from the Master table, seeks into the index for that key, and passes the matching rows to the next iterator.
The key value is not known at compile time, so the optimizer uses the average number of rows per value.
There are 8 unique values in the column and the table cardinality is 400,000 rows, so
400,000 / 8 = 50,000 rows estimated (fair enough).
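The average-rows-per-value figure can be checked against the sample data. The following is a rough sketch, assuming the sample tables from the question and that all 8 MasterId values occur in DetailTable; it reproduces the arithmetic and does not claim to show how the optimizer works internally.

```sql
-- Average number of DetailTable rows per distinct MasterId value.
-- With 400,000 rows and 8 distinct values this is 400000 / 8 = 50000.
SELECT COUNT(*) * 1.0 / COUNT(DISTINCT MasterId) AS AvgRowsPerMasterId
FROM dbo.DetailTable;

-- The density vector of the index statistics shows the same information:
-- "All density" for MasterId is 1 / (number of distinct values) = 0.125.
DBCC SHOW_STATISTICS ('dbo.DetailTable', IX_DetailTable) WITH DENSITY_VECTOR;
```

Multiplying the density (0.125) by the table cardinality (400,000) gives the same 50,000 rows.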
(3) -- TABLE VARIABLE
Here is your query :
DECLARE @MasterId AS TABLE( MasterId SMALLINT )
INSERT INTO @MasterId (MasterId)
SELECT MasterID FROM dbo.MasterTable WHERE Name = 'N8'
SELECT d.DetailId
FROM dbo.DetailTable d WITH (FORCESEEK)
INNER JOIN @MasterId m ON d.MasterId = m.MasterId
WHERE d.CreateDate > '20150312 11:00:00'
GO
Statistics are not maintained on table variables, so the optimizer has no idea how many rows it will deal with (it estimates 1 row) when producing a plan.
Here the estimate for the table variable is 1 row and the actual row count is 1 as well, congrats!
But how did the optimizer estimate "40.000" rows for the seek?
Personally I had never checked this; because of this question I did several tests, but I have no idea how the optimizer calculates this estimate, so it would be great if someone could enlighten us.
(4) -- TEMP TABLE
Your Query
CREATE TABLE #MasterId( MasterId SMALLINT )
INSERT INTO #MasterId (MasterId)
SELECT MasterID FROM dbo.MasterTable WHERE Name = 'N8'
SELECT d.DetailId
FROM dbo.DetailTable d --WITH (FORCESEEK)
INNER JOIN #MasterId m ON d.MasterId = m.MasterId
WHERE d.CreateDate > '20150312 11:00:00'
-- Actual 489, Estimated 489
DROP TABLE #MasterId
Here the optimizer chooses the same query plan as for the table variable, but the difference is that
statistics are maintained on temp tables, so here the optimizer has a fair idea of which rows it is actually going to join.
The "N8" key has MasterId 8, and the estimated row count for 8 in dbo.DetailTable is 489.

TSQL to insert a set of rows and dependent rows

I have 2 tables:
Order (with a identity order id field)
OrderItems (with a foreign key to order id)
In a stored proc, I have a list of orders that I need to duplicate. Is there a good way to do this in a stored proc without a cursor?
Edit:
This is on SQL Server 2008.
A sample spec for the table might be:
-- ORDER is a reserved word in T-SQL, so the table name must be bracketed
CREATE TABLE [Order] (
OrderID INT IDENTITY(1,1),
CustomerName VARCHAR(100),
CONSTRAINT PK_Order PRIMARY KEY (OrderID)
)
CREATE TABLE OrderItem (
OrderID INT,
LineNumber INT,
Price money,
Notes VARCHAR(100),
CONSTRAINT PK_OrderItem PRIMARY KEY (OrderID, LineNumber),
CONSTRAINT FK_OrderItem_Order FOREIGN KEY (OrderID) REFERENCES [Order](OrderID)
)
The stored proc is passed a customerName of 'fred', so its trying to clone all orders where CustomerName = 'fred'.
To give a more concrete example:
Fred happens to have 2 orders:
Order 1 has line numbers 1,2,3
Order 2 has line numbers 1,2,4,6.
If the next identity in the table was 123, then I would want to create:
Order 123 with lines 1,2,3
Order 124 with lines 1,2,4,6
On SQL Server 2008 you can use MERGE and the OUTPUT clause to get the mappings between the original and cloned id values from the insert into Orders then join onto that to clone the OrderItems.
DECLARE @IdMappings TABLE(
New_OrderId INT,
Old_OrderId INT)
;WITH SourceOrders AS
(
SELECT *
FROM Orders
WHERE CustomerName = 'fred'
)
MERGE Orders AS T
USING SourceOrders AS S
ON 0 = 1
WHEN NOT MATCHED THEN
INSERT (CustomerName )
VALUES (CustomerName )
OUTPUT inserted.OrderId,
S.OrderId INTO @IdMappings;
INSERT INTO OrderItems
SELECT New_OrderId,
LineNumber,
Price,
Notes
FROM OrderItems OI
JOIN @IdMappings IDM
ON IDM.Old_OrderId = OI.OrderID

Default Index on Primary Key

Does SQL Server build an index on primary keys by default?
If yes what kind of index? If no what kind of index is appropriate for selections by primary key?
I use SQL Server 2008 R2
Thank you.
You can easily determine the first part of this for yourself
create table x
(
id int primary key
)
select * from sys.indexes where object_id = object_id('x')
Gives
object_id name index_id type type_desc is_unique data_space_id ignore_dup_key is_primary_key is_unique_constraint fill_factor is_padded is_disabled is_hypothetical allow_row_locks allow_page_locks
1653580929 PK__x__6383C8BA 1 1 CLUSTERED 1 1 0 1 0 0 0 0 0 1 1
Edit: There is one other case I should have mentioned
create table t2 (id int not null, cx int)
create clustered index ixc on dbo.t2 (cx asc)
alter table dbo.t2 add constraint pk_t2 primary key (id)
select * from sys.indexes where object_id = object_id('t2')
Gives
object_id name index_id type type_desc is_unique data_space_id ignore_dup_key is_primary_key is_unique_constraint fill_factor is_padded is_disabled is_hypothetical allow_row_locks allow_page_locks has_filter filter_definition
----------- ------------------------------ ----------- ---- ------------------------------ --------- ------------- -------------- -------------- -------------------- ----------- --------- ----------- --------------- --------------- ---------------- ---------- ------------------------------
34099162 ixc 1 1 CLUSTERED 0 1 0 0 0 0 0 0 0 1 1 0 NULL
34099162 pk_t2 2 2 NONCLUSTERED 1 1 0 1 0 0 0 0 0 1 1 0 NULL
With regard to the second part there is no golden rule it depends on your individual query workload, and what your PK is.
For satisfying individual lookups by primary key a non clustered index will be fine. If you are doing queries against ranges these would be well served by a matching clustered index but a covering non clustered index could also suffice.
You also need to consider the index width of the clustered index in particular as it impacts all your non clustered indexes and effect of inserts on page splits.
I recommend the book SQL Server 2008 Query Performance Tuning Distilled to read more about the issues.
Yes. By default a unique clustered index is created on all primary keys, but you can create a unique nonclustered index instead if you like.
As to the appropriate choice, I'd say that for 80-90% of the tables you create, you generally want the clustered index to be the primary key, but that's not always the case.
You'd typically make the clustered index something else if you do heavy range scans on that "something else". For example, if you have a synthetic primary key*, but have a date column that you typically query in terms of a range, you'd often want that date column to be the most significant column in your clustered index.
*That's usually done by using an INT IDENTITY column as the PK on the table.
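As an illustration of that layout, here is a minimal sketch. The table dbo.EventLog and all of its column and index names are invented for this example: the synthetic INT IDENTITY key is declared as a NONCLUSTERED primary key, and the clustered index leads with the date column that range queries filter on.

```sql
-- Hypothetical table: synthetic PK, but clustered on the date column.
CREATE TABLE dbo.EventLog (
    EventLogId INT IDENTITY(1,1) NOT NULL,
    EventDate  DATETIME NOT NULL,
    Message    VARCHAR(200) NULL,
    CONSTRAINT PK_EventLog PRIMARY KEY NONCLUSTERED (EventLogId)
);

-- EventDate is the most significant column; EventLogId is added
-- so that rows with the same date still have a deterministic order.
CREATE CLUSTERED INDEX CIX_EventLog_EventDate
    ON dbo.EventLog (EventDate, EventLogId);

-- A date-range query of the kind this layout is meant to serve.
SELECT EventLogId, EventDate, Message
FROM dbo.EventLog
WHERE EventDate >= '20240101' AND EventDate < '20240201';
```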
Yes, it builds a clustered index on the primary key by default.
To be direct, SQL does create an index on the PRIMARY KEY (PK) keyword. That index is a Unique, Clustered Index.
sqlvogel brings up an important point in his comment above. You can only have one "CLUSTERED" index. If you already have one prior to declaring a PK, then your key will be NONCLUSTERED. This is a little more detail than the default answer to this question. It should also be noted that PKs cannot have NULL values.
Note, however that that index can vary depending on prior constraints or index on the table. Additionally, you can declare the details of this index upon creation depending on how you write out the code:
< table_constraint > ::= [ CONSTRAINT constraint_name ]
{ [ { PRIMARY KEY | UNIQUE }
[ CLUSTERED | NONCLUSTERED ]
{ ( column [ ASC | DESC ] [ ,...n ] ) }
[ WITH FILLFACTOR = fillfactor ]
[ ON { filegroup | DEFAULT } ]
]
Example:
CREATE TABLE MyTable
(
Id INT NOT NULL,
ForeignKeyId INT REFERENCES OtherTable(Id) NOT NULL,
Name VARCHAR(50) NOT NULL,
Comments VARCHAR(500) NULL,
PRIMARY KEY NONCLUSTERED (Id, ForeignKeyId)
)
No, it doesn't create a clustered index in the case where a clustered index is already defined on the table.
I just came across the same confusion and created the following script to test:
create database mytest
go
use mytest
go
create table x
(
id int primary key
)
go
create table y
(
id int
)
go
select * from sys.indexes where object_id = object_id(N'x') or object_id=object_Id(N'y')
go
The first table, x, which has a primary key, gets a clustered index, while table y didn't get any
indexes, as it has no primary key.
This confirmed the following points about clustered indexes:
When we create a table with a primary key, a clustered index is created
by default.
When we create a table without a primary key, no clustered index is
created.
There can be only a single clustered index, as it depends on the index key
value. Sorting of the data rows in the table can only be in one
order, based on the index key value.

Resources