Is searching a primary key column in a table faster than searching a non primary key column due to the default clustered index on primary keys in SQL Server?
from a previous question; "The reason we specify keys for a table is primarily to improve the data integrity and usefulness of the data. Keys guarantee the table is free from duplicate data and therefore they allow the user/consumer of the data to identify information correctly. DBMS query optimizers and storage engines are designed to take advantage of keys so having a key will also give your DBMS the best chance of executing some queries efficiently but there's no guarantee that adding a key will improve performance in every case"
That being said the existence of an Index should make searching faster in most cases, but is unrelated to the key per-se
Searching according to an index is generally faster than searching without an index (unless your table is extremely small, or the index is in a terrible condition). The fact that this index also supports a primary key is inconsequential to this discussion.
In general a relational database will be optimised to use primary keys efficiently, however your database structure, your query and which database engine on what platform will all be major factors in the performance.
Additionally, if the non-primary key column is not indexed, it will be orders of magnitude slower, regardless of your query.
Is searching a primary key column in a table faster than searching a
non primary key column due to the default clustered index on primary
keys in SQL Server?
Since the optimizer doesn't care whether an index supports a constraint or not, I'll paraphrase the question as:
Is searching using the clustered index key column(s) faster than searching via a
non-clustered index key column(s)?
A clustered index seek a disk-based table is the fastest method to return the entire row 1) for singleton lookups and 2) for key range searches. This minimizes the number of logical reads need to locate and retrieve rows.
SQL Server uses clustered as the default for primary key indexes to leverage this performance benefit and encourage the practice that all tables should have a clustered index. Also, since the clustered index key is implicitly included in all non-clustered indexes as the row locator, the likelihood that non-clustered indexes cover queries is increased.
This is not to say that the primary key should always be the clustered index. There may be a better choice when queries most often use other columns in join/where clause predicates.
Related
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I have some doubts on choosing the right index and have some questions:
Clustered index
What is the best candidate?
Usually is the primary key but if the primary key is not used in the search by eg CustomerNo is used to search on customers should the clustered index put on CustomerNo?
Views with SchemaBinding
If have a view with indexes I read that these are not used but those on tables are.
Pointless no? Or am I missing the point? Will it make a difference using "NOExpand" to force to read the index from the view rather than the table?
Nonclustered indexes
Is it good practice when adding a nonclustered index to include every possible column till you reach the limit?
Many thanks for your time. I am reading massive database and speed is a must
The clustered index is the index that (a) defines the storage layout of your table (the table data is physically sorted by the clustering key), and (b) is used as the "row locator" in every single nonclustered index on that table.
Therefore, the clustered index should be
narrow (4 byte is ideal, 8 byte OK - anything else is too much)
unique (if you don't use a unique clustered index, SQL Server will add a 4 byte uniqueifier to your table)
static (shouldn't change)
optimally it should be ever-increasing
fixed with - e.g. don't use large Varchar(x) columns in your clustered index
Out of these requirements, the INT IDENTITY seems to be the most logical, most obvious choice. Don't use variable length columns, don't use multiple columns (if ever possible), don't use GUID (that's a horribly bad choice because of it's size and randomness)
For more background info on clustering keys and clustered indexes - read everything that Kimberly Tripp ever publishes! She's the Queen of Indexing in SQL Server - she knows her stuff extremely well!
See e.g. these blog posts:
GUIDs as PRIMARY KEYs and/or the clustering key
The Clustered Index Debate Continues...
Ever-increasing clustering key - the Clustered Index Debate..........again!
Disk space is cheap - that's not the point!
In general: don't overindex! too many indices is often worse than none!
For non-clustered indexes: I would typically index foreign key columns - those indexes help with JOINs and other operations and make things faster.
Other than that: don't put too many indexes in your database ! Every index must be maintained on every CRUD operation on your table! This is overhead - don't excessively index!
An index with all columns of a table is an especially bad idea since it really cannot be used for much - but carries a lot of administrative overhead.
Run your app, profile it - see which operations are slow, try to optimize those by adding a few selective indexes to your table.
Clustered Indexes
Just to add to marc_s good answer, one exception to the standard INT IDENTITY PK approach to Clustered Indexes is when you have Parent Child tables, where all the children are frequently always retrieved at the same time as the parent. In this case, clustering by Child table by the Parent PK will reduce the number of pages read when the children are retrieved. For example:
CREATE TABLE Invoice
(
-- Use the default MS Approach on the parent, viz Clustered by Surrogate PK
InvoiceID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
-- Index Fields here
);
CREATE TABLE InvoiceLineItem
(
-- Own Surrogate Key
InvoiceLineItemID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
InvoiceID INT NOT NULL FOREIGN KEY REFERENCES Invoice(InvoiceID),
-- Line Item Fields Here
);
-- But Cluster on the Parent FK
CREATE CLUSTERED INDEX CL_InvoiceLineItem ON InvoiceLineItem(InvoiceID);
NonClustered Indexes
No, never just include columns without careful thought - the index tree needs to be as narrow as possible. The ordering of the index columns is critical, and always ensure that the index is designed with selectivity of the data in mind - you will need to have a good understanding of the distribution of your data in order to choose optimal indexes.
You can consider using covering indexes to include (at most, a few) columns which would otherwise have required a bookmark lookup from the Nonclustered index back into the table when tuning performance-critical queries.
As a very basic rule of thumb I use, is to use nonclustered indexes when small amounts of data will be returned and clustered indexes when larger resultsets will be returned by your query.
I recomend you read Clustered Index Design Guidelines
As for indexing views: indexing views works the same as indexing the table. It can improve preformance but like indexing tables it can also slow things down.
I recomend you read Improving Performance with SQL Server 2008 Indexed Views
In genral when indexing i find less is better. You need to research your data not just slap indexes on everthing. Check what you are linking on, add indexes and check the Execution plan. Sometimes what you think would make a good index actualy can make thing slower.
Views with SchemaBinding
...
Pointless no? Or am I missing the point?
(More properly, indexed views, schemabinding is a means to an end here, and the rest of the text is more talking about indexed views)
There can be (at least) two reasons for creating an indexed view. Without seeing your database, it's impossible to tell which of those reasons apply.
The first is to compute intermediate results which are expensive to compute from the base table. In order to benefit from that computation, you need to ensure your query uses the indexes. To use the indexes you either need to be querying the view and specifying NOEXPAND, or be using Enterprise or Developer edition (On Ent/Dev editions the index might be used even if the base table is queried and the view isn't mentioned)
The second reason is to enforce a constraint that isn't enforceable in a simpler manner, by implementing e.g. a unique constraint on the view, this may be enforcing some form of conditional uniqueness on the base table.
An example of the second - say you want table T to be able to contain multiple rows with the same U value - but of those rows, only one may be marked as the Default. Before filtered indexes were available, this was commonly achieved as:
CREATE VIEW DRI_T_OneDefault
WITH SCHEMABINDING
AS
SELECT U
FROM S.T
WHERE Default = 1
GO
CREATE UNIQUE CLUSTERED INDEX IX_DRI_T_OneDefault on DRI_T_OneDefault (U)
The point is that these indexes enforce a constraint. It doesn't matter (in such a case) whether any query every actually uses the index. In the same way that any unique constraint may be declared on a base table but never actually used in any queries.
Recently I found a couple of tables in a Database with no Clustered Indexes defined.
But there are non-clustered indexes defined, so they are on HEAP.
On analysis I found that select statements were using filter on the columns defined in non-clustered indexes.
Not having a clustered index on these tables affect performance?
It's hard to state this more succinctly than SQL Server MVP Brad McGehee:
As a rule of thumb, every table should have a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases–such as an identity column, or some other column where the value is increasing–and is unique. In many cases, the primary key is the ideal column for a clustered index.
BOL echoes this sentiment:
With few exceptions, every table should have a clustered index.
The reasons for doing this are many and are primarily based upon the fact that a clustered index physically orders your data in storage.
If your clustered index is on a single column monotonically increases, inserts occur in order on your storage device and page splits will not happen.
Clustered indexes are efficient for finding a specific row when the indexed value is unique, such as the common pattern of selecting a row based upon the primary key.
A clustered index often allows for efficient queries on columns that are often searched for ranges of values (between, >, etc.).
Clustering can speed up queries where data is commonly sorted by a specific column or columns.
A clustered index can be rebuilt or reorganized on demand to control table fragmentation.
These benefits can even be applied to views.
You may not want to have a clustered index on:
Columns that have frequent data changes, as SQL Server must then physically re-order the data in storage.
Columns that are already covered by other indexes.
Wide keys, as the clustered index is also used in non-clustered index lookups.
GUID columns, which are larger than identities and also effectively random values (not likely to be sorted upon), though newsequentialid() could be used to help mitigate physical reordering during inserts.
A rare reason to use a heap (table without a clustered index) is if the data is always accessed through nonclustered indexes and the RID (SQL Server internal row identifier) is known to be smaller than a clustered index key.
Because of these and other considerations, such as your particular application workloads, you should carefully select your clustered indexes to get maximum benefit for your queries.
Also note that when you create a primary key on a table in SQL Server, it will by default create a unique clustered index (if it doesn't already have one). This means that if you find a table that doesn't have a clustered index, but does have a primary key (as all tables should), a developer had previously made the decision to create it that way. You may want to have a compelling reason to change that (of which there are many, as we've seen). Adding, changing or dropping the clustered index requires rewriting the entire table and any non-clustered indexes, so this can take some time on a large table.
I would not say "Every table should have a clustered index", I would say "Look carefully at every table and how they are accessed and try to define a clustered index on it if it makes sense". It's a plus, like a Joker, you have only one Joker per table, but you don't have to use it. Other database systems don't have this, at least in this form, BTW.
Putting clustered indices everywhere without understanding what you're doing can also kill your performance (in general, the INSERT performance because a clustered index means physical re-ordering on the disk, or at least it's a good way to understand it), for example with GUID primary keys as we see more and more.
So, read Tim Lehner's exceptions and reason.
Performance is a big hairy problem. Make sure you are optimizing for the right thing.
Free advice is always worth it's price, and there is no substitute for actual experimentation.
The purpose of an index is to find matching rows and help retrieve the data when found.
A non-clustered index on your search criteria will help to find rows, but there needs to be additional operation to get at the row's data.
If there is no clustered index, SQL uses an internal rowId to point to the location of the data.
However, If there is a clustered index on the table, that rowId is replaced by the data values in the clustered index.
So the step of reading the rows data would not be needed, and would be covered by the values in the index.
Even if a clustered index isn't very good at being selective, if those keys are frequently most or all of the results requested - it may be helpful to have them as the leaf of the non-clustered index.
Yes you should have clustered index on a table.So that all nonclustered indexes perform in better way.
Consider using a clustered index when Columns that contain a large number of distinct values so to avoid the need for SQL Server to add a "uniqueifier" to duplicate key values
Disadvantage : It takes longer to update records if only when the fields in the clustering index are changed.
Avoid clustering index constructions where there is a risk that many concurrent inserts will happen on almost the same clustering index value
Searches against a nonclustered index will appear slower is the clustered index isn't build correctly, or it does not include all the columns needed to return the data back to the calling application. In the event that the non-clustered index doesn't contain all the needed data then the SQL Server will go to the clustered index to get the missing data (via a lookup) which will make the query run slower as the lookup is done row by row.
Yes, every table should have a clustered index. The clustered index sets the physical order of data in a table. You can compare this to the ordering of music at a store, by bands name and or Yellow pages ordered by a last name. Since this deals with the physical order you can have only one it can be comprised by many columns but you can only have one.
It’s best to place the clustered index on columns often searched for a range of values. Example would be a date range. Clustered indexes are also efficient for finding a specific row when the indexed value is unique. Microsoft SQL will place clustered indexes on a PRIMARY KEY constraint automatically if no clustered indexes are defined.
Clustered indexes are not a good choice for:
Columns that undergo frequent changes
This results in the entire row moving (because SQL Server must keep
the data values of a row in physical order). This is an important
consideration in high-volume transaction processing systems where
data tends to be volatile.
Wide keys
The key values from the clustered index are used by all
nonclustered indexes as lookup keys and therefore are stored in each
nonclustered index leaf entry.
This question already has answers here:
What are the differences between a clustered and a non-clustered index?
(13 answers)
Closed 7 years ago.
I need to add proper index to my tables and need some help.
I'm confused and need to clarify a few points:
Should I use index for non-int columns? Why/why not
I've read a lot about clustered and non-clustered index yet I still can't decide when to use one over the other. A good example would help me and a lot of other developers.
I know that I shouldn't use indexes for columns or tables that are often updated. What else should I be careful about and how can I know that it is all good before going to test phase?
A clustered index alters the way that the rows are stored. When you create a clustered index on a column (or a number of columns), SQL server sorts the table’s rows by that column(s). It is like a dictionary, where all words are sorted in alphabetical order in the entire book.
A non-clustered index, on the other hand, does not alter the way the rows are stored in the table. It creates a completely different object within the table that contains the column(s) selected for indexing and a pointer back to the table’s rows containing the data. It is like an index in the last pages of a book, where keywords are sorted and contain the page number to the material of the book for faster reference.
You really need to keep two issues apart:
1) the primary key is a logical construct - one of the candidate keys that uniquely and reliably identifies every row in your table. This can be anything, really - an INT, a GUID, a string - pick what makes most sense for your scenario.
2) the clustering key (the column or columns that define the "clustered index" on the table) - this is a physical storage-related thing, and here, a small, stable, ever-increasing data type is your best pick - INT or BIGINT as your default option.
By default, the primary key on a SQL Server table is also used as the clustering key - but that doesn't need to be that way!
One rule of thumb I would apply is this: any "regular" table (one that you use to store data in, that is a lookup table etc.) should have a clustering key. There's really no point not to have a clustering key. Actually, contrary to common believe, having a clustering key actually speeds up all the common operations - even inserts and deletes (since the table organization is different and usually better than with a heap - a table without a clustering key).
Kimberly Tripp, the Queen of Indexing has a great many excellent articles on the topic of why to have a clustering key, and what kind of columns to best use as your clustering key. Since you only get one per table, it's of utmost importance to pick the right clustering key - and not just any clustering key.
GUIDs as PRIMARY KEY and/or clustered key
The clustered index debate continues
Ever-increasing clustering key - the Clustered Index Debate..........again!
Disk space is cheap - that's not the point!
Marc
You should be using indexes to help SQL server performance. Usually that implies that columns that are used to find rows in a table are indexed.
Clustered indexes makes SQL server order the rows on disk according to the index order. This implies that if you access data in the order of a clustered index, then the data will be present on disk in the correct order. However if the column(s) that have a clustered index is frequently changed, then the row(s) will move around on disk, causing overhead - which generally is not a good idea.
Having many indexes is not good either. They cost to maintain. So start out with the obvious ones, and then profile to see which ones you miss and would benefit from. You do not need them from start, they can be added later on.
Most column datatypes can be used when indexing, but it is better to have small columns indexed than large. Also it is common to create indexes on groups of columns (e.g. country + city + street).
Also you will not notice performance issues until you have quite a bit of data in your tables. And another thing to think about is that SQL server needs statistics to do its query optimizations the right way, so make sure that you do generate that.
A comparison of a non-clustered index with a clustered index with an example
As an example of a non-clustered index, let’s say that we have a non-clustered index on the EmployeeID column. A non-clustered index will store both the value of the
EmployeeID
AND a pointer to the row in the Employee table where that value is actually stored. But a clustered index, on the other hand, will actually store the row data for a particular EmployeeID – so if you are running a query that looks for an EmployeeID of 15, the data from other columns in the table like
EmployeeName, EmployeeAddress, etc
. will all actually be stored in the leaf node of the clustered index itself.
This means that with a non-clustered index extra work is required to follow that pointer to the row in the table to retrieve any other desired values, as opposed to a clustered index which can just access the row directly since it is being stored in the same order as the clustered index itself. So, reading from a clustered index is generally faster than reading from a non-clustered index.
In general, use an index on a column that's going to be used (a lot) to search the table, such as a primary key (which by default has a clustered index). For example, if you have the query (in pseudocode)
SELECT * FROM FOO WHERE FOO.BAR = 2
You might want to put an index on FOO.BAR. A clustered index should be used on a column that will be used for sorting. A clustered index is used to sort the rows on disk, so you can only have one per table. For example if you have the query
SELECT * FROM FOO ORDER BY FOO.BAR ASCENDING
You might want to consider a clustered index on FOO.BAR.
Probably the most important consideration is how much time your queries are taking. If a query doesn't take much time or isn't used very often, it may not be worth adding indexes. As always, profile first, then optimize. SQL Server Studio can give you suggestions on where to optimize, and MSDN has some information1 that you might find useful
faster to read than non cluster as data is physically storted in index order
we can create only one per table.(cluster index)
quicker for insert and update operation than a cluster index.
we can create n number of non cluster index.
I have a junction table in my SQL Server 2005 database that consist of two columns:
object_id (uniqueidentifier)
property_id (integer)
These values together make a compound primary key.
What's the best way to create this PK index for SELECT performance?
If the columns were two integers, I would just use a compound clustered index (the default). However, I've heard bad things about clustered indexes when uniqueidentifiers are involved.
Anyone have experience with this situation?
Yes, GUID's are really bad for clustered indexes, since the GUIDs is by design very random and thus leads to massive fragmentation and thus performance problems.
See Kim Tripp's blog - most notably "The CLustered Index Debate continues" and "GUIDs as PRIMARY and/or CLUSTERED key" - for a lot of valuable background info.
If you really need to have an index on these TWO columns, I'd suggest a non-clustered index - it can be a primary index - just better not a clustered index.
Marc
One alternative is to use what is known as a surrogate key (which incidentally can also be assigned as the primary key).
For example, adding an identity column that can be used to uniquely identify each row within the table i.e. a primary key.
Understand that a GUID is used to identify a record globally within SQL Server (which arguably is not a relationally correct practice however that is not a concern for us here).
The identity column, now also a primary key can/will have a clustered index applied. A separate, nonclustered index can then be applied to the compound key described by the original poster.
This practice avoids the issue of frequent page splits occurring within the clustered index (inserts into a random GUID primary key) as well as producing a smaller and more efficient clustered index, whilst also preserving the relationships defined within the database.
Surrogate Key Definition: http://en.wikipedia.org/wiki/Surrogate_key
I i would create an identity column & then make this your primary key & clustered index. You can then create non clustered indexes on objectid propertyid as needed.
You can create a unique constraint to ensure uniqueness of your key.
The reason for this is that the rows will be inserted sequentially, so your reducing page splits. in addition using an integer for your PK means you have a smaller value for your clustered index.
I have a series of questions about Keys, Indexes and Constraints in SQL, SQL 2005 in particular. I have been working with SQL for about 4 years but I have never been able to get definitive answers on this topic and there is always contradictory info on blog posts, etc. Most of the time tables I create and use just have an Identity column that is a Primary Key and other tables point to it via a Foreign Key.
With join tables I have no Identity and create a composite Primary Key over the Foreign Key columns. The following is a set of statements of my current beliefs, which may be wrong, please correct me if so, and other questions.
So here goes:
As I understand it the difference between a Clustered and Non Clustered Index (regardless of whether it is Unique or not) is that the Clustered Index affects the physical ordering of data in a table (hence you can only have one in a table), whereas a Non Clustered Index builds a tree data structure. When creating Indexes why should I care about Clustered vs Non Clustered? When should I use one or the other? I was told that inserting and deleting are slow with Non-Clustered indexes as the tree needs to be "rebuilt." I take it Clustered indexes do not affect performance this way?
I see that Primary Keys are actually just Clustered Indexes that are Unique (do they have to be clustered?). What is special about a Primary Key vs a Clustered Unique Index?
I have also seen Constraints, but I have never used them or really looked at them. I was told that the purpose of Constraints is that they are for enforcing data integrity, whereas Indexes are aimed at performance. I have also read that constraints are acually implemented as Indexes anyway so they are "the same." This doesnt sound right to me. How are constraints different to Indexes?
Clustered indexes are, as you put it correctly, the definition as to how data in a table is stored physically, i.e. you have a B-tree sorted using the clustering key and you have the data at the leaf level.
Non-clustered indexes on the other hand are separate tree structures which at the leaf level only have the clustering key (or a RID if the table is a heap), meaning that when you use a non-clustered index, you'll have to use the clustered index to get the other columns (unless your request is fully covered by the non-clustered index, which can happen if you request only the columns, which constitute the non-clustered index key columns).
When should you use one or the other ? Well, since you can have only one clustered index, define it on the columns which makes most sense, i.e. when you look up clients by ID most of the time, define a clustered index on the ID. Non-clustered indexes should be defined on columns which are used less often.
Regarding performance, inserts or updates that change the index key are always painfull, regardless of whether it is a clusted on non-clustered index, since page splits can happen, which forces data to be moved between pages (moving the pages of a clustered index hurts more, since you have more data in the leaf level). Thus the general rule is to avoid changing the index key and inserting new values so that they would be sequencial. Otherwise you'll encounter fragmentation and will have to rebuild your index on a regular basis.
Finally, regarding constraints, by definition, they have nothing to do with indexes, yet SQL server has chosen to implement them using indexes. E.g. currently, a unique constraint is implemented as an index, however this can change in a future version (though I doubt that will happen). The type of index (clustered or not) is up to you, just remember that you can have only one clustered index.
If you have more questions of this type, I highly recommend reading this book, which covers these topics in depth.
Your assumption about the clustered vs non-clustered is pretty good
It also seems that primary key enforces non null uniquenes, while the unique index does not enforce non null primary vs unique
The primary key is a logical concept in relational database theory - it's a key (and typically also an index) which is designed to uniquely identify any of your rows. Therefore it must be unique and it cannot be NULL.
The clustering key is a storage-physical concept of SQL Server specifically. It's a special index that isn't just used for lookups etc., but also defines the physical structure of your data in your table. In a printed phonebook in Western European culture (except maybe for Iceland ), the clustered index would be "LastName, FirstName".
Since the clustering index defines your physical data layout, you can only ever have one of those (or none - not recommended, though).
Requirements for a clustering key are:
must be unique (if not, SQL Server will add a 4-byte "uniqueifier")
should be stable (never changing)
should be as small as possible (INT is best)
should be ever-increasing (think: IDENTITY)
SQL Server makes your primary key the clustering key by default - but you can change that if you need to. Also, mind you: the columns that make up the clustering key will be added to each and every entry of each and every non-clustered index on your table - so you want to keep your clustering key as small as possible. This is because the clustering key will be used to do the "bookmark lookup" - if you found an entry in a non-clustered index (e.g. a person by their social security number) and now you need to grab the entire row of data to get more details, you need to do a lookup, and for this, the clustering key is used.
There's a great debate about what makes a good or useful clustering and/or primary key - here's a few excellent blog posts to read about this:
all of Kimberly Tripp's Indexing blog posts are a must-read
GUIDs as primary key and/or clustering key
The Clustered index debate continues....
Marc
You have several questions. I'll break some of them out:
When creating Indexes why should I care about Clustered vs Non Clustered?
Sometimes you do care how the rows are organized. It depends on your data and how you will use it. For example, if your primary key is a uniqueidentifier, you may not want it to be CLUSTERED, because GUID values are essentially random. This will cause SQL to insert rows randomly throughout the table, causing page splits which hurt performance. If your primary key value will always increment sequentially (int IDENTITY for example), then you probably want it to be CLUSTERED, so your table will always grow at the end.
A primary key is CLUSTERED by default, and most of the time you don't have to worry about it.
I was told that inserting and deleting are slow with Non-Clustered indexes as the tree needs to be "rebuilt." I take it Clustered indexes do not affect performance this way?
Actually, the opposite can be true. NONCLUSTERED indexes are kept as a separate data structure, but the structure is designed to allow some modification without needing to be "re-built". When the index is initially created, you can specify the FILLFACTOR, which specifies how much free space to leave on each page of the index. This allows the index to tolerate some modification before a page split is necessary. Even when a page split must occur, it only affects the neighboring pages, not the entire index.
The same behavior applies to CLUSTERED indexes, but since CLUSTERED indexes store the actual table data, page splitting operations on the index can be much more expensive because the whole row may need to be moved (versus just the key columns and the ROWID in a NONCLUSTERED index).
The following MSDN page talks about FILLFACTOR and page splits:
http://msdn.microsoft.com/en-us/library/aa933139(SQL.80).aspx
What is special about a Primary Key vs a Clustered Unique Index?
How are constraints different to Indexes?
For both of these I think it's more about declaring your intentions. When you call something a PRIMARY KEY you are declaring that it is the primary method for identifying a given row. Is a PRIMARY KEY physically different from a CLUSTERED UNIQUE INDEX? I'm not sure. The behavior is essentially the same, but your intentions may not be clear to someone working with your database.
Regarding constraints, there are many types of constraints. For a UNIQUE CONSTRAINT, there isn't really a difference between that and a UNIQUE INDEX, other than declaring your intention. There are other types of constraints that do not map directly to a type of index, such as CHECK constraints, DEFAULT constraints, and FOREIGN KEY constraints.
I don't have time to answer this in depth, so here is some info off the top of my head:
You're right about clustered indexes. They rearrange the physical data according to the sort order of the clustered index. You can use clustered indexes specifically for range-bound queries (e.g. between dates).
PKs are by default clustered, but they don't have to be. That's just a default setting. The PK is supposed to be a UID for the row.
Constraints can be implemented as indexes (for example, unique constraints), but can also be implemented as default values.