I am just wondering: can we create a primary key on a table in SQL Server without any type of index on it?
No. As an implementation detail, SQL Server maintains the primary key using an index. You cannot prevent it from doing so. The primary key:
Ensures that no duplicate key values exist
Allows individual rows to be identified/accessed
SQL Server already has mechanisms that offer these features - unique indexes - so it uses those to enforce the constraint.
You can create a table with a primary key that is not a clustered index by adding the keyword NONCLUSTERED after the primary key word.
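To illustrate the NONCLUSTERED keyword, here is a minimal sketch. The table and constraint names (dbo.Customer, PK_Customer) are made up for the example; the catalog query at the end is just one way to see which index backs the key.

```sql
-- Hypothetical table; all names are for illustration only.
CREATE TABLE dbo.Customer
(
    CustomerId INT           NOT NULL,
    Name       NVARCHAR(100) NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY NONCLUSTERED (CustomerId)
);

-- The table is stored as a heap, plus one unique nonclustered index
-- that enforces the primary key.
SELECT name, type_desc, is_unique, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Customer');
```

The primary key is still enforced by an index here; it is just not the clustered one.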
Indexing works the same way a book is navigated: you cannot go to a particular page or topic unless the pages are numbered and the numbers are tied to the topics. In SQL Server this "paging" (ordering) of rows is done by the clustered index. If a table has no PK, you can add a clustered index on any column(s), ideally ones that qualify as a unique key. A table cannot have more than one clustered index, and when one exists the nonclustered indexes use its key to locate rows. A PK always needs a unique index behind it, although that index does not have to be the clustered one.
By default, a SQL Server table's clustered index is the PK. If:
the key is a GUID, and
I generate it at random (a.k.a. Guid.NewGuid()), and
I forget to move the clustered index to a more meaningful column,
will SQL Server reorder the pages as each new record enters the table, or will it just "ignore" the clustered part of the index?
I think you have it reversed: by default, the PK is also the clustered index. You must choose the data type and the columns to include in the PK; SQL Server won't set a default PK for you. A table without a clustered index is a heap.
Using a randomly generated GUID as a clustered PK is bad practice: new rows land at random positions in the index, which causes unnecessary page splits on INSERT. If the data does not have a natural key, use an IDENTITY column instead.
So you create a PK and it defaults to clustered, then you change it to a nonclustered index. What makes you think a clustered index is still hanging around? Once the PK is nonclustered there is no clustered part to the index. Unless you create a clustered index on some other column, the table is a heap.
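If you want to check this for yourself, something along these lines should do. It is only a sketch: dbo.Visit and its columns are hypothetical, and it assumes no foreign keys reference the key while it is being re-created.

```sql
-- Hypothetical table with a GUID primary key.
CREATE TABLE dbo.Visit
(
    VisitId   UNIQUEIDENTIFIER NOT NULL,
    StartedAt DATETIME         NOT NULL,
    CONSTRAINT PK_Visit PRIMARY KEY (VisitId)  -- clustered by default
);

-- Re-create the key as nonclustered.
ALTER TABLE dbo.Visit DROP CONSTRAINT PK_Visit;
ALTER TABLE dbo.Visit
    ADD CONSTRAINT PK_Visit PRIMARY KEY NONCLUSTERED (VisitId);

-- Lists a HEAP row and a NONCLUSTERED row; no CLUSTERED row remains.
SELECT name, type_desc, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Visit');
```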
I've been using SQL Server for quite a while, and I always create my tables in the design view.
The steps I take to create a table are:
Right-click Tables -> New Table
-> Add the first column, always SOMETHING_ID (int), and make it an identity column with an increment of 1
-> Add the other columns
-> Save and use
As you can see, I never right-click SOMETHING_ID and set it as the primary key.
Will there be any performance impact in this case?
Yes, it can impact performance, because creating the primary key also creates an index on it. When you join tables on that key, having the index improves performance greatly, particularly if you have a lot of data.
What you really need to do is create a clustered index. A primary key is, by default, backed by a clustered index (but you can create a primary key that is not clustered). A table without a clustered index is called a heap, and except for very special cases you should have a clustered index on every table. A primary key is enforced by an index that holds only unique values and allows no NULLs in any key column.
A query that uses a clustered index is usually very efficient. Without a clustered index (even if the table has other indexes), a heap can end up with forwarded records all over the place, and finding all the rows for a given customer can require SQL Server to read many, many pages.
To create a clustered index on a table you can use syntax such as
create clustered index ix1_table1 on table1(id)
The column(s) used in an index of any kind can appear anywhere in the table and do not have to be identity columns.
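For a table that was built in the designer without a key, the primary key can also be added afterwards. A rough sketch, assuming a hypothetical table dbo.Something shaped like the one in the question:

```sql
-- Hypothetical table as the designer would leave it: identity column, no key.
CREATE TABLE dbo.Something
(
    SOMETHING_ID INT IDENTITY(1, 1) NOT NULL,
    Description  NVARCHAR(200)      NULL
);

-- Adding the constraint later builds a unique clustered index on the column.
-- CLUSTERED is spelled out here, although it is the default when the table
-- has no clustered index yet.
ALTER TABLE dbo.Something
    ADD CONSTRAINT PK_Something PRIMARY KEY CLUSTERED (SOMETHING_ID);
```

This gives you the clustered index and the key in one step, so there is no need for a separate create clustered index statement on the same column.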
By not creating a primary key you are breaking the rules of First Normal Form.
Disadvantages of not having a primary key:
Duplicate rows can get in
The table won't get a clustered index by default
You can't define foreign keys against the table unless the referenced column(s) have a unique constraint or unique index
In SQL Server, I have a non nullable column with a unique clustered index on it.
If I make this column a Primary Key the exact same index is created automatically plus
the column is recognized as a Primary Key.
I understand the abstract/semantic difference.
(Primary Key identifies the entity, while any other column with this index may not.
For example, a Person can have Email field which is Unique,Non-nullable... but can be changed)
But what bothers me is the actual difference when it comes to the DB engine itself.
What happens if I just create an Id column, make it non-nullable, create a unique clustered index on it, and make it an identity column, but without the Primary Key constraint?
In what scenarios does the Primary Key constraint come into play?
(I've looked at many related questions before asking this, but all the answers I saw ended up with an abstract/theoretical explanation).
Nothing will be different really. You specify PRIMARY KEY to relay your intentions, not so that the engine does anything differently. When constructing a query plan, the optimizer will still use the uniqueness for all of its properties, and will still use the clustered index for all of its properties, regardless of whether you technically created it as a PRIMARY KEY. When creating a FOREIGN KEY, you can still reference the column(s) specified as unique (clustered or not). The difference is solely in the metadata (sys.indexes.is_primary_key) and in SSMS' representation to you (oh and the fact that you can create a unique clustered index on a NULLable column, but you can't create a PRIMARY KEY on that column).
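A small sketch of what that looks like in practice. The tables dbo.Person and dbo.PersonPhone are invented for the example; the point is only that the foreign key and the metadata behave as described above.

```sql
-- Hypothetical parent table: unique clustered index, but no PRIMARY KEY.
CREATE TABLE dbo.Person
(
    Id    INT IDENTITY(1, 1) NOT NULL,
    Email NVARCHAR(256)      NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX UX_Person_Id ON dbo.Person (Id);

-- The foreign key references the unique index on dbo.Person (Id).
CREATE TABLE dbo.PersonPhone
(
    PersonId INT NOT NULL
        CONSTRAINT FK_PersonPhone_Person REFERENCES dbo.Person (Id),
    Phone    VARCHAR(32) NOT NULL
);

-- is_unique is 1 and is_primary_key is 0 for UX_Person_Id.
SELECT name, type_desc, is_unique, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Person');
```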
In fact there are many cases where you want to completely separate the clustered index from the PRIMARY KEY. If you have a table where the PK is a GUID, for example, and you are typically running date range queries against the table, you are probably better off having the PK be non-clustered and have a clustered index on a naturally increasing column (the datetime column) - both to minimize page splits on heavy insert activity and also to best assist date range queries. The non-clustered index will be perfectly fine for looking up individual GUIDs. (I wanted to mention that because a lot of people think the primary key has to be clustered. Not true.)
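Roughly, that design could look like the following. The table dbo.OrderEvent, its columns and the date literals are all assumptions made for the example.

```sql
-- Hypothetical table: GUID primary key, queried mostly by date range.
CREATE TABLE dbo.OrderEvent
(
    EventId   UNIQUEIDENTIFIER NOT NULL,
    CreatedAt DATETIME         NOT NULL,
    Payload   NVARCHAR(400)    NULL,
    CONSTRAINT PK_OrderEvent PRIMARY KEY NONCLUSTERED (EventId)
);

-- Cluster on the naturally increasing column instead of the GUID.
CREATE CLUSTERED INDEX CIX_OrderEvent_CreatedAt
    ON dbo.OrderEvent (CreatedAt);

-- A date range query can seek on the clustered index...
SELECT EventId, CreatedAt, Payload
FROM dbo.OrderEvent
WHERE CreatedAt >= '20240101' AND CreatedAt < '20240201';

-- ...while single-row lookups by GUID use the nonclustered PK index.
SELECT EventId, CreatedAt
FROM dbo.OrderEvent
WHERE EventId = '00000000-0000-0000-0000-000000000000';
```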
Also interesting to note that if you create a PRIMARY KEY constraint, then create a unique clustered index with the same name using DROP_EXISTING, the is_primary_key column will still be 1 and Object Explorer will still show the index name under Keys.
Here is one scenario: a lot of code-to-data mapping frameworks look at the database metadata (primary keys, foreign keys, etc.) to determine how code is executed. For example, Hibernate requires a primary key.
A typical scenario might be generating a where clause for an update.
How does the PRIMARY KEY keyword relate to clustered indexes in SQL Server?
(Some people seem to want to answer this question instead of a different question I asked, so I am giving them a better place to do so.)
By default, a PRIMARY KEY is implemented as a clustered index. However, you can back it with a nonclustered index as well (by specifying the NONCLUSTERED option in its declaration).
A clustered index is not necessarily a PRIMARY KEY. It can even be non-unique (in this case, a hidden column called the uniqueifier is added to tell duplicate key values apart).
Note that a clustered index is not really an index (i. e. a projection of a table ordered differently, with the references to original records). It is the table itself, with the original records ordered.
When you create a clustered index, you don't really "create" anything that you can drop apart from the table. You just rearrange the table itself and change the way the records are stored.
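As a sketch of the non-unique case (dbo.Measurement and its columns are hypothetical):

```sql
-- Hypothetical table with no primary key at all.
CREATE TABLE dbo.Measurement
(
    SensorId INT      NOT NULL,
    ReadAt   DATETIME NOT NULL,
    Reading  FLOAT    NOT NULL
);

-- No UNIQUE keyword: many rows may share the same SensorId.
CREATE CLUSTERED INDEX CIX_Measurement_SensorId
    ON dbo.Measurement (SensorId);

-- Dropping the clustered index keeps the rows; the table goes back to a heap.
DROP INDEX CIX_Measurement_SensorId ON dbo.Measurement;
```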
The clustered index of a table is normally defined on the primary key columns.
This, however is not a strict requirement.
From MSDN:
When you create a PRIMARY KEY constraint, a unique clustered index on the column or columns is automatically created if a clustered index on the table does not already exist and you do not specify a unique nonclustered index.
And:
You can create a clustered index on a column other than primary key column if a nonclustered primary key constraint was specified.
A primary key is, as the name implies, the primary unique identifier for a row in your table. A clustered index physically orders the data according to the index. Although SQL Server will cluster a primary key by default, there is no direct relationship between the two.
I have a junction table in my SQL Server 2005 database that consists of two columns:
object_id (uniqueidentifier)
property_id (integer)
These values together make a compound primary key.
What's the best way to create this PK index for SELECT performance?
If the columns were two integers, I would just use a compound clustered index (the default). However, I've heard bad things about clustered indexes when uniqueidentifiers are involved.
Anyone have experience with this situation?
Yes, GUIDs are really bad for clustered indexes: they are by design very random, which leads to massive fragmentation and therefore performance problems.
See Kim Tripp's blog - most notably "The Clustered Index Debate continues" and "GUIDs as PRIMARY and/or CLUSTERED key" - for a lot of valuable background info.
If you really need an index on these two columns, I'd suggest a nonclustered index. It can still be the primary key, just better not a clustered one.
Marc
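For the junction table in the question, a nonclustered compound primary key might be declared like this. The table name dbo.ObjectProperty is an assumption; the two columns are the ones from the question.

```sql
-- Hypothetical name for the junction table described in the question.
CREATE TABLE dbo.ObjectProperty
(
    object_id   UNIQUEIDENTIFIER NOT NULL,
    property_id INT              NOT NULL,
    CONSTRAINT PK_ObjectProperty
        PRIMARY KEY NONCLUSTERED (object_id, property_id)
);
```

With no clustered index defined, the table is stored as a heap and the compound key is enforced by a unique nonclustered index.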
One alternative is to use what is known as a surrogate key (which incidentally can also be assigned as the primary key).
For example, adding an identity column that can be used to uniquely identify each row within the table i.e. a primary key.
Understand that a GUID is used to identify a record globally within SQL Server (which arguably is not a relationally correct practice; however, that is not a concern for us here).
The identity column, now also a primary key can/will have a clustered index applied. A separate, nonclustered index can then be applied to the compound key described by the original poster.
This practice avoids the issue of frequent page splits occurring within the clustered index (inserts into a random GUID primary key) as well as producing a smaller and more efficient clustered index, whilst also preserving the relationships defined within the database.
Surrogate Key Definition: http://en.wikipedia.org/wiki/Surrogate_key
I would create an identity column and make it your primary key and clustered index. You can then create nonclustered indexes on object_id and property_id as needed.
You can create a unique constraint to ensure uniqueness of your key.
The reason for this is that rows will be inserted sequentially, so you're reducing page splits. In addition, using an integer for your PK means you have a smaller value for your clustered index.
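A minimal sketch of that layout, assuming a hypothetical table name and an added id column (neither is from the question):

```sql
-- Hypothetical table: surrogate identity key plus the original column pair.
CREATE TABLE dbo.ObjectPropertySurrogate
(
    id          INT IDENTITY(1, 1) NOT NULL,
    object_id   UNIQUEIDENTIFIER   NOT NULL,
    property_id INT                NOT NULL,
    -- Narrow, ever-increasing clustered key: new rows go to the end.
    CONSTRAINT PK_ObjectPropertySurrogate PRIMARY KEY CLUSTERED (id),
    -- The unique constraint keeps the pair unique and gives lookups
    -- by (object_id, property_id) a nonclustered index to use.
    CONSTRAINT UQ_ObjectPropertySurrogate_Pair
        UNIQUE NONCLUSTERED (object_id, property_id)
);
```

The trade-off is one extra column and one extra index compared with a compound primary key on the two original columns.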