How do I switch off the default index on primary keys?
I don't want all my tables to be indexed (sorted), but they must have a primary key.
You can define a primary key index as NONCLUSTERED to prevent the table rows from being ordered according to the primary key, but you cannot define a primary key without some associated index.
Tables are always unsorted - there is no "default" order for a table and the optimiser may or may not choose to use an index if one exists.
In SQL Server an index is effectively the only way to implement a key. You get a choice between clustered or nonclustered indexes - that is all.
The means by which SQL Server implements Primary and Unique keys is by placing an index on those columns. So you cannot have a Primary Key (or Unique constraint) without an index.
You can tell SQL Server to use a nonclustered index to implement these constraints. If there are only nonclustered indexes on a table (or no indexes at all), you have a heap. It's pretty rare that this is what you actually want.
Just because a table has a clustered index, this in no way indicates that the rows of the table will be returned in the "order" defined by such an index - the fact that the rows are usually returned in that order is an implementation quirk.
And the actual code would be:
CREATE TABLE T (
Column1 char(1) not null,
Column2 char(1) not null,
Column3 char(1) not null,
constraint PK_T PRIMARY KEY NONCLUSTERED (Column2,Column3)
)
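To confirm how SQL Server implemented the key, one option is to look at the sys.indexes catalog view. This is only a sketch and assumes the table T above was created in the dbo schema:

```sql
-- Assumption: T lives in the dbo schema.
-- List every index (and the heap entry, if any) that exists for the table.
SELECT i.name, i.index_id, i.type_desc, i.is_primary_key
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('dbo.T');
```

With only the nonclustered primary key in place, you should see one row of type HEAP (with a NULL name) for the table itself and one NONCLUSTERED row for PK_T.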
What does "I dont want all my tables to be sorted" mean? If it means that you want the rows to appear in the order in which they were entered, there's only one way to guarantee it: have a field that stores that order (or the time, if you don't have a lot of transactions). In that case, you will want a clustered index on that field for best performance.
You might end up with a nonclustered PK (like the productId) AND a clustered unique index on your autonumber_or_timestamp field for maximum performance.
But that really depends on the reality you're trying to model, and your question contains too little information about it. DB design is NOT abstract thinking.
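A minimal sketch of that layout follows. The table and column names (dbo.Product, ProductId, EntrySeq, ProductName) are invented for illustration, and an IDENTITY column stands in for the "autonumber" field:

```sql
-- Hypothetical table: nonclustered primary key on the business identifier,
-- unique clustered index on an ever-increasing entry sequence.
CREATE TABLE dbo.Product (
    ProductId   INT               NOT NULL,
    EntrySeq    INT IDENTITY(1,1) NOT NULL,
    ProductName VARCHAR(50)       NOT NULL,
    CONSTRAINT PK_Product PRIMARY KEY NONCLUSTERED (ProductId)
);

CREATE UNIQUE CLUSTERED INDEX IX_Product_EntrySeq
    ON dbo.Product (EntrySeq);

-- Even with the clustered index, only ORDER BY guarantees the output order.
SELECT ProductId, ProductName
FROM dbo.Product
ORDER BY EntrySeq;
```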
Related
I have a table on which I want to create a clustered primary key.
CREATE TABLE dbo.SampleTable
(
C1 INT NOT NULL,
C2 INT NOT NULL )
The first way is to create the primary key itself as a clustered index.
ALTER TABLE dbo.SampleTable ADD CONSTRAINT IDX_SampleTable PRIMARY KEY CLUSTERED (C1, C2)
The second way is to add the primary key constraint as NONCLUSTERED and then create a separate clustered index on the same columns.
ALTER TABLE dbo.SampleTable ADD CONSTRAINT IDX_SampleTable PRIMARY KEY NONCLUSTERED (C1, C2)
CREATE CLUSTERED INDEX IDX_SampleTable2 ON dbo.SampleTable (C1, C2) -- cannot reuse the name of the constraint above
Is there a difference in performance between the two methods above?
Is either of them not recommended?
Yes, there is a difference. By specifying CLUSTERED, you instruct the database to store the data in a certain way: the table's rows themselves become the leaf level of the index and are kept in the logical order of the index key.
By creating a clustered primary key as in your first statement, the table is guaranteed to have unique values in (C1, C2), and the rows are always stored in that key order.
In the second example, you do NOT enforce this CLUSTERED behaviour through the primary key, but through a separate index. Though the effects are the same now, you might choose to remove (or temporarily disable) the index and then the data would no longer be guaranteed to get stored in a CLUSTERED fashion.
Bottom line: In practice these two statements are the same now, but might make a difference in the future because the CLUSTERED property is not integrated in the PK, but in a separate index.
Creating a Nonclustered Primary Key and then creating a Clustered index on the columns within the Primary key is not a good idea. Effectively you'll create 2 indexes on the columns (C1 and C2 in this case), however, it's very unlikely the nonclustered index will ever be used. This is because the Clustered Index is very likely going to be the first choice for the RDBMS, as the pages will be in the order of the Clustered Index. Also, when using a non-clustered index the data engine will still need to refer to the Clustered Index afterwards, to find out the exact location of the row (in the pages).
If you do want a clustered index on your Primary Key(s) then create the key as a Clustered Primary Key. This is not to say that your Primary Key should always be Clustered, but that is a very different subject.
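If the table was already set up the second way, it could be converted to a single clustered primary key roughly as follows. This is a sketch against the names from the question; PK_SampleTable is a name chosen here for illustration:

```sql
-- Drop the nonclustered primary key first, so it is not rebuilt
-- when the clustered index underneath it is removed.
ALTER TABLE dbo.SampleTable DROP CONSTRAINT IDX_SampleTable;
DROP INDEX IDX_SampleTable2 ON dbo.SampleTable;

-- One index now serves as both the primary key and the clustered index.
-- PK_SampleTable is a hypothetical name.
ALTER TABLE dbo.SampleTable
    ADD CONSTRAINT PK_SampleTable PRIMARY KEY CLUSTERED (C1, C2);
```

On a large table each of these statements rewrites or rebuilds data, so it is worth trying on a copy first.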
This depends on your data:
https://learn.microsoft.com/en-gb/sql/relational-databases/indexes/clustered-and-nonclustered-indexes-described?view=sql-server-2017
Clustered indexes sort and store the data rows in the table or view
based on their key values. These are the columns included in the index
definition. There can be only one clustered index per table, because
the data rows themselves can be stored in only one order.
So the clustered key influences the layout of your physical data structure.
Consider a sql table Employer.
Column A -- (int) Unique identity column. Not used in select queries as part of the where clause.
Column B -- (int) Non unique column. Used in select queries as part of the where clause very often.
Which of the index choices below is better for designing the table to achieve good performance with low maintenance?
1) 1 clustered, unique, primary key on Column A and 1 non clustered index on column B
OR
2) 1 clustered, unique, primary key on Column B, A (Composite primary key)
OR
3) 1 non clustered, unique, primary key on Column A and 1 clustered index on column B
Any other suggestions are also welcome. Thanks in advance
Go with option 1.
When creating a clustered primary key, always aim for the narrowest possible column, because that key is carried in every other index on the table. Also, I hope that by identity you don't mean a randomly generated value, because that will hurt your write performance.
Choose the first. The second will take up more space, because the clustered key is a composite and is repeated in every other index, and it is harder to maintain. With the first you get a small index that is easy to maintain.
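Option 1 could be written along these lines. The column names A and B come from the question; the table definition, constraint name and index name are assumptions made for the sketch:

```sql
-- Option 1: narrow clustered primary key on the identity column,
-- plus a nonclustered index on the column used in WHERE clauses.
CREATE TABLE dbo.Employer (
    A INT IDENTITY(1,1) NOT NULL,
    B INT               NOT NULL,
    CONSTRAINT PK_Employer PRIMARY KEY CLUSTERED (A)
);

CREATE NONCLUSTERED INDEX IX_Employer_B
    ON dbo.Employer (B);
```

If the frequent queries on B also return other columns, adding them with INCLUDE on IX_Employer_B may avoid lookups into the clustered index.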
Given the database table:
UserID (PK)
SomeTypeID (PK)
SomeSubTypeID (PK)
Data
And you wish to query:
SELECT Data FROM Table WHERE UserID = {0} AND SomeTypeID = {1} AND SomeSubTypeID = {2}
Would you need to create the index UserID, SomeTypeID, SomeSubTypeID or does the fact they form the primary key mean this is not needed?
If you created your primary key as:
CREATE TABLE TBL (UserID INT NOT NULL, SomeTypeID INT NOT NULL,
    SomeSubType INT NOT NULL, Data VARCHAR(100), -- column types are assumed
    CONSTRAINT PK PRIMARY KEY (UserID, SomeTypeID, SomeSubType));
Then the default index that is being created is a CLUSTERED index.
Usually (so not all times), when looking for data, you would want your queries to use a NON-CLUSTERED index to filter rows, where the columns you use to filter rows will form the key of the index and the information (column) that you return from those rows as an INCLUDED column, in this case DATA, like below:
CREATE NONCLUSTERED INDEX ncl_indx
ON TBL (UserID, SomeTypeID, SomeSubType) INCLUDE (Data);
By doing this, you're avoiding accessing the table data, through the CLUSTERED index.
But, you can specify the type of index that you want your PRIMARY KEY to be, so:
CREATE TABLE TBL (UserID INT NOT NULL, SomeTypeID INT NOT NULL,
    SomeSubType INT NOT NULL, Data VARCHAR(100), -- column types are assumed
    CONSTRAINT PK PRIMARY KEY NONCLUSTERED (UserID, SomeTypeID, SomeSubType));
Buuut, because you want this to be defined as a PRIMARY KEY then you are not able to use the INCLUDE functionality, so you can't avoid the disk lookup in order to get the information from the DATA column, which is where you basically are with having the default CLUSTERED index.
Buuuuuut, there's still a way to ensure the uniqueness that the Primary Key gives you and benefit from the INCLUDE functionality, so as to do fewer disk I/Os.
You can specify your NONCLUSTERED INDEX as UNIQUE which will ensure that all of your 3 columns that make up the index key are unique.
CREATE UNIQUE NONCLUSTERED INDEX ncl_indx
ON TBL (UserID, SomeTypeID, SomeSubType) INCLUDE (Data);
If you do all of this, your table is going to be a HEAP, because it no longer has a clustered index, and that is usually not a good thing. If you've given the design of your tables good thought and decided that the best clustering key for your CLUSTERED INDEX is (UserID, SomeTypeID, SomeSubType), then it's best to leave everything as you currently have it.
Otherwise, if you have decided on a different clustering key then you can add this unique nonclustered index, if you're going to query the table as you said you will.
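As long as you use all the columns of your primary key when filtering, you don't need to create separate indexes. Your primary key is OK in your example.
Think about creating a separate index if you plan to filter on only some of the columns. Note that a filter on the leading column alone, for example SELECT Data FROM Table WHERE UserID = {0}, can still seek on the primary key index; it is a filter on SomeTypeID or SomeSubTypeID without UserID that needs its own index.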
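A composite index can only be seeked on a leading prefix of its key columns, so a filter that skips UserID gets little help from the primary key. A sketch of an extra index for that case, using the TBL definition from the earlier answer (the index name is made up, and SomeTypeID as the filter column is just an example):

```sql
-- Hypothetical supporting index for queries such as:
--   SELECT Data FROM TBL WHERE SomeTypeID = @SomeTypeID
-- Data is included so the query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_TBL_SomeTypeID
    ON TBL (SomeTypeID) INCLUDE (Data);
```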
I am using SQL Server 2012 & am creating a table that will have 8 columns, types below
datetime
varchar(12)
varchar(6)
varchar(100)
float
float
int
datetime
Once a day (normally) there will be an upload of approx 10,000 rows of data. Going forward it's possible it could be 100,000.
The rows will be unique if I group on the first three columns listed above. I have read I can use the unique constraint on multiple columns which will guarantee the rows are unique.
I think I'm correct in saying that the unique constraint by default sets up a non-clustered index. Would a clustered index be better, and can I assume this won't cause any issues when the table starts to contain millions of rows?
My last question. By applying the unique constraint on my table I am right to say querying the data will be quicker than if the unique constraint wasn't applied (because of the non-clustering or clustering) & uploading the data will be slower (which is fine) with the constraint on the table?
Unique index can be non-clustered.
Primary key is unique and can be clustered
Clustered index is not unique by default
Unique clustered index is unique :)
You can get more information from this guide.
So, we should separate uniqueness from index keys.
If you need to keep data unique on some columns, create a unique constraint (unique index). That protects your data.
You can also create a primary key (PK) on those columns, and they will be unique as well. But there is a difference: all other indexes reference rows by the clustering key, which is the PK by default, so the PK should be as short as possible. So my advice is to create an identity column (int or bigint) and put the PK on it, and to create a unique index on your unique columns.
Querying may become faster if your queries filter on the unique columns. If you query on other columns, you need to create other, specific indexes.
So: unique keys are for data consistency, indexes are for queries.
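Put together, that advice might look like the sketch below. The question only lists column types, so every table and column name here is invented for illustration:

```sql
-- Hypothetical names throughout; only the data types come from the question.
CREATE TABLE dbo.DailyUpload (
    Id          INT IDENTITY(1,1) NOT NULL,
    ReadingDate DATETIME          NOT NULL,
    Code12      VARCHAR(12)       NOT NULL,
    Code6       VARCHAR(6)        NOT NULL,
    Label       VARCHAR(100)      NULL,
    Value1      FLOAT             NULL,
    Value2      FLOAT             NULL,
    Quantity    INT               NULL,
    LoadedAt    DATETIME          NULL,
    -- Short surrogate key as the clustered primary key.
    CONSTRAINT PK_DailyUpload PRIMARY KEY CLUSTERED (Id),
    -- Uniqueness of the three business columns, enforced by a nonclustered index.
    CONSTRAINT UQ_DailyUpload_Key
        UNIQUE NONCLUSTERED (ReadingDate, Code12, Code6)
);
```

Because ReadingDate leads the unique index, queries that filter on the first datetime column can also use it.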
I think I'm correct in saying that the unique constraint by default
sets up non-clustered index
TRUE
Would a clustered index be better & assuming when the table starts to
contain millions of rows this won't cause any issues?
(1) If you need to make (datetime, varchar(12), varchar(6)) unique, and
(2) if your application, or you, will access rows using datetime, or datetime, varchar(12), or datetime, varchar(12), varchar(6) in the WHERE condition
ALL the time,
then put the primary key on (datetime, varchar(12), varchar(6)).
By default that enforces uniqueness and creates a clustered index on all three columns.
but as you commented above:
the queries will vary to be honest. I imagine most queries will make
use of the first datetime column
and you will deal with huge data and might join this table with other tables,
so it's better to have a surrogate key (an ever-increasing unique identifier) in the table and, to satisfy your SELECTs,
have non-clustered indexes.
Surrogate Key vs Business Key
NON-CLUSTERED INDEX
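For comparison, the first option in this answer, a primary key directly on the three columns, could be sketched like this. All names are hypothetical and only a single non-key column is shown:

```sql
-- Hypothetical names; the natural key doubles as the clustered index.
CREATE TABLE dbo.DailyUploadNatural (
    ReadingDate DATETIME    NOT NULL,
    Code12      VARCHAR(12) NOT NULL,
    Code6       VARCHAR(6)  NOT NULL,
    Amount      FLOAT       NULL,
    CONSTRAINT PK_DailyUploadNatural
        PRIMARY KEY CLUSTERED (ReadingDate, Code12, Code6)
);

-- A filter on the leading datetime column can seek on the clustered key.
SELECT ReadingDate, Code12, Code6, Amount
FROM dbo.DailyUploadNatural
WHERE ReadingDate >= '20240101' AND ReadingDate < '20240102';
```

The trade-off is that this wider key is copied into every nonclustered index added later, which is the reason the answer leans towards a surrogate key for a large table that is joined often.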
Observe the following table model:
CREATE TABLE [site].[Permissions] (
[ID] INT REFERENCES [site].[Accounts]( [ID] ) NOT NULL,
[Type] SMALLINT NOT NULL,
[Value] INT NULL
);
The site.Accounts->site.Permissions is a one-to-many relationship so 'ID' cannot be made a primary key due to the uniqueness that a PK imposes.
The rows are selected using a WHERE [ID] = ? clause, so adding a phoney IDENTITY column and making it the PK yields no benefit at the cost of additional disk space.
It's my understanding that the targeted platform, SQL Server (2008), does not support composite PKs. These all add up to my question: if a primary key is not used, is something wrong? Or could something be more right?
Your understanding is not correct, SQL Server does support composite primary keys!
The syntax to add one would be
ALTER TABLE [site].[Permissions]
ADD CONSTRAINT PK_Permissions PRIMARY KEY CLUSTERED (id,[Type])
Regarding the question in the comments "What is the benefit of placing a PK on the entire table?"
I'm not sure from your description though what the PK would need to be on. Is it all 3 columns or just 2 of them? If it's on id,[Type] then presumably you wouldn't want the possibility that the same id,[Type] combo could appear multiple times with conflicting values.
If it is on all 3 columns then to turn the question around why wouldn't you want a primary key?
If you are going to have a clustered index on your table you could just make that the primary key. If say you made a clustered index on the id column only SQL Server would add in uniqueifiers anyway to make it unique and your columns are so narrow (int,smallint,int) this just seems a pointless addition.
Additionally the query optimiser can use unique constraints to improve its query plans (though might not apply if the only queries on that table really are WHERE [ID] = ?) and it would be pretty wasteful to allow duplicates that you then have to both store and filter out with DISTINCT.
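If (ID, Type) turns out to be the right key, the table from the question could be declared with the composite primary key from the start. A sketch, assuming [site].[Accounts] already exists and that one row per (ID, Type) is what the model needs:

```sql
-- Assumption: each account has at most one row per permission type.
CREATE TABLE [site].[Permissions] (
    [ID]    INT      NOT NULL REFERENCES [site].[Accounts]([ID]),
    [Type]  SMALLINT NOT NULL,
    [Value] INT      NULL,
    CONSTRAINT PK_Permissions PRIMARY KEY CLUSTERED ([ID], [Type])
);
```

Because [ID] is the leading column of the clustered key, the WHERE [ID] = ? lookups can seek directly on it, and no extra IDENTITY column is needed.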