I have a table that I intended to partition by a nullable column.
This seems to work just fine except for the primary key. I get an error:
Partition columns for a unique index must be a subset of the index key
I have considered two workarounds:
Create a primary key on a different filegroup. This doesn't work because it removes the partitioning.
Skip the primary key altogether and create a non-unique clustered index. This won't work either, because I need a primary key.
Any idea on how I can get a primary key on a partitioned table where the partition column is nullable? If not, I am open to suggestions on how to handle it another way.
Thanks in advance.
I'm not sure what is really blocking you. You can create the primary key on your unique column and keep the partition column nullable. Just don't make that primary key your unique clustered index on its own: when you create the unique clustered index, put the PK column and the partition column together in its key.
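A minimal sketch of that arrangement (all table, column, and partition object names here are assumptions, since none were given):
CREATE PARTITION FUNCTION pf_ByDate (datetime)
    AS RANGE RIGHT FOR VALUES ('2023-01-01', '2023-02-01');
CREATE PARTITION SCHEME ps_ByDate
    AS PARTITION pf_ByDate ALL TO ([PRIMARY]);

CREATE TABLE dbo.Orders
(
    OrderID   int          NOT NULL,
    OrderDate datetime     NULL,   -- nullable partition column, kept out of the PK
    Payload   varchar(100) NULL,
    -- nonclustered PK on the unique column alone, stored outside the partition scheme
    CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (OrderID) ON [PRIMARY]
);

-- The unique clustered index carries the PK column and the partition column together,
-- so the partition column is a subset of the index key and the error goes away.
CREATE UNIQUE CLUSTERED INDEX CIX_Orders
    ON dbo.Orders (OrderID, OrderDate)
    ON ps_ByDate (OrderDate);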
Given the database table:
UserID (PK)
SomeTypeID (PK)
SomeSubTypeID (PK)
Data
And you wish to query:
SELECT Data FROM Table WHERE UserID = {0} AND SomeTypeID = {1} AND SomeSubTypeID = {2}
Would you need to create the index UserID, SomeTypeID, SomeSubTypeID or does the fact they form the primary key mean this is not needed?
If you created your primary key as:
CREATE TABLE TBL (
    UserID int, SomeTypeID int, SomeSubType int, Data varchar(100),  -- column types are illustrative
    CONSTRAINT PK PRIMARY KEY (UserID, SomeTypeID, SomeSubType)
);
Then the default index that is being created is a CLUSTERED index.
Usually (though not always), when looking for data you want your queries to use a NONCLUSTERED index to filter rows: the columns you use to filter rows form the key of the index, and the column you return from those rows, in this case Data, is added as an INCLUDED column, like below:
CREATE NONCLUSTERED INDEX ncl_indx
ON TBL (UserID, SomeTypeID, SomeSubType) INCLUDE (Data);
By doing this, you're avoiding accessing the table data through the CLUSTERED index.
But, you can specify the type of index that you want your PRIMARY KEY to be, so:
CREATE TABLE TBL (
    UserID int, SomeTypeID int, SomeSubType int, Data varchar(100),
    CONSTRAINT PK PRIMARY KEY NONCLUSTERED (UserID, SomeTypeID, SomeSubType)
);
But, because this is defined as a PRIMARY KEY, you are not able to use the INCLUDE functionality, so you can't avoid the lookup into the base table to get the Data column, which leaves you essentially where the default CLUSTERED index had you.
But there is still a way to keep the uniqueness guarantee that the PRIMARY KEY gives you and benefit from the INCLUDE functionality, so as to do as few disk I/Os as possible.
You can specify your NONCLUSTERED INDEX as UNIQUE, which will ensure that the combination of the 3 columns that make up the index key is unique.
CREATE UNIQUE NONCLUSTERED INDEX ncl_indx
ON TBL (UserID, SomeTypeID, SomeSubType) INCLUDE (Data);
By doing all of this, though, your table is going to be a HEAP, which is not a very good thing. If you've given your table design a good thought and decided that the best clustering key for your CLUSTERED INDEX is (UserID, SomeTypeID, SomeSubType), then it's best to leave everything as you currently have it.
Otherwise, if you have decided on a different clustering key then you can add this unique nonclustered index, if you're going to query the table as you said you will.
As long as you use all the columns of your primary key when filtering, you don't need to create separate indexes. Your primary key is fine in your example.
Think about creating a separate index if you plan to filter on one of the columns and not the others, for example: SELECT Data FROM Table WHERE UserID = {0}. A sketch of such an index follows below.
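As a minimal sketch (the index name is an assumption), a covering index for that single-column filter could look like:
CREATE NONCLUSTERED INDEX IX_TBL_UserID
ON TBL (UserID) INCLUDE (Data);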
I am using SQL Server 2012 & am creating a table that will have 8 columns, types below
datetime
varchar(12)
varchar(6)
varchar(100)
float
float
int
datetime
Once a day (normally) there will be an upload of approx 10,000 rows of data. Going forward it's possible it could be 100,000.
The rows will be unique if I group on the first three columns listed above. I have read I can use the unique constraint on multiple columns which will guarantee the rows are unique.
I think I'm correct in saying that the unique constraint by default sets up a non-clustered index. Would a clustered index be better, & am I right to assume this won't cause any issues when the table starts to contain millions of rows?
My last question: am I right to say that with the unique constraint applied, querying the data will be quicker than without it (because of the non-clustered or clustered index behind it), & that uploading the data will be slower (which is fine) with the constraint on the table?
Unique index can be non-clustered.
Primary key is unique and can be clustered
Clustered index is not unique by default
Unique clustered index is unique :)
More information you can get from this guide.
So, we should separate uniqueness from index keys.
If you need to keep data unique by some columns, create a unique constraint (a unique index). You'll protect your data.
Also, you can create a primary key (PK) on your columns; they will be unique as well. But there is a difference: all other indexes will use the PK for referencing rows, so the PK must be as short as possible. So, my advice: create an identity column (int or bigint) and create the PK on it, and create a unique index on your unique columns. A sketch follows below.
Querying data may become faster if your queries are on your unique columns; if you query on other columns, you need to create other, specific indexes.
So: unique keys for data consistency, indexes for queries.
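A minimal sketch of that layout; the column names for the question's first three columns are assumptions, since none were given:
CREATE TABLE dbo.DailyUpload
(
    ID          bigint IDENTITY(1,1) NOT NULL,
    ReadingTime datetime    NOT NULL,
    Code        varchar(12) NOT NULL,
    SubCode     varchar(6)  NOT NULL,
    -- remaining five columns omitted for brevity
    CONSTRAINT PK_DailyUpload PRIMARY KEY CLUSTERED (ID),          -- short key that other indexes reference
    CONSTRAINT UQ_DailyUpload UNIQUE (ReadingTime, Code, SubCode)  -- nonclustered by default
);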
I think I'm correct in saying that the unique constraint by default sets up non-clustered index
TRUE
Would a clustered index be better & assuming when the table starts to contain millions of rows this won't cause any issues?
(1) If you need to make (datetime, varchar(12), varchar(6)) unique, and
(2) if your application (or you) will access rows using datetime, or datetime + varchar(12), or datetime + varchar(12) + varchar(6) in the WHERE condition all the time,
then have the primary key on (datetime, varchar(12), varchar(6));
by default it will put uniqueness and a clustered index on all three columns above, as sketched below.
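For instance (table and column names are assumptions carried over from the sketch above):
CREATE TABLE dbo.DailyUploadAlt
(
    ReadingTime datetime    NOT NULL,
    Code        varchar(12) NOT NULL,
    SubCode     varchar(6)  NOT NULL,
    -- remaining columns omitted for brevity
    CONSTRAINT PK_DailyUploadAlt PRIMARY KEY (ReadingTime, Code, SubCode)  -- unique + clustered by default
);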
but as you commented above:
the queries will vary to be honest. I imagine most queries will make use of the first datetime column
and since you will deal with huge data and might join this table with other tables,
then it's better to have a surrogate key (an ever-increasing unique identifier) in the table, and to satisfy your SELECTs,
have NON-CLUSTERED INDEXES (see the sketch after the links below).
Surrogate Key vs Business Key
NON-CLUSTERED INDEX
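For example, with an assumed surrogate-key table dbo.Readings and a datetime column ReadingTime (names invented for illustration):
CREATE TABLE dbo.Readings
(
    ID          bigint IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Readings PRIMARY KEY CLUSTERED,  -- surrogate, ever-increasing key
    ReadingTime datetime    NOT NULL,
    Code        varchar(12) NOT NULL,
    SubCode     varchar(6)  NOT NULL
);

-- non-clustered index to satisfy the datetime-first SELECTs mentioned in the comment:
CREATE NONCLUSTERED INDEX IX_Readings_ReadingTime
    ON dbo.Readings (ReadingTime);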
I have a Table with the following columns
ID (INT Primary Key)
RecordDate (DateTime non-unique)
Name (varchar)
I have partitioned the table based on field RecordDate (Monthly) to different file groups.
Now, how can I add a primary key on ID to this partitioned scheme without combining the key with RecordDate?
The short answer is that your primary key cannot be clustered if you don't want to include your partition column in your primary key as a composite column. So create a clustered index on the partition scheme for RecordDate. Then when you create your primary key constraint, set it to nonclustered.
Please note this can degrade performance and cause memory contention, and is generally not recommended.
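A minimal sketch of that approach; the table name and the partition scheme name ps_Monthly are assumptions:
-- Cluster the table on the partition scheme by RecordDate (non-unique is fine here):
CREATE CLUSTERED INDEX CIX_MyTable_RecordDate
    ON dbo.MyTable (RecordDate)
    ON ps_Monthly (RecordDate);

-- Then add the primary key as a nonclustered, unaligned constraint on ID alone:
ALTER TABLE dbo.MyTable
    ADD CONSTRAINT PK_MyTable PRIMARY KEY NONCLUSTERED (ID) ON [PRIMARY];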
I have some tables that have Guids as the PK. Guids must be the primary key as we will be working disconnected. I have added a INT Identity column that I will be adding the clustered index on for each table. What I want to ask after reading all the questions on Guids as PK -- Do I need to add a non clustered index on the Guid PK as well? Thanks.
Why not just use newsequentialid() rather than newid() and leave the clustered index on the PK?
But to answer your question: the primary key is always backed by an index, so there's no need to create another one on the Guid column.
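For illustration, a minimal sketch of the layout described, with assumed names:
CREATE TABLE dbo.Clients
(
    ClientGuid uniqueidentifier NOT NULL
        CONSTRAINT DF_Clients_Guid DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_Clients PRIMARY KEY NONCLUSTERED,  -- the PK itself is already a unique index
    ClientSeq  int IDENTITY(1,1) NOT NULL,
    Name       varchar(100) NULL
);

-- the INT identity column carries the clustered index instead:
CREATE UNIQUE CLUSTERED INDEX CIX_Clients ON dbo.Clients (ClientSeq);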
I need to alter the length of a column, column_length, in more than 500 tables, and the tables might have anywhere from 10 records to 3 or 4 million records.
The column may just be a normal column
CREATE TABLE test(column_length varchar(10))
The column might have a non-clustered index on it.
CREATE TABLE test(column_length varchar(10))
CREATE UNIQUE NONCLUSTERED INDEX column_length_ind ON test (column_length)
The column might have a clustered PRIMARY KEY on it.
CREATE TABLE test(column_length varchar(10) NOT NULL)
ALTER TABLE test ADD CONSTRAINT pk_test PRIMARY KEY CLUSTERED (column_length)
The column might be a composite primary key
The column might have a foreign key reference
In short the column column_length might be anything.
All I need is to create scripts to alter the length of the column_length from varchar(10) to varchar(50). Should I drop the indexes before altering and then recreate them? What about the primary key and foreign key?
Through my research and testing I figured out that I can just alter the column's length without dropping the primary key or any indexes but have to drop and recreate the foreign key alone.
Is this assumption right?
Yes, you should be able to just modify the columns. In my experience it is faster to leave the index and primary key in place.
Likely you will need to do an ALTER COLUMN on the foreign key tables as well to increase the size. So first you drop the FK constraint, then fix the foreign key fields, then fix the primary key field, then put the constraints back on, as sketched below.
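A minimal sketch of that sequence for one referencing table; the child table and constraint names are assumptions:
-- 1. Drop the foreign key constraint:
ALTER TABLE child DROP CONSTRAINT FK_child_test;

-- 2. Widen the foreign key column, then the primary key column
--    (per the testing described above, the PK and other indexes can stay in place):
ALTER TABLE child ALTER COLUMN column_length varchar(50) NOT NULL;
ALTER TABLE test  ALTER COLUMN column_length varchar(50) NOT NULL;

-- 3. Recreate the constraint:
ALTER TABLE child
    ADD CONSTRAINT FK_child_test FOREIGN KEY (column_length)
    REFERENCES test (column_length);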