I need to update a large number of keys in a large SQL Server 2005 database and will be dropping FKs and PKs on a bunch of tables, doing the update (which replaces the values of the PK/FK) and then adding the FK and PK again.
My questions are:
Will this process have any effect on exsiting indexes that exist on those tables, either
indexes that include the PK/FK fields or indexes on other unaffected fields. ie will all indexes still exists, will they need a rebuild?
Will this process affect table statistics, requiring a recalc?
Many thanks
If you drop a PK (which is usually a clustered index) SQL Server will drop and recreate all non clustered indexes(this is needed because if you have a clustered index the non clustered indexes point to the clustered index). If you don't have a clustered index (a heap) the non clustered indexes point to the data row
rebuilding a clustered index will automatically update statistics, a reorg won't
If you created the keys with cascade update then they should be updated automatically
example
create table pri(id int not null primary key)
go
create table ForeignK(fid int not null)
go
ALTER TABLE dbo.ForeignK ADD CONSTRAINT
FK_ForeignK_pri FOREIGN KEY
(fid) REFERENCES dbo.pri(id) ON UPDATE CASCADE
ON DELETE NO ACTION
insert pri values(1)
insert ForeignK values(1)
now update the PK table
update pri set id = 5
go
this will now be 5 also
select * from ForeignK
Every change in indexed column has a propagation in structure change.
For foreign key you could disable them and then rebuild. For private key only thing you can do is to rebuild them. I think that SQLMenace explained it clearly why.
More
Related
It seems in SQL Server before version 2019, the clustering key/keys goes up to tree structure with not unique non-clustered index. With bigger and multiple clustering key/keys, you gain much more wider and taller tree that costs you more storage size and memory size.
Because of that we used to separate PK from clustered key my questions are
Have SQL Server 2019 and Azure changed in non-clustered indexing or not?
Heaps do not have clustering key/keys at all, what's the way of indexing in heaps?
Have SQL Server 2019 and Azure changed in non-clustered indexing or not
This behavior is older than many people on this site.
Because of that we used to separate PK from clustered
That is an almost-always-unnecessary micro-optimization.
Heaps do not have clustering key/keys at all, what's the way of indexing in heaps
Non-clustered non-unique indexes always have the row locator as index keys. For heaps the row locator is the ROWID (FileNo,PageNo,SlotNo).
If you want move the rows out from the leaf level of your wide PK, it's typically for a very large table. And so moving the rows to a clustered columstore index can be a good option. To do that just drop the clustered PK (this will leave the leaf level as a heap), create the CCI, and then recreate the PK as a nonclustered PK. eg
drop table if exists t
go
create table t(id int not null, a int, b int)
alter table t
add constraint pk_t
primary key clustered(id)
go
alter table t drop constraint pk_t
create clustered columnstore index cci_t on t
alter table t
add constraint pk_t
primary key nonclustered (id)
And if you have other non-clustered indexes drop them first, and only recreate them afterwords if you really need to. IE unique indexes, indexes supporting a foreign key, or indexes need to support specific queries. A CCI typically doesn't need lots of indexes since it's so efficient to scan.
I have a table with millions of records and it is taking a very long time to create the clustered index on its primary key.
What are the factors required in consideration to faster the process to creating PK clustered on it?
I have triggers, views, non-clustered indexes, foreign keys and some auto created statistics & statistics on non-clustered indexes.
Do the following:
Create a new table
Create a clustered index for it
Create the trigger and disable them
Copy all data from the old table to the new table
Finally, do the following:
sp_rename '<Current_old_Table>',<Current_old_Table_BAK>'
sp_rename '<New_Table>',<Current_Table_BAK>'
SQL Server 2012 SP3
I've got a table with a clustered unique index which I want to move to the SECONDARY filegroup.
The standard method is to ALTER TABLE ... DROP CONSTRAINT and then ALTER TABLE ... ADD CONSTRAINT it back again in the new filegroup. However, there's a chain of FKs which I'd have to drop and recreate.
Is there any other way to move the underlying data without lots of drops?
Use CREATE UNIQUE CLUSTERED INDEX...WITH(DROP_EXISTING=ON) to move the primary key index to a different filegroup. This will also avoid the sort.
CREATE UNIQUE CLUSTERED INDEX PK_YourTable ON dbo.YourTable(YourPrimaryKeyColumn)
WITH(DROP_EXISTING=ON)
ON [YourSecondaryFileGroup];
I am trying to convert tables from using guid primary keys / clustered indexes to using int identities. This is for SQL Server 2005. There are two tables MainTable and RelatedTable, and the current table structure is as follows:
MainTable [40 million rows]
IDGuid - uniqueidentifier - PK
-- [data columns]
RelatedTable [400 million rows]
RelatedTableID - uniqueidentifier - PK
MainTableIDGuid - uniqueidentifier [foreign key to MainTable]
SequenceNumber - int - incrementing number per main table entry since there can be multiple entries related to a given row in the main table. These go from 1,2,3... etc for each MainTableIDGuid value.
-- [data columns]
The clustered index for MainTable is currently the primary key (IDGuid). The clustered index for RelatedTable is currently (MainTableIDGuid, SequenceNumber).
I want my conversion is do several things:<
Change MainTable to use an integer ID instead of GUID
Add a MainTableIDInt column to related table that links to Main Table's integer ID
Change the primary key and clustered index of RelatedTable to (MainTableIDInt, SequenceNumber)
Get rid of the guid columns.
I've written a script to do the following:
Add an IDInt int IDENTITY column to MainTable. This does a table rebuild and generates the new identity ID values.
Add a MainTableIDInt int column to RelatedTable.
The next step is to populate the RelatedTable.MainTableIDInt column for each row with its corresponding MainTable.IDInt value [based on the matching guid IDs]. This is the step I'm hung up on. I understand this is not going to be speedy, but I'd like to have it perform as well as possible.
I can write a SQL statement that does this update:
UPDATE RelatedTable
SET RelatedTable.MainTableIDInt = (SELECT MainTable.IDInt FROM MainTable WHERE MainTable.IDGuid = RelatedTable.MainTableIDGuid)
or
UPDATE RelatedTable
SET RelatedTable.MainTableIDInt = MainTable.IDInt
FROM RelatedTable
LEFT OUTER JOIN MainTable ON RelatedTable.MainTableIDGuid = MainTable.IDGuid
The 'Display Estimated Execution Plan' displays roughly the same for both of these queries. The execution plan it spits out does the following:
Clustered index scans over MainTable and RelatedTable and does a Merge Join on them [estimated number of rows = 400 million]
Sorts [estimated number of rows = 400 million]
Clustered index update over RelatedTable [estimated number of rows = 400 million]
I'm concerned about the performance of this [sorting 400 million rows sounds unpleasant]. Are my concerns about performance of these execution plan justified? Is there a better way to update the new ID for my related table that will scale given the size of the tables?
First, this will be a headache. Second, I wouldn't change any of the indexes or constraints until I had the data in place. I.e., I would add the identity column but not make it the primary key nor clustered index. Then I'd add the soon-to-be new foreign keys to the various tables. Your queries should look like:
Update ChildTable
Set NewIntForeignKeyId = P.NewIntPrimaryKey
From ChildTable As C
Join ParentTable As P
On P.PrimaryKey = C.ForeignKey
First, notice that I'm using an inner join. There is no reason to use an outer join for this type of query given that you will eventually enforce referential integrity between the new columns. Second, if you populate the columns first and then rebuild the constraints, it will be faster as you'll be able to leverage the existing indexes. Remember that when you change the clustered index, it rebuilds all of the nonclustered indexes. If the tables are large, that will be a serious hit.
Once you have the data in place, I'd then drop all primary constraints, unique constraints, foreign key constraints and unique indexes. Drop the clustered index/constraint last. I'd then add the clustered indexes to all of the tables and after that was done, recreate the unique constraints, foreign key constraints and indexes. If you do not drop the existing indexes before you recreate the clustered index, it will rebuild the existing indexes twice: once when you drop the clustered index and again when you recreate it.
Btw, I highly doubt there is a way to avoid table scans for this sort of thing since you are going to be updating every row.
I need to alter the length of a column column_length in say more than 500 tables and the tables might have no of records ranging from 10 records to 3 or 4 million records.
The column may just be a normal column
CREATE TABLE test(column_length varchar(10))
The column might contain non-clustered index on it.
CREATE TABLE test(column_length varchar(10))
CREATE UNIQUE NONCLUSTERED INDEX column_length_ind ON test (column_length)
The column might contain PRIMARY KEY clustered index on it
CREATE TABLE test(column_length varchar(10))
ALTER TABLE test ADD PRIMARY KEY CLUSTERED INDEX ON column_length
The column might be a composite primary key
The column might have a foreign key reference
In short the column column_length might be anything.
All I need is to create scripts to alter the length of the column_length from varchar(10) to varchar(50). Should I drop the indexes before altering and then recreate them? What about the primary key and foreign key?
Through my research and testing I figured out that I can just alter the column's length without dropping the primary key or any indexes but have to drop and recreate the foreign key alone.
Is this assumption right?
Yes you should be able to just modify the columns. From my experience it is faster to leave the index and primary key in place.
Likely you will need to do alter column on the foreign key tables as well to increase the size. SO first you drop the fk constraint, then fix the forign kkey fields, then fix the primary key field then put the constraints back on.