I have a huge table which is partitioned by date.
We have 8 partitions all on different file groups, with one of these file groups being PRIMARY.
I would like to replace the PRIMARY file group with a new file group called 'FG_odsvr_misc', and remove PRIMARY from the partition schema.
How would I achieve this without creating a new table with a new partition function?
The partition function and its boundaries are as below -
CREATE PARTITION FUNCTION [fn_odstable1](numeric(9,0))
AS RANGE LEFT FOR VALUES (20151231, 20161231, 20171231, 20181231, 20191231, 20201231, 20211231)
The partition scheme is as below -
CREATE PARTITION SCHEME [sch_odstable1] AS PARTITION [fn_odstable1]
TO ([FG_odsvr_pre_2016], [FG_odsvr_2016], [FG_odsvr_2017], [FG_odsvr_2018], [FG_odsvr_2019], [FG_odsvr_2020], [FG_odsvr_2021], [PRIMARY])
Ok. The partition you have on the PRIMARY filegroup is the so-called "permanent partition".
From Dan Guzman's Table Partitioning Best Practices:
You might not be aware that each partition scheme has a permanent
partition that can never be removed. This is the first partition of a
RANGE RIGHT function and the last partition of a RANGE LEFT one. Be
mindful of this permanent partition when creating a new partition
scheme when multiple filegroups are involved because the filegroup on
which this permanent partition is created is determined when the
partition scheme is created and cannot be removed from the scheme.
. . .
Consider mapping partitions containing data outside the expected range
to a dummy filegroup with no underlying files. This will guarantee
data integrity much like a check constraint because data outside the
allowable range cannot be inserted. If you must accommodate errant
data rather than rejecting it outright, instead map these partitions
to a generalized filegroup like DEFAULT or one designated specifically
for that purpose.
http://www.dbdelta.com/table-partitioning-best-practices/
Since this is a RANGE LEFT partition scheme you can move all the data off of PRIMARY onto a new filegroup by splitting the rightmost partition at a boundary point greater than the greatest value present in your table.
ALTER PARTITION SCHEME sch_odstable1 NEXT USED [FG_odsvr_2022];
ALTER PARTITION FUNCTION fn_odstable1() SPLIT RANGE (20221231);
The rightmost partition will still be on PRIMARY though. You'll just need to create your future partitions before you need them to keep that partition empty. If you want to, you can create a new partition scheme:
ALTER DATABASE CURRENT ADD FILEGROUP no_files_cant_be_used;
CREATE PARTITION SCHEME [sch_odstable2] AS PARTITION [fn_odstable1]
TO ([FG_odsvr_pre_2016], [FG_odsvr_2016], [FG_odsvr_2017], [FG_odsvr_2018], [FG_odsvr_2019], [FG_odsvr_2020], [FG_odsvr_2021], [FG_odsvr_2022], no_files_cant_be_used)
And then create a matching table on the new scheme, ALTER TABLE SWITCH to move all the partitions to the new table, and then rename the tables.
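A rough sketch of that last step, assuming the original table is dbo.odstable1 and dbo.odstable1_new is an identically structured table created on sch_odstable2 (both table names are placeholders). SWITCH requires the source and target partitions to reside on the same filegroup, which is why sch_odstable2 maps partitions 1 through 8 to the same filegroups as the original scheme:
-- Move each populated partition to the matching partition of the new table.
-- SWITCH is a metadata-only operation, so this is fast even for a huge table.
ALTER TABLE dbo.odstable1 SWITCH PARTITION 1 TO dbo.odstable1_new PARTITION 1;
ALTER TABLE dbo.odstable1 SWITCH PARTITION 2 TO dbo.odstable1_new PARTITION 2;
-- ...repeat for the remaining populated partitions (3 through 8)...

-- Swap the names so existing code keeps referencing the original table name.
EXEC sp_rename 'dbo.odstable1', 'odstable1_old';
EXEC sp_rename 'dbo.odstable1_new', 'odstable1';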
Related
We're using SQL Server 2019. Our fact tables utilize datetime2 but I want to partition on year.
I don't have sysadmin privileges, so I can't set up different filegroups. I can create partition functions and partition schemes, but it isn't clear to me how to set up the partition scheme so that when I partition a table such as ActivityLog, for example, its rows are stored in their respective year partitions.
I've searched the web and haven't found answers as to how it all works.
Partitioning by year on a datetime2 column in a fact table can be a useful technique for managing large data sets, improving query performance, and reducing maintenance costs. Here are the steps to set up partitioning by year:
Define a partition function: A partition function defines the boundary values used to partition the data. In this case, you would define a partition function that partitions the data by year. For example:
CREATE PARTITION FUNCTION pfFactTableByYear (datetime2(0))
AS RANGE RIGHT FOR VALUES
(
    '2010-01-01T00:00:00', '2011-01-01T00:00:00', '2012-01-01T00:00:00',
    '2013-01-01T00:00:00', '2014-01-01T00:00:00', '2015-01-01T00:00:00',
    '2016-01-01T00:00:00', '2017-01-01T00:00:00', '2018-01-01T00:00:00',
    '2019-01-01T00:00:00', '2020-01-01T00:00:00'
)
Define a partition scheme: A partition scheme maps the partitions defined by the function to a set of filegroups. For example, the following code maps each yearly partition to its own filegroup:
CREATE PARTITION SCHEME psFactTableByYear
AS PARTITION pfFactTableByYear
TO (fg2010, fg2011, fg2012, fg2013, fg2014, fg2015, fg2016, fg2017, fg2018, fg2019, fg2020)
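Since the question mentions that creating filegroups isn't an option without sysadmin rights, an alternative to the scheme above is to map every partition to the PRIMARY filegroup with ALL TO; you still get partition elimination and per-partition maintenance, just not separate physical placement. A minimal variant (use this instead of the multi-filegroup scheme, not in addition to it):
-- Map all partitions of the function to PRIMARY; no extra filegroups needed.
CREATE PARTITION SCHEME psFactTableByYear
AS PARTITION pfFactTableByYear
ALL TO ([PRIMARY]);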
Create the fact table with partitioning: You would create the fact table
with the partition scheme defined in step 2. For example, the following code
creates a fact table with partitioning by year:
CREATE TABLE FactTable
(
    Id          INT IDENTITY(1,1),
    DateColumn  datetime2(0) NOT NULL,
    ValueColumn decimal(18,2) NOT NULL,
    CONSTRAINT PK_FactTable PRIMARY KEY (Id, DateColumn)
)
ON psFactTableByYear (DateColumn)
This creates a fact table with a primary key that includes the partitioning column (DateColumn), and maps the partition scheme to the fact table's data filegroups.
Load data into the fact table: Once the fact table is created, you can load
data into it using standard INSERT statements.
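To confirm that rows land in the expected year partition, you can ask for the partition number with the $PARTITION function; a small illustration with made-up values:
-- Insert a couple of sample rows.
INSERT INTO FactTable (DateColumn, ValueColumn)
VALUES ('2015-06-15T10:30:00', 100.00),
       ('2019-03-01T08:00:00', 250.50);

-- Show which partition each row was mapped to.
SELECT Id,
       DateColumn,
       $PARTITION.pfFactTableByYear(DateColumn) AS PartitionNumber
FROM FactTable;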
Perform maintenance tasks: As time goes on, new partitions will need to be created to accommodate new data. You can automate this by running a maintenance script that splits new partition ranges on a regular basis (a sketch follows below). You may also want to periodically archive or remove old data to keep the data set manageable.
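A minimal sketch of adding a partition for the next year, assuming a filegroup named fg2021 has already been added (or use PRIMARY if everything lives on one filegroup); splitting is a cheap metadata operation as long as the partition being split is still empty, so run this before any 2021 data arrives:
-- Tell the scheme which filegroup the next new partition should use...
ALTER PARTITION SCHEME psFactTableByYear NEXT USED fg2021;
-- ...then split off a new, still-empty partition for 2021.
ALTER PARTITION FUNCTION pfFactTableByYear() SPLIT RANGE ('2021-01-01T00:00:00');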
Note that partitioning by year is just one option for partitioning a fact table, and the partition function and scheme would need to be adjusted accordingly for other partitioning strategies, such as partitioning by month, quarter, or some other time period.
I have a table that is partitioned weekly, with a partition function and scheme defined. The important thing is that this table has a clustered columnstore index on the same weekly partition scheme.
So now I have to add a few more ranges to the partition function and scheme, which is failing with an error saying "cannot alter partition function which is having non empty partition ........." even though the data file is only 4 KB with no data loaded.
From a post about SQL Server 2014, I learned that we need to disable the clustered index, alter the partition scheme, and enable it again.
Please help in solving this issue. I'm using SQL Server 2016 Enterprise Edition. Thanks in advance.
For a columnstore index you need to empty the partition that is going to be split. That can be done by:
moving the data to another partition (by updating its partition key)
altering the partition scheme (with the NEXT USED clause) and the partition function (with the SPLIT RANGE clause)
moving the data back to the correct partition.
The above can be done in one transaction; a rough sketch follows below.
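Here is a rough sketch of those steps, using a staging table to move the rows out and back rather than updating the partition key in place (all object names, columns, and boundary dates are placeholders):
BEGIN TRANSACTION;

-- 1. Empty the partition that is about to be split by moving its rows aside.
SELECT *
INTO #staging
FROM dbo.WeeklyFacts
WHERE PartDate >= '2023-01-02' AND PartDate < '2023-01-09';

DELETE FROM dbo.WeeklyFacts
WHERE PartDate >= '2023-01-02' AND PartDate < '2023-01-09';

-- 2. Split the now-empty partition.
ALTER PARTITION SCHEME ps_WeeklyFacts NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pf_WeeklyFacts() SPLIT RANGE ('2023-01-09');

-- 3. Move the rows back; they now fall into the correct new partition.
INSERT INTO dbo.WeeklyFacts
SELECT * FROM #staging;

DROP TABLE #staging;
COMMIT TRANSACTION;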
For the future (assuming the data is partitioned by date periods), it's recommended to keep a few empty partitions, so a maintenance task/job can split the partitions (and create a few new ones for future periods) without any issues.
Alternatively, you can use ALTER TABLE with the SWITCH PARTITION clause, but that approach is less efficient. SWITCH PARTITION is mostly used to quickly delete old partitions.
I have a huge table with around 110 partitions. I wish to archive the oldest partition and drop the FileGroup. Following is the strategy I adopted.
Created an exact empty copy of the table, tablename_archive, meeting all partitioning requirements.
Perform Partition switch
ALTER TABLE tablename SWITCH PARTITION 1 TO tablename_archive PARTITION 1
After verifying the switch (partition swap), I dropped the archived table.
Merged the Partition function using the first boundary value as follows
ALTER PARTITION FUNCTION YMDatePF2 () MERGE RANGE ('2012-01-01 00:00:00.000')
Although there is no data on the filegroup now, when I try to drop the file or filegroup it errors out saying:
The file 'XXXXXXXX' cannot be removed because it is not empty.
The filegroup 'XXXXXXXX' cannot be removed because it is not empty.
Is there any change I need to make to the partition scheme too, after merging the function?
Please let me know if you need any more details.
You can never remove the first (or only) partition from a RANGE RIGHT partition function (or conversely, the last (or only) partition of a RANGE LEFT function). The first (or last if RANGE LEFT) filegroup from the underlying partition schemes can never be removed from the schemes either. Remember you have one more partition, and partition scheme filegroup mapping, than partition boundaries.
If your intent was to archive January 2012 data, you should have switched partition 2 rather than 1 because the first partition contained data less than '2012-01-01 00:00:00.000'. Now that the second partition has been merged, the first partition (and the first filegroup) contains data less than '2012-02-01T00:00:00.000', which includes January 2012 data.
With a RANGE RIGHT sliding window, it is best to plan to keep the first filegroup empty. You could use the PRIMARY filegroup or a dummy one with no files for that purpose. See Table Partitioning Best Practices.
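Before dropping a file or filegroup, it can help to check which partitions are still mapped to it and how many rows they hold. A query along these lines shows the partition-to-filegroup mapping for a partitioned table (the table name is a placeholder):
SELECT ps.name AS partition_scheme,
       p.partition_number,
       fg.name AS filegroup_name,
       p.rows
FROM sys.indexes AS i
JOIN sys.partition_schemes AS ps ON ps.data_space_id = i.data_space_id
JOIN sys.destination_data_spaces AS dds ON dds.partition_scheme_id = ps.data_space_id
JOIN sys.filegroups AS fg ON fg.data_space_id = dds.data_space_id
JOIN sys.partitions AS p ON p.object_id = i.object_id
                        AND p.index_id = i.index_id
                        AND p.partition_number = dds.destination_id
WHERE i.object_id = OBJECT_ID('dbo.YourPartitionedTable')  -- placeholder table name
  AND i.index_id IN (0, 1)                                 -- heap or clustered index
ORDER BY p.partition_number;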
I want to create a sample database using composite partitioning. I know about range partitioning and list partitioning, but I don't have enough knowledge about hash values or how to create a hash partition in my database. So I have decided to make a sample database using composite partitioning, combining range and hash partitioning. Can anybody describe it in simple words so I can understand hash partitioning as well as composite partitioning?
I have also read some documents on the internet, but I could not understand how to create a hash partition or a composite partition in my database. Actually, I don't have enough knowledge about hash values and hash functions. I have read about them but could not understand very well. I need a simple definition.
Definition of Horizontal Partition & Vertical Partition
Partition (database)
Hash Functions
The composite partitioning feature is not available in SQL Server 2008; only range partitioning is available in SQL Server.
Although the partitioning column must be a single column, it does not need to be numeric and it can be calculated so that the range can include multiple columns.
For instance it is common to partition datetime data by month. This works well because that data is usually in a single column, but what do you do if you have data for multiple companies and you also want to partition by company? For this you could use a computed column as the partitioning column. The example below creates a computed column from the company id and the order month, which is then used for the partitions; it partitions three companies for the first three months of 2007.
Note that the computed column must be PERSISTED to be used as the partitioning column.
CREATE PARTITION FUNCTION MyPartitionRange (INT)
AS RANGE LEFT FOR VALUES (1200701, 1200702, 1200703, 2200701, 2200702, 2200703, 3200701, 3200702, 3200703)
CREATE PARTITION SCHEME MyPartitionScheme AS PARTITION MyPartitionRange ALL TO ([PRIMARY])
CREATE TABLE CompanyOrders
(
    Company_id INT,
    OrderDate  datetime,
    Item_id    INT,
    Quantity   INT,
    OrderValue decimal(19,5),
    -- e.g. company 2, January 2007 -> 2 * 1000000 + 200701 = 2200701
    PartCol AS Company_id * 1000000 + CONVERT(INT, CONVERT(VARCHAR(6), OrderDate, 112)) PERSISTED
) ON MyPartitionScheme (PartCol)
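To see the composite key in action, you can insert a row and ask which partition it falls into with $PARTITION; a small illustration with made-up values:
-- Company 2, January 2007 -> PartCol = 2 * 1000000 + 200701 = 2200701
INSERT INTO CompanyOrders (Company_id, OrderDate, Item_id, Quantity, OrderValue)
VALUES (2, '20070115', 42, 10, 99.95);

SELECT Company_id, OrderDate, PartCol,
       $PARTITION.MyPartitionRange(PartCol) AS PartitionNumber
FROM CompanyOrders;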
I have a very large table (800GB) which has a DATETIME field which is part of a partition schema. This field is named tran_date. The problem I'm having is that the indexes are not properly aligned with the partition and I can't include the tran_date field in the PRIMARY KEY because it's set to nullable.
I can drop all foreign key relationships, statistics, and indexes, but I can't modify the column because the partition schema is still dependent on the tran_date column.
In my research I've located one way to move the table off of the partition scheme: drop the clustered index and then re-create it on the PRIMARY filegroup, which will then allow me to modify the column. But that operation takes several hours to drop the index and 13 hours to write the temporary CLUSTERED INDEX on PRIMARY, and then I have to drop that, alter the table, and re-create the CLUSTERED INDEX properly, which takes another 13 hours. Additionally, I have more than one table.
When I tested this deployment in my development environment with a similarly sized data set it took several days to complete, so I'm trying to look for ways to chop down this time.
If I can move the table off the partition without having to write a CLUSTERED INDEX on PRIMARY it would significantly reduce the time required to alter the column.
No matter what, you are going to end up moving data from "point A" (stored in table partitions within the database) to "point B" (not stored within table partitions within the database). The goal is to minimize the number of times you have to work through all that data. The simplest way to do this might be:
Create a new non-partitioned table
Copy the data over to that table
Drop the original table
Rename the new table to the proper name
One problem to deal with is the clustered index. You could either create the new table without the clustered index, copy the data over, and then reindex (extra time and pain), or you could create the table with the clustered index, and copy the data over “in order” (say, low Ids to high). This would be slower than copying it over to a non-clustered table, but it might be faster overall since you wouldn’t then have to build the clustered index.
Of course there's the problem of "what if users change the data while you're copying it"... but table partitioning implies warehousing, so I'm guessing you don't have to worry about that.
One last point: when copying gobs of data, it is best to break the insert into several batches, so as not to bloat the transaction log; a rough sketch follows below.
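Here is a rough sketch of that batching approach, assuming the source table has an ever-increasing integer Id as its clustered key and the target table dbo.NewTable already exists (the names and batch size are placeholders):
DECLARE @BatchSize INT = 500000;
DECLARE @MaxId     INT = (SELECT MAX(Id) FROM dbo.OldTable);
DECLARE @StartId   INT = 0;

WHILE @StartId <= @MaxId
BEGIN
    -- Copy one Id range at a time so each batch commits separately
    -- and the log space can be reused between batches.
    INSERT INTO dbo.NewTable WITH (TABLOCK)
    SELECT *
    FROM dbo.OldTable
    WHERE Id > @StartId AND Id <= @StartId + @BatchSize;

    SET @StartId = @StartId + @BatchSize;
END;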