My SQL Server database includes some tables partitioned by month. The partition scheme and function are set to the 20191201 right limit. My partition scheme uses a separate file group for each partition. I now need to extend these before the end of the year (the last partition key on the right is N'20191231' and the last file group is FG_2_201912).
Question #1: do I need to repeat ALTER PARTITION SCHEME [PartitionByPeriodScheme] NEXT USED [FG_2_202001]; for each file group until [FG_2_202012]? I can certainly write a script that generates the commands dynamically, but is there any way to add all file groups with one command?
Question #2: do I need to repeat ALTER PARTITION FUNCTION [PartitionByPeriodFunction]() SPLIT RANGE 20200131 for each partition key value until 20201231? Do I really need to split the range, since there is no data in the last right partition yet? Are there any alternatives?
We're using SQL Server 2019. Our fact tables utilize datetime2 but I want to partition on year.
I don't have sysadmin privs, so I can't set up different filegroups. I can create partition functions and partition schemes, but it isn't clear to me how to set up the partition scheme so that, when I partition a table such as ActivityLog, it stores entries in their respective year partition.
I've searched the web and haven't found answers as to how it all works.
Partitioning by year on a datetime2 column in a fact table can be a useful technique for managing large data sets, improving query performance, and reducing maintenance costs. Here are the steps to set up partitioning by year:
Define a partition function: A partition function defines the boundary
values used to partition the data. The following code creates a partition
function that partitions the data by year:
CREATE PARTITION FUNCTION pfFactTableByYear (datetime2(0))
AS RANGE RIGHT FOR VALUES
('2010-01-01T00:00:00', '2011-01-01T00:00:00', '2012-01-01T00:00:00', '2013-01-01T00:00:00', '2014-01-01T00:00:00', '2015-01-01T00:00:00', '2016-01-01T00:00:00', '2017-01-01T00:00:00', '2018-01-01T00:00:00', '2019-01-01T00:00:00', '2020-01-01T00:00:00')
Define a partition scheme: A partition scheme maps the partitions of a
partition function to filegroups. The 11 boundary values above define 12
partitions (with RANGE RIGHT, everything before 2010 falls into the first
one), so the scheme must list 12 filegroups. The following code maps each
partition to its own filegroup; fgPre2010 is a placeholder name for the
filegroup that holds the pre-2010 partition. If you cannot create
filegroups, ALL TO ([PRIMARY]) maps every partition to one filegroup instead:
CREATE PARTITION SCHEME psFactTableByYear
AS PARTITION pfFactTableByYear
TO (fgPre2010, fg2010, fg2011, fg2012, fg2013, fg2014, fg2015, fg2016, fg2017, fg2018, fg2019, fg2020);
Create the fact table with partitioning: Create the fact table on the
partition scheme from the previous step. The following code creates a fact
table partitioned by year:
CREATE TABLE FactTable
(
Id INT IDENTITY(1,1),
DateColumn datetime2(0) NOT NULL,
ValueColumn decimal(18,2) NOT NULL,
CONSTRAINT PK_FactTable PRIMARY KEY (Id, DateColumn)
)
ON psFactTableByYear(DateColumn)
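This creates a fact table whose primary key includes the partitioning column (DateColumn) and places the table on the partition scheme, so each row is stored in the partition that matches its DateColumn value. The partitioning column has to be part of the primary key because a unique index can only be aligned with the partition scheme if it includes that column.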
Load data into the fact table: Once the fact table is created, you can load
data into it using standard INSERT statements.
Perform maintenance tasks: As time goes on, new partitions will need to be
created to accommodate new data. You can automate this with a scheduled
maintenance script that runs ALTER PARTITION SCHEME ... NEXT USED followed
by ALTER PARTITION FUNCTION ... SPLIT RANGE ahead of each new year. You may
also want to periodically archive or remove old data, for example with
partition switching, to keep the data set manageable.
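As a rough sketch of that maintenance step, assuming the function and scheme names used above and a hypothetical filegroup fg2021 that already exists (substitute [PRIMARY] if you cannot create filegroups), the partition for the next year could be added like this. Running it while the last partition is still empty avoids data movement.

```sql
-- Sketch only: fg2021 is an assumed filegroup name, not part of the original setup.

-- 1. Tell the scheme which filegroup the next new partition should use.
ALTER PARTITION SCHEME psFactTableByYear
NEXT USED fg2021;

-- 2. Add the boundary. With RANGE RIGHT, rows >= 2021-01-01 form the new
--    partition on fg2021; rows for 2020 stay where they are.
ALTER PARTITION FUNCTION pfFactTableByYear()
SPLIT RANGE ('2021-01-01T00:00:00');
```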
Note that partitioning by year is just one option for partitioning a fact table, and the partition function and scheme would need to be adjusted accordingly for other partitioning strategies, such as partitioning by month, quarter, or some other time period.
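If separate filegroups are not an option, which is the situation described in the question, a scheme can map every partition to a single filegroup. The following is a minimal sketch under that assumption; psFactTableByYearSingleFG is a hypothetical name, and the partition numbers in the comments hold for the 11 boundaries defined above. The $PARTITION lookups show which partition a given value would land in, without needing a table.

```sql
-- Hypothetical scheme name; all 12 partitions live on PRIMARY.
CREATE PARTITION SCHEME psFactTableByYearSingleFG
AS PARTITION pfFactTableByYear
ALL TO ([PRIMARY]);

-- Ask the partition function where sample values would be stored.
SELECT
    $PARTITION.pfFactTableByYear(CAST('2009-12-31T23:59:59' AS datetime2(0))) AS before_2010, -- partition 1
    $PARTITION.pfFactTableByYear(CAST('2015-06-15T00:00:00' AS datetime2(0))) AS in_2015,     -- partition 7
    $PARTITION.pfFactTableByYear(CAST('2020-01-01T00:00:00' AS datetime2(0))) AS from_2020;   -- partition 12
```

A table created ON psFactTableByYearSingleFG(DateColumn) is still partitioned by year; only the physical placement is shared.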
What steps to take to add additional partitions to the end of an already partitioned table in SQL Server?
Conditions:
The Partition Function is Right Range.
The table is considered a VLTB (very large table).
No DB downtime is acceptable (<10min).
Also, How to verify the partitions and rows are correctly mapped?
Addressing your questions in turn:
What steps to take to add additional partitions to the end of an already partitioned table in SQL Server?
Partitioned tables are built on partition schemes, which themselves are built on partition functions. Partition functions explicitly specify partition boundaries, which implicitly define the partitions. To add a new partition to the table, you need to alter the partition function to add a new partition boundary. The syntax for that is ALTER PARTITION FUNCTION ... SPLIT RANGE. For example, let's say that you have an existing partition function on a datetime data type that defines monthly partitions.
CREATE PARTITION FUNCTION PF_Monthly(datetime)
AS RANGE RIGHT FOR VALUES (
'2022-10-01',
'2022-11-01',
'2022-12-01',
'2023-01-01'
);
Pausing there, consider the last two partitions in the current setup. The next-to-last partition is defined as 2022-12-01 <= x < 2023-01-01 while the last partition is defined as 2023-01-01 <= x. Which is to say that the next-to-last partition is bounded to the month of December 2022, while the last partition is unbounded on the high side and includes data for January 2023 but also anything larger.
If you want to bound the last partition to just January 2023, you'll add a partition boundary to the function for the high side of that partition. There's a small catch in that you'll also need to alter the partition scheme to tell SQL Server which filegroup the new partition will use, but that's a small thing.
ALTER PARTITION SCHEME PS_Monthly
NEXT USED someFileGroup;
ALTER PARTITION FUNCTION PF_Monthly()
SPLIT RANGE ('2023-02-01');
At this point, what used to be your highest partition is now defined as 2023-01-01 <= x < 2023-02-01 and the highest partition is defined as 2023-02-01 <= x. I should note that adding a boundary to a partition function will affect all tables that use it. When I was using table partitioning at a previous job, I had a rule to have only one table using a given partition function (even if they were logically equivalent).
No DB downtime is acceptable (<10min)
The above exposition doesn't mention one important point: if there is data on either side of the new boundary, a new B-tree is going to be built for it (which is a size-of-data operation). There's more on that in the documentation. To keep that at a minimum, I like to keep two empty partitions at the end of the scheme. Using my above example, that would mean that I'd have added the January partition boundary in November. By doing it this way, you have some leeway in when the actual partition split happens (i.e. if it's a bit late, you're not accidentally incurring data movement). I'd also put in monitoring that's something along the lines of "if the highest partition boundary is less than 45 days away, alert". A slightly more sophisticated but more correct alert would be "if there is data in the second-to-last partition, send an alert".
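The simpler of the two alerts could be sketched as the query below. It assumes the PF_Monthly function from the example above and a datetime partitioning type; how the result gets turned into an actual alert (an Agent job, a monitoring tool) is left open.

```sql
-- Returns one row when the highest boundary of PF_Monthly is less than
-- 45 days in the future, i.e. when it is time to split in a new partition.
SELECT pf.name AS partition_function,
       MAX(CAST(prv.value AS datetime)) AS highest_boundary
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv
    ON prv.function_id = pf.function_id
WHERE pf.name = N'PF_Monthly'
GROUP BY pf.name
HAVING MAX(CAST(prv.value AS datetime)) < DATEADD(DAY, 45, GETDATE());
```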
Also, How to verify the partitions and rows are correctly mapped?
You can query the DMVs for this. I like using the script in this blog post. There's also the $PARTITION function, called as $PARTITION.partition_function_name(value), if you want to see which partition specific rows in your table belong to.
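This is not the script from the blog post, just a minimal sketch of the same idea using the catalog views. The table dbo.MyPartitionedTable and its partitioning column EventDate are hypothetical names; PF_Monthly is the function from the example above.

```sql
-- One row per partition: filegroup, lower boundary (RANGE RIGHT) and row count.
SELECT p.partition_number,
       fg.name   AS filegroup_name,
       prv.value AS lower_boundary,   -- NULL for the first partition
       p.rows    AS row_count
FROM sys.partitions AS p
JOIN sys.indexes AS i
    ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.partition_schemes AS ps
    ON ps.data_space_id = i.data_space_id
JOIN sys.destination_data_spaces AS dds
    ON dds.partition_scheme_id = ps.data_space_id
   AND dds.destination_id = p.partition_number
JOIN sys.filegroups AS fg
    ON fg.data_space_id = dds.data_space_id
LEFT JOIN sys.partition_range_values AS prv
    ON prv.function_id = ps.function_id
   AND prv.boundary_id = p.partition_number - 1
WHERE p.object_id = OBJECT_ID(N'dbo.MyPartitionedTable')
  AND i.index_id IN (0, 1)            -- heap or clustered index only
ORDER BY p.partition_number;

-- Cross-check against the data itself (this scans the table, so on a very
-- large table consider restricting it to a date range).
SELECT $PARTITION.PF_Monthly(EventDate) AS partition_number,
       MIN(EventDate) AS min_value,
       MAX(EventDate) AS max_value,
       COUNT(*)       AS row_count
FROM dbo.MyPartitionedTable
GROUP BY $PARTITION.PF_Monthly(EventDate)
ORDER BY partition_number;
```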
I have a huge table which is partitioned by date.
We have 8 partitions all on different file groups, with one of these file groups being PRIMARY.
I would like to replace the PRIMARY file group with a new file group called 'FG_odsvr_misc', and remove PRIMARY from the partition scheme.
How would I achieve this without creating a new table with a new partition function?
The partition function, which defines the boundaries, is as below -
CREATE PARTITION FUNCTION [fn_odstable1](numeric(9,0))
AS RANGE LEFT FOR VALUES (20151231, 20161231, 20171231, 20181231, 20191231, 20201231, 20211231)
The partition scheme is as below -
CREATE PARTITION SCHEME [sch_odstable1] AS PARTITION [fn_odstable1]
TO ([FG_odsvr_pre_2016], [FG_odsvr_2016], [FG_odsvr_2017], [FG_odsvr_2018], [FG_odsvr_2019], [FG_odsvr_2020], [FG_odsvr_2021], [PRIMARY])
Ok. The partition you have on the PRIMARY filegroup is the so-called "Permanent Partition".
From Dan Guzman's Table Partitioning Best Practices:
You might not be aware that each partition scheme has a permanent
partition that can never be removed. This is the first partition of a
RANGE RIGHT function and the last partition of a RANGE LEFT one. Be
mindful of this permanent partition when creating a new partition
scheme when multiple filegroups are involved because the filegroup on
which this permanent partition is created is determined when the
partition scheme is created and cannot be removed from the scheme.
. . .
Consider mapping partitions containing data outside the expected range
to a dummy filegroup with no underlying files. This will guarantee
data integrity much like a check constraint because data outside the
allowable range cannot be inserted. If you must accommodate errant
data rather than rejecting it outright, instead map these partitions
to a generalized filegroup like DEFAULT or one designated specifically
for that purpose.
http://www.dbdelta.com/table-partitioning-best-practices/
Since this is a RANGE LEFT partition scheme you can move all the data off of PRIMARY onto a new filegroup by splitting the rightmost partition at a boundary point greater than the greatest value present in your table.
ALTER PARTITION SCHEME sch_odstable1 NEXT USED [FG_odsvr_2022];
ALTER PARTITION FUNCTION fn_odstable1() SPLIT RANGE (20221231);
The rightmost partition will still be on PRIMARY though. You'll just need to create your future partitions before you need them to keep that partition empty. If you want to, you can create a new partition scheme whose last partition sits on a filegroup with no files:
ALTER DATABASE CURRENT ADD FILEGROUP no_files_cant_be_used;
CREATE PARTITION SCHEME [sch_odstable2] AS PARTITION [fn_odstable1]
TO ([FG_odsvr_pre_2016], [FG_odsvr_2016], [FG_odsvr_2017], [FG_odsvr_2018], [FG_odsvr_2019], [FG_odsvr_2020], [FG_odsvr_2021], [FG_odsvr_2022], no_files_cant_be_used);
And then create a matching table on the new scheme, ALTER TABLE SWITCH to move all the partitions to the new table, and then rename the tables.
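A rough sketch of that last step follows. The table names dbo.odstable1 (on sch_odstable1) and dbo.odstable1_new (an identically structured table on sch_odstable2) are assumptions, since the original table name is not given. It also assumes the split above has been done and that the rightmost partition (partition 9) is empty, because its filegroup differs between the two schemes and so it cannot be switched.

```sql
-- Partitions 1 to 8 sit on the same filegroups in both schemes, so each
-- switch is a metadata-only operation.
DECLARE @p int = 1;
WHILE @p <= 8
BEGIN
    ALTER TABLE dbo.odstable1
        SWITCH PARTITION @p TO dbo.odstable1_new PARTITION @p;
    SET @p += 1;
END;

-- Swap the names so the new table takes over.
BEGIN TRANSACTION;
EXEC sp_rename N'dbo.odstable1', N'odstable1_old';
EXEC sp_rename N'dbo.odstable1_new', N'odstable1';
COMMIT TRANSACTION;
```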
I have a table that is partitioned weekly, with a partition function and scheme defined. Most importantly, the table has a clustered columnstore index on the same weekly partition scheme.
I now have to add a few more ranges to the partition function and scheme. This fails with an error saying "cannot alter partition function which is having non empty partition ......... ", even though the data file is only 4KB with no data loaded.
From a 2014 post about SSMS, I learned that we need to disable the clustered index, alter the partition scheme, and then enable the index again.
Please help in solving this issue. I'm using SQL Server 2016 Enterprise Edition. Thanks in advance.
For a columnstore index, you need to empty the partition that is going to be split. That can be done by:
moving the data to another partition (by updating its partition key)
altering the partition scheme (with the NEXT USED clause) and the partition function (with the SPLIT RANGE clause)
moving the data back to the correct partition.
The above can be done in one transaction.
For the future, (assuming the data is partitioned by date periods) it's recommended to have a few empty partitions, so a maintenance task/job can automatically split the partitions (and create a few new partitions for future periods) without any issues.
Alternatively you can use ALTER TABLE with SWITCH PARTITION clause, but that approach is less efficient. SWITCH PARTITION is mostly used to quickly delete the old partitions.
I have a huge table with around 110 partitions. I wish to archive the oldest partition and drop the FileGroup. Following is the strategy I adopted.
Created an identical empty table, tablename_archive, that meets all the partition-switching requirements.
Performed the partition switch:
ALTER TABLE tablename SWITCH PARTITION 1 TO tablename_archive PARTITION 1
After verifying the switch (partition swap), I dropped the archive table.
Merged the partition function using the first boundary value as follows:
ALTER PARTITION FUNCTION YMDatePF2 () MERGE RANGE ('2012-01-01 00:00:00.000')
Although there is now no data on the filegroup, when I try to drop the file or the filegroup it errors out, saying:
The file 'XXXXXXXX' cannot be removed because it is not empty.
The filegroup 'XXXXXXXX' cannot be removed because it is not empty.
Do I need to change the partition scheme too, after merging the function?
Please let me know if you need any more details.
You can never remove the first (or only) partition from a RANGE RIGHT partition function (or conversely, the last (or only) partition of a RANGE LEFT function). The first (or last if RANGE LEFT) filegroup from the underlying partition schemes can never be removed from the schemes either. Remember you have one more partition, and partition scheme filegroup mapping, than partition boundaries.
If your intent was to archive January 2012 data, you should have switched partition 2 rather than 1 because the first partition contained data less than '2012-01-01 00:00:00.000'. Now that the second partition has been merged, the first partition (and the first filegroup) contains data less than '2012-02-01T00:00:00.000', which includes January 2012 data.
With a RANGE RIGHT sliding window, it is best to plan to keep the first filegroup empty. You could use the PRIMARY filegroup or a dummy one with no files for that purpose. See Table Partitioning Best Practices.
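For illustration, the archive step described above might have looked like the sketch below. It assumes the state before the merge in the question was run: the first boundary is '2012-01-01 00:00:00.000', partition 1 (everything earlier) is empty, the function's parameter type is datetime, and tablename_archive is partitioned identically.

```sql
-- Confirm which partition holds January 2012 (expected: 2 under the
-- assumptions above).
SELECT $PARTITION.YMDatePF2(CAST('2012-01-15' AS datetime)) AS january_2012_partition;

-- Switch out the January 2012 data, not the empty first partition.
ALTER TABLE tablename
    SWITCH PARTITION 2 TO tablename_archive PARTITION 2;

-- Merging the boundary now removes the (empty) partition 2. With RANGE RIGHT
-- its filegroup is dropped from the scheme, unless another partition still
-- uses it, while partition 1 stays on the first filegroup.
ALTER PARTITION FUNCTION YMDatePF2()
    MERGE RANGE ('2012-01-01 00:00:00.000');
```

Once the archive table is dropped, the filegroup that held partition 2 is the one that can be emptied and removed.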