Optimization for Date Correlation doesn’t change plan - sql-server

I have a reporting requirement on the following tables. I created a new database with these tables and imported data from the live database for reporting purposes.
The report parameter is a date range. I read the references below and found that DATE_CORRELATION_OPTIMIZATION can make such a query faster by using a seek instead of a scan. I made the required settings, but the query still uses the same plan and takes the same execution time. What additional changes are needed for the query to use the date correlation?
Note: I am using SQL Server 2005
REFERENCES
Optimizing Queries That Access Correlated datetime Columns
The Query Optimizer: Date Correlation Optimisation
SQL
--Database change made for date correlation
ALTER DATABASE BISourcingTest
SET DATE_CORRELATION_OPTIMIZATION ON;
GO
--Settings made
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT ON
SET CONCAT_NULL_YIELDS_NULL ON
SET QUOTED_IDENTIFIER ON
SET NUMERIC_ROUNDABORT OFF
GO
--Test Setting
IF ( (sessionproperty('ANSI_NULLS') = 1) AND
(sessionproperty('ANSI_PADDING') = 1) AND
(sessionproperty('ANSI_WARNINGS') = 1) AND
(sessionproperty('ARITHABORT') = 1) AND
(sessionproperty('CONCAT_NULL_YIELDS_NULL') = 1) AND
(sessionproperty('QUOTED_IDENTIFIER') = 1) AND
(sessionproperty('NUMERIC_ROUNDABORT') = 0)
)
PRINT 'Everything is set'
ELSE
PRINT 'Different Setting'
--Query
SELECT C.ContainerID, C.CreatedOnDate,OLIC.OrderID
FROM ContainersTest C
INNER JOIN OrderLineItemContainers OLIC
ON OLIC.ContainerID = C.ContainerID
WHERE C.CreatedOnDate > '1/1/2015'
AND C.CreatedOnDate < '2/01/2015'
TABLES
CREATE TABLE [dbo].[ContainersTest](
[ContainerID] [varchar](20) NOT NULL,
[Weight] [decimal](9, 2) NOT NULL DEFAULT ((0)),
[CreatedOnDate] [datetime] NOT NULL DEFAULT (getdate()),
CONSTRAINT [XPKContainersTest] PRIMARY KEY CLUSTERED
(
[CreatedOnDate] ASC,
[ContainerID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[OrderLineItemContainers](
[OrderID] [int] NOT NULL,
[LineItemID] [int] NOT NULL,
[ContainerID] [varchar](20) NOT NULL,
[CreatedOnDate] [datetime] NOT NULL DEFAULT (getdate()),
CONSTRAINT [PK_POLineItemContainers] PRIMARY KEY CLUSTERED
(
[OrderID] ASC,
[LineItemID] ASC,
[ContainerID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [IX_OrderLineItemContainers] UNIQUE NONCLUSTERED
(
[ContainerID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[OrderLineItemContainers] WITH CHECK ADD CONSTRAINT [FK_POLineItemContainers_Containers] FOREIGN KEY([ContainerID])
REFERENCES [dbo].[Containers] ([ContainerID])
GO
ALTER TABLE [dbo].[OrderLineItemContainers] CHECK CONSTRAINT [FK_POLineItemContainers_Containers]
Plan (image not preserved)

According to the docs:
https://technet.microsoft.com/en-us/library/ms177416(v=sql.105).aspx
If any one of the datetime columns for which correlation statistics are maintained is not the first or only key of a clustered index, consider creating a clustered index on it. Doing this generally leads to better performance on the types of queries covered by correlation statistics. If a clustered index already exists on the primary key columns, you can modify a table so that the clustered index and primary key use different column sets.
Since your OrderLineItemContainers table has no suitable index for filtering on the date, the optimizer has nothing to seek on. Try adding a nonclustered index on OrderLineItemContainers.CreatedOnDate and check whether the plan changes.
A clustered index would be better, but there are other considerations. Note that you could make the primary key nonclustered and use the clustered index for the date column, if this is the dominant query and that makes it worthwhile.
So this should be the better option:
CREATE TABLE [dbo].[OrderLineItemContainers](
[OrderID] [int] NOT NULL,
[LineItemID] [int] NOT NULL,
[ContainerID] [varchar](20) NOT NULL,
[CreatedOnDate] [datetime] NOT NULL DEFAULT (getdate()),
CONSTRAINT [PK_POLineItemContainers] PRIMARY KEY NONCLUSTERED -- NONCLUSTERED PRIMARY KEY!!
(
[OrderID] ASC,
[LineItemID] ASC,
[ContainerID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [IX_OrderLineItemContainers] UNIQUE NONCLUSTERED
(
[ContainerID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
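CREATE CLUSTERED INDEX [CIX_OrderLineItemContainers_CreatedOnDate] -- illustrative name; CREATE INDEX requires one
ON [dbo].[OrderLineItemContainers] ([CreatedOnDate])
Or you could just try a new NONCLUSTERED index:
CREATE NONCLUSTERED INDEX [IX_OrderLineItemContainers_CreatedOnDate] -- illustrative name
ON [dbo].[OrderLineItemContainers] ([CreatedOnDate])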
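Before changing indexes, it may be worth confirming that the database option took effect and that correlation statistics were created at all. The following is a minimal sketch, assuming the database name BISourcingTest from the question. The `_MPStats_Sys_` name prefix for the internally maintained views comes from the feature's documentation and should be treated as an assumption to verify on your instance.

```sql
USE BISourcingTest;
GO

-- 1 means DATE_CORRELATION_OPTIMIZATION is ON for the database.
SELECT name, is_date_correlation_on
FROM sys.databases
WHERE name = N'BISourcingTest';

-- Correlation statistics are kept as internally maintained indexed views.
-- Assumption: their names start with _MPStats_Sys_ (verify on your instance).
SELECT name, create_date
FROM sys.views
WHERE name LIKE N'[_]MPStats[_]Sys[_]%';
```

If the second query returns no rows, no correlation statistics exist for this pair of tables, and an index change alone is unlikely to alter the plan. The documentation also lists a single-column foreign key between the two tables among the requirements. The script in the question shows the foreign key referencing [dbo].[Containers] rather than [ContainersTest], so that may be worth checking as well.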

Related

AspNetUserLogins table and maximum size of index keys in SQL Server

The schema of the identity model in VS2017/aspnetcore defines a table called AspNetUserLogins to store external logins (CREATE statement below). It defines the primary key as a composite of [LoginProvider] [nvarchar](450) and [ProviderKey] [nvarchar](450). SQL Server's maximum index key size is specified as 900 bytes here. A note on that page specifically says
"If a table column is a Unicode data type such as nchar or nvarchar,
the column length displayed is the storage length of the column. This
is two times the number of characters specified in the CREATE TABLE
statement. In the previous example, City is defined as an nvarchar(30)
data type; therefore, the storage length of the column is 60."
So is this key not twice the allowed size?
SQL Server Management Studio seems to think so:
Warning! The maximum key length for a clustered index is 900 bytes.
The index 'PK_AspNetUserLogins' has maximum length of 1800 bytes. For
some combination of large values, the insert/update operation will
fail.
CREATE TABLE [dbo].[AspNetUserLogins](
[LoginProvider] [nvarchar](450) NOT NULL,
[ProviderKey] [nvarchar](450) NOT NULL,
[ProviderDisplayName] [nvarchar](max) NULL,
[UserId] [nvarchar](450) NOT NULL,
CONSTRAINT [PK_AspNetUserLogins] PRIMARY KEY CLUSTERED
(
[LoginProvider] ASC,
[ProviderKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
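The arithmetic behind the warning: each nvarchar(450) column can occupy 450 × 2 = 900 bytes, so the two key columns together can reach 1800 bytes. As a rough sketch, the same figure can be read from the catalog views, where `sys.columns.max_length` is reported in bytes. The table name is the one from the question; nothing else is assumed.

```sql
-- Sum the maximum byte length of the key columns of each index on the table.
SELECT i.name            AS index_name,
       SUM(c.max_length) AS max_key_bytes
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id
 AND ic.index_id  = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id
 AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.AspNetUserLogins')
  AND ic.is_included_column = 0   -- key columns only
GROUP BY i.name;
```

For the definition above, PK_AspNetUserLogins should come out at 1800, matching the figure in the SSMS warning.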
Looks like they know...issue1451
It looks as though this will cause subsequent issues. I originally created my database on my desktop prior to deploying it to Azure, and there is a significant difference between the two databases. In SSMS, using "Script Table as > CREATE table", the table designs are:
Azure database:
CREATE TABLE [dbo].[AspNetUserLogins](
[LoginProvider] [nvarchar](225) NOT NULL,
[ProviderKey] [nvarchar](225) NOT NULL,
[ProviderDisplayName] [nvarchar](max) NULL,
[UserId] [nvarchar](450) NOT NULL,
CONSTRAINT [PK_AspNetUserLogins] PRIMARY KEY CLUSTERED
(
[LoginProvider] ASC,
[ProviderKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
Desktop database:
CREATE TABLE [dbo].[AspNetUserLogins](
[LoginProvider] [nvarchar](450) NOT NULL,
[ProviderKey] [nvarchar](450) NOT NULL,
[ProviderDisplayName] [nvarchar](max) NULL,
[UserId] [nvarchar](450) NOT NULL,
CONSTRAINT [PK_AspNetUserLogins] PRIMARY KEY CLUSTERED
(
[LoginProvider] ASC,
[ProviderKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Note the [PRIMARY] references; I cannot get these into Azure. This results in the following error from an MVC .NET Core 2 website using Microsoft.AspNetCore.Identity:
MVC Net Core 2.0 error resulting from the inability to add primary clustered keys

violation of primary key constraint in insert query not touching the PK column

I have a query that inserts records into a table. The primary key column of that table is an identity field that auto-increments. The SELECT part of the query returns duplicates, but I have a unique constraint with IGNORE_DUP_KEY = ON on the fields (city_nm, prov_en_nm) that should skip them on insert. This used to work fine, but now it gives me the message below. This is the first time I have tried it since the database was moved from SQL Server 2012 to SQL Server 2014, if that can have an impact.
Violation of PRIMARY KEY constraint 'Dim_city_province_country_pk'. Cannot insert duplicate key in object 'HD_DtlClm.dim_city_province_country_t'. The duplicate key value is (###). (where ### is an ID, a different one every time I run it)
Here is the query.
INSERT INTO HD_DtlClm.[dim_city_province_country_t] (
city_nm, prov_en_nm, prov_fr_nm, contry_fr_nm, contry_en_nm
)
SELECT gr_mbr_city_nm, PROV_ENG_NM, PROV_FR_NM, CONTRY_ENG_NM, CONTRY_FR_NM
FROM isu.gr_dentl_clm_v
LEFT JOIN HD_DtlClm.province_information_t
ON gr_dentl_clm_v.gr_mbr_prov_cd = HD_DtlClm.province_information_t.PROV_CLM_CD
UNION
SELECT gr_prvdr_city_nm, PROV_ENG_NM, PROV_FR_NM, CONTRY_ENG_NM, CONTRY_FR_NM
FROM isu.gr_dentl_clm_v
LEFT JOIN HD_DtlClm.province_information_t
ON gr_dentl_clm_v.gr_prvdr_prov_cd IN (HD_DtlClm.province_information_t.PROV_ENG_CD, HD_DtlClm.province_information_t.PROV_CLM_CD)
Any idea why I get this error that I didn't get in the past?
EDIT to add primary key creation script:
ALTER TABLE [HD_DtlClm].[dim_city_province_country_t] ADD CONSTRAINT [Dim_city_province_country_pk] PRIMARY KEY CLUSTERED
( [cpc_key] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
EDIT2 to add table creation script
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [HD_DtlClm].[dim_city_province_country_t](
[cpc_key] [int] IDENTITY(1,1) NOT NULL,
[city_nm] [char](50) NOT NULL,
[prov_en_nm] [char](50) NULL,
[prov_fr_nm] [char](50) NULL,
[contry_en_nm] [char](75) NULL,
[contry_fr_nm] [char](75) NULL,
[create_ts] [datetime] NOT NULL,
[update_ts] [datetime] NOT NULL,
CONSTRAINT [Dim_city_province_country_pk] PRIMARY KEY CLUSTERED
(
[cpc_key] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [dim_city_province_country_ak1] UNIQUE NONCLUSTERED
(
[city_nm] ASC,
[prov_en_nm] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [HD_DtlClm].[dim_city_province_country_t] ADD DEFAULT (getdate()) FOR [create_ts]
GO
ALTER TABLE [HD_DtlClm].[dim_city_province_country_t] ADD DEFAULT (getdate()) FOR [update_ts]
GO
Try running DBCC CHECKIDENT ('HD_DtlClm.[dim_city_province_country_t]'); then look at the results in the Messages tab and make sure the current identity value is equal to or higher than the current column value. NB: running this may even fix the problem by itself.
To expand: it looks like something reseeded your identity column, so the insert generated key values that already exist in the table. I don't think there is any way to check historically what changed it; the most likely candidates are the DBCC CHECKIDENT command with the RESEED option, or a TRUNCATE operation (which reseeds to the original value).
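To inspect the identity state without changing anything first, the check can be sketched as follows. It uses the table and column names from the question; `NORESEED` only reports the values, while calling `DBCC CHECKIDENT` without it resets the identity value when it is lower than the column maximum.

```sql
-- Report the current identity value and the maximum column value; changes nothing.
DBCC CHECKIDENT ('HD_DtlClm.dim_city_province_country_t', NORESEED);

-- The same comparison as a query.
SELECT IDENT_CURRENT('HD_DtlClm.dim_city_province_country_t') AS current_identity,
       MAX(cpc_key)                                           AS max_cpc_key
FROM HD_DtlClm.dim_city_province_country_t;

-- If current_identity is lower than max_cpc_key, running CHECKIDENT without
-- NORESEED resets the identity value to the maximum value in the column:
-- DBCC CHECKIDENT ('HD_DtlClm.dim_city_province_country_t');
```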

Updating a table after adding Index

I am designing a database using SQLExpress.
I have a table which has three columns. The table looks as below.
CREATE TABLE [dbo].[dummy](
[id] [int] IDENTITY(1,1) NOT NULL,
[someLongString] [text] NOT NULL,
[someLongText_Hash] [binary](20) NOT NULL,
CONSTRAINT [PK_dummy] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
I already have some data in this table. Whenever I want to add a new row, I first compute a hash of someLongString and query the table to see whether a row with this hash already exists. As the table grows, this query takes longer, so I plan to index the table on the someLongText_Hash column.
Can someone please suggest how to do this in SQL Server Management Studio? Also, after adding this index, how do I index the existing rows in this table?
Why can't you just set the 'someLongString' field to be unique? That way you don't need to keep a hash and an extra primary key?
You could try using a CHECKSUM.
CREATE TABLE [dbo].[dummy](
[id] [int] IDENTITY(1,1) NOT NULL,
[someLongString] [text] NOT NULL,
[someLongText_CheckSum] [int] NOT NULL, -- data type was missing; CHECKSUM returns int
CONSTRAINT [UC_someLongText_CheckSum] UNIQUE (someLongText_CheckSum),
CONSTRAINT [PK_dummy] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
See here for further explanation
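For the original question of indexing the existing hash column, a minimal sketch follows. The index name is illustrative, and the use of SHA1 through HASHBYTES is an assumption based on the binary(20) column; the question does not say how the hash is computed. Creating the index builds it over all rows already in the table, and SQL Server maintains it on every later insert or update, so no separate step is needed for existing data.

```sql
-- Index name is illustrative. Existing rows are indexed when this statement runs.
CREATE NONCLUSTERED INDEX IX_dummy_someLongText_Hash
    ON dbo.dummy (someLongText_Hash);
GO

-- Lookup before insert. @hash is a hypothetical variable holding the 20-byte hash
-- of the new string (SHA1 assumed here; substitute however the hash is produced).
DECLARE @hash binary(20);
SET @hash = HASHBYTES('SHA1', 'some long string');

SELECT id
FROM dbo.dummy
WHERE someLongText_Hash = @hash;
```

Because two different strings can share a hash, a match on the hash is normally followed by a comparison of the full string.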

primary key name is required field?

Is there any difference between the below 2 CREATE TABLE statements in SQL Server 200x/2012? I generated this script from two different tables, one had a Key name defined (PK_Table1) whereas the other had some kind of randomly generated number associated to it (PK_Table1_1084F446).
CREATE TABLE [dbo].[Table1](
[ID] [uniqueidentifier] NOT NULL,
<<Other Column declaration here>>
PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Few more non-clustered indexes declaration here
CREATE TABLE [dbo].[Table1](
[ID] [uniqueidentifier] NOT NULL,
<<Other Column declaration here>>
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Few more non-clustered indexes declaration here
It works the same way, but meaningful names are more convenient:
1) when altering a constraint, you can easily refer to it (if you gave it a sensible name);
2) when a query fails due to a constraint, the name of the constraint is shown, so you can easily see what caused the error (if you gave it a sensible name).
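The first point can be sketched as follows, using the table and constraint names from the question. With a system-generated name, the suffix differs per database, so the name has to be looked up before the constraint can be altered or dropped.

```sql
-- With an explicit name, the constraint can be referenced directly.
ALTER TABLE dbo.Table1 DROP CONSTRAINT PK_Table1;

-- With a system-generated name (for example PK_Table1_1084F446),
-- look the name up first.
SELECT name
FROM sys.key_constraints
WHERE parent_object_id = OBJECT_ID(N'dbo.Table1')
  AND type = 'PK';
```

This matters most for deployment scripts that run against several databases, where the generated names will not match.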

aspnet_Users table with huge indexsize

We have an aspnet_Users table from the ASP.NET membership schema that shows up with an index size of almost 18 GB.
rows 251172
datasize 56472 KB
indexsize 17800536 KB
This is just the standard ASP.NET membership table, but we do have another table with a foreign key to this table (UserId column).
Has anyone seen this problem before?
How can I reduce the index size?
The aspnet_Users table is defined as
CREATE TABLE [dbo].[aspnet_Users](
[ApplicationId] [uniqueidentifier] NOT NULL,
[UserId] [uniqueidentifier] NOT NULL,
[UserName] [nvarchar](256) NOT NULL,
[LoweredUserName] [nvarchar](256) NOT NULL,
[MobileAlias] [nvarchar](16) NULL,
[IsAnonymous] [bit] NOT NULL,
[LastActivityDate] [datetime] NOT NULL,
PRIMARY KEY NONCLUSTERED
(
[UserId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE UNIQUE CLUSTERED INDEX [aspnet_Users_Index] ON [dbo].[aspnet_Users]
(
[ApplicationId] ASC,
[LoweredUserName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [aspnet_Users_Index2] ON [dbo].[aspnet_Users]
(
[ApplicationId] ASC,
[LastActivityDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
ALTER TABLE [dbo].[aspnet_Users] ADD DEFAULT (newid()) FOR [UserId]
GO
ALTER TABLE [dbo].[aspnet_Users] ADD DEFAULT (NULL) FOR [MobileAlias]
GO
ALTER TABLE [dbo].[aspnet_Users] ADD DEFAULT ((0)) FOR [IsAnonymous]
GO
ALTER TABLE [dbo].[aspnet_Users] WITH NOCHECK ADD FOREIGN KEY([ApplicationId])
REFERENCES [dbo].[aspnet_Applications] ([ApplicationId])
GO
This works out at about 70 KB per row, which is more than a single extent (8 pages, 64 KB) and implies massive fragmentation.
Have you ever run index maintenance on it? Run this and see what happens:
ALTER INDEX ALL ON aspnet_Users REBUILD
Alternatively, has the table ever been extended with a LOB column (pictures, XML, etc.)? Or has someone added dozens of indexes? Please add the actual in-use table definition.
Edit: remove the index on LastActivityDate, or change the column to smalldatetime to keep minute accuracy, or only update it if it changed more than xx seconds/minutes ago.
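The per-row figure comes from 17800536 KB / 251172 rows, roughly 70.9 KB. To see which index holds the space and how fragmented it is, both before and after the rebuild, something like the following sketch may help. It assumes the table lives in the dbo schema of the current database and uses the standard sys.dm_db_index_physical_stats function in LIMITED mode.

```sql
-- Size and fragmentation per index on aspnet_Users (pages are 8 KB each).
SELECT i.name                          AS index_name,
       ps.index_type_desc,
       ps.page_count,
       ps.page_count * 8               AS size_kb,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.aspnet_Users'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
ORDER BY ps.page_count DESC;
```

If a single index accounts for most of the pages, that index is the first candidate for a rebuild or removal.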
