I'm using NHibernate in my web application and I have a problem with index usage in the generated sp_executesql calls. My table has 210 million records and the query is very slow.
At first there was a problem with the type generated for the column 'kolumna1'. In the database the column is varchar, but NHibernate generated an nvarchar parameter. I worked around this by adding a special attribute in the code that forced varchar. After that, sp_executesql started using indexes and everything was fine. Now the problem is back: sp_executesql takes 10 minutes to finish. When I ran the plain query (without sp_executesql) it took only 1 s. I checked the execution plans for both: sp_executesql wasn't using the index, while the plain query was. Without changing the index, I switched the parameter back from varchar to nvarchar and sp_executesql finished in 1 s (it used the index). Does anyone have any idea where I made a mistake? Why is the execution plan different for such a small change? And how can I fix it?
I have attached more code below, in case someone needs it.
sp_executesql with varchar(8000)
exec sp_executesql N'SELECT count(*) as y0_ FROM tabela1 this_ WHERE ((this_.kolumna2 >= @p0 and this_.kolumna2 <= @p1)) and
(this_.kolumna3 in (@p2, @p3) and this_.kolumna1 like @p4)',N'@p0 datetime,@p1 datetime,@p2 int,@p3 int,@p4 varchar(8000)',
@p0='2013-01-08 14:38:00' ,@p1='2013-02-08 14:38:00',@p2=341,@p3=342,@p4='%501096109%'
sp_executesql with nvarchar(4000)
exec sp_executesql N'SELECT count(*) as y0_ FROM tabela1 this_ WHERE ((this_.kolumna2 >= @p0 and this_.kolumna2 <= @p1)) and
(this_.kolumna3 in (@p2, @p3) and this_.kolumna1 like @p4)',N'@p0 datetime,@p1 datetime,@p2 int,@p3 int,@p4 nvarchar(4000)',
@p0='2013-01-08 14:38:00' ,@p1='2013-02-08 14:38:00',@p2=341,@p3=342,@p4='%501096109%'
The funny part is that in SQL Profiler both queries give the same result:
exec sp_executesql N'SELECT count(*) as y0_ FROM tabela1 this_
WHERE this_.kolumna3 in (@p2, @p3) and ((this_.kolumna2 >= @p0 and this_.kolumna2 <= @p1))
and ( this_.kolumna1 like @p4)',N'@p0 datetime,@p1 datetime,@p2 int,@p3 int,@p4 varchar(8000)',
@p0='2013-01-08 14:38:00' ,@p1='2013-02-08 14:38:00',@p2=341,@p3=342,@p4='%501096109%'
--Declare @p0 datetime
--set @p0 = '2013-01-08 14:38:00'
--Declare @p1 datetime
--set @p1 = '2013-02-08 14:38:00'
--Declare @p2 int
--set @p2 = 341
--Declare @p3 int
--set @p3 = 342
--Declare @p4 varchar(8000)
--set @p4 = '%501096109%'
--SELECT count(*) as y0_
--FROM tabela1 this_
--WHERE ((this_.kolumna2 >= @p0 and
--this_.kolumna2 <= @p1)) and
--(this_.kolumna3 in (@p2, @p3) and this_.kolumna1 like @p4)
Here are indexes:
CREATE TABLE [dbo].[tabela1](
[id] [bigint] NOT NULL,
[kolumna1] [varchar](128) NOT NULL,
[kolumna2] [datetime] NOT NULL,
[kolumna3] [int] NOT NULL,
CONSTRAINT [PK__tabela1__4F7CD00D] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [ind_tabela1_kolumna2] ON [dbo].[tabela1]
(
[kolumna2] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [ind_tabela1_kolumna3] ON [dbo].[tabela1]
(
[kolumna3] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_tabela1_kolumna1] ON [dbo].[tabela1]
(
[kolumna1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_tabela1_kolumna2_kolumna3] ON [dbo].[tabela1]
(
[kolumna2] ASC,
[kolumna3] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_tabela1_kolumna3_kolumna2_id_kolumna1] ON [dbo].[tabela1]
(
[kolumna3] ASC,
[kolumna2] ASC,
[id] ASC,
[kolumna1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Below is the execution plan for the query: select count(*) from [dbo].[tabela1] where [kolumna1] like N'%501096109%'
The SQL Server query optimizer can choose an index seek when:
There are other filter predicates besides the LIKE. They should be exact-match or at least SARGable predicates.
The table is very large (millions of rows).
However, a seek cannot be performed when the comparison requires a type conversion on the column, for example because of a different collation or data type.
Also, you cannot directly control this behavior, and query plans can vary for different sets of predicates. To force a seek, you need the FORCESEEK hint (SQL Server 2008 and later). You can find more information here:
http://msdn.microsoft.com/en-us/library/ms187373%28v=sql.100%29.aspx
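As a rough sketch of how the hint could be applied to the query from the question (not a tested fix for this particular table), the FORCESEEK table hint goes right after the table alias. The parameter values are copied from the question; declaring the LIKE parameter as varchar(128) to match kolumna1 is an assumption made here so that the column needs no conversion. Note that a pattern with a leading wildcard cannot itself be a seek predicate, so any seek would have to come from the kolumna2 and kolumna3 predicates.

```sql
-- Sketch only: parameter values are taken from the question.
DECLARE @p0 datetime = '2013-01-08 14:38:00',
        @p1 datetime = '2013-02-08 14:38:00',
        @p2 int = 341,
        @p3 int = 342,
        @p4 varchar(128) = '%501096109%';  -- assumed: same type as kolumna1

SELECT COUNT(*) AS y0_
FROM dbo.tabela1 AS this_ WITH (FORCESEEK)  -- table hint, SQL Server 2008+
WHERE this_.kolumna2 >= @p0
  AND this_.kolumna2 <= @p1
  AND this_.kolumna3 IN (@p2, @p3)
  AND this_.kolumna1 LIKE @p4;
```

If the optimizer cannot build any seek plan for the hinted table, the statement fails with an error instead of silently falling back to a scan, so the hint is best tried interactively first.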
Could you try this:
(1) Run the following SQL:
select * from sys.dm_exec_cached_plans
cross apply sys.dm_exec_sql_text(plan_handle) t
(2) Use the last column to find the SQL for the first query. It will not contain sp_executesql, but will start with your list of parameters, the last one being a varchar. Get the plan_handle, and use it in the following statement:
dbcc freeproccache (<your_plan_handle>)
Then retry query 1.
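The two steps above could be narrowed down along these lines. This is only a sketch: the LIKE filters on the statement text are assumptions based on the query shown in the question, and the plan handle has to be copied from the result of the first statement.

```sql
-- Step 1: locate the cached plan for the parameterized statement.
SELECT cp.plan_handle, cp.usecounts, cp.objtype, t.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS t
WHERE t.text LIKE N'%tabela1%'
  AND t.text LIKE N'%varchar(8000)%'
  AND t.text NOT LIKE N'%dm_exec_cached_plans%';  -- skip this lookup query itself

-- Step 2: evict only that plan (SQL Server 2008 and later).
-- Paste the plan_handle value returned by step 1 in place of the placeholder:
-- DBCC FREEPROCCACHE (<your_plan_handle>);
```

Evicting a single plan this way avoids clearing the whole plan cache, which matters on a busy server.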
I have a problem with the same query on two different on-premises SQL Server instances (dev and prod). Both have the same index and partition configuration.
I do not know why the query runs much faster on the dev server than on prod. I noticed that the dev execution plan has a Key Lookup operator under a Nested Loops join, but I can't get the prod server to use a key lookup as well. How can I force the same plan on prod?
DEV :
PROD :
Query:
WITH CTE AS
(
SELECT
B.CELL_VALUE_NET, B.CELL_VALUE_NET_NEGATIVE,
ROW_NUMBER() OVER (PARTITION BY B.CHASSI_ID, B.LOG_ID, B.CELL_NO ORDER BY B.CHASSI_ID, B.LOG_ID, B.CELL_NO, B.READING_DATE) ROW_ID,
B.CELL_VALUE - LAG(B.CELL_VALUE, 1) OVER (PARTITION BY B.CHASSI_ID ORDER BY B.CHASSI_ID, B.LOG_ID, B.CELL_NO, B.READING_DATE) CELL_VALUE_NET_NEW,
b.log_id, b.reading_date, b.cell_no
FROM
REL.TEMP_CHASSI_LAST_LOAD A
JOIN
REL.MACHINE_READING_CELL B WITH (NOLOCK) ON A.CHASSI_ID = B.CHASSI_ID
AND B.ROW_CREATION_DATE BETWEEN A.MIN_ROW_CREATION_DATE AND A.MAX_ROW_CREATION_DATE
WHERE
1 = 1
AND A.CHASSI_ID IN ('A30F012437', 'A30F012546', 'A30F012545', 'A30F012558', 'A30F012657', 'A30F082351', 'A30F082332', 'A30F082325', 'A30F082290')
)
SELECT
*
-- CELL_VALUE_NET = IIF(CELL_VALUE_NET_NEW < 0,0,CELL_VALUE_NET_NEW),
--CELL_VALUE_NET_NEGATIVE = IIF(CELL_VALUE_NET_NEW < 0, CELL_VALUE_NET_NEW,NULL)
FROM
CTE
WHERE
1 = 1
AND ROW_ID > 1
Data in partitions:
All indexes are the same on both environments:
-- additional for later processes update index
CREATE NONCLUSTERED INDEX [REL.MACHINE_READING_CELL_NCI_CHASSI_ID_CELL_VALUE_CELL_VALUE_NET]
ON [REL].[MACHINE_READING_CELL] ([CHASSI_ID] ASC, [LOG_ID] ASC, [CELL_NO] ASC, [READING_DATE] ASC)
INCLUDE ([CELL_VALUE], [CELL_VALUE_NET])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
-- partitioned indexes :
ALTER TABLE [REL].[MACHINE_READING_CELL]
ADD CONSTRAINT [REL.MACHINE_READING_CELL_PK]
PRIMARY KEY CLUSTERED ([ROW_CREATION_DATE] ASC, [CHASSI_ID] ASC, [READING_DATE] ASC, [LOG_TYPE] ASC, [LOG_ID] ASC, [CELL_NO] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
-- foreign keys:
ALTER TABLE [REL].[MACHINE_READING_CELL] WITH CHECK
ADD CONSTRAINT [REL.LOG_REL.MACHINE_READING_CELL_FK1]
FOREIGN KEY([LOG_ID]) REFERENCES [REL].[LOG] ([LOG_ID])
GO
ALTER TABLE [REL].[MACHINE_READING_CELL] CHECK CONSTRAINT [REL.LOG_REL.MACHINE_READING_CELL_FK1]
GO
-- [REL].[TEMP_CHASSI_LAST_LOAD]
CREATE CLUSTERED INDEX [IDX_MR_CELL]
ON [REL].[TEMP_CHASSI_LAST_LOAD] ([CHASSI_ID] ASC, [MIN_ROW_CREATION_DATE] ASC, [MAX_ROW_CREATION_DATE] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Query plans :
PROD : https://www.brentozar.com/pastetheplan/?id=Hyo3bf6ac
DEV : https://www.brentozar.com/pastetheplan/?id=H1qUMG6a9
I don't understand what is happening here. I am querying a single table, as shown in the query below. I am only fetching the first 20 records, yet the query takes 24 seconds to complete.
Is there any way to speed up this paging query?
;WITH TempResult AS(
SELECT distinct
D.GLCompany
,D.GLAcct
,D.GLProdNum
,D.GLCostCenter
,D.FCSCompany
,D.FCSAcct
,D.FCSCostCenter
,D.JournalDetailId
,D.[EffDt]
,D.[JournalLineAmt]
,D.[JournalLineDesc]
,D.[ManagedByCd]
,D.[LegalOwnerId]
,D.[JournalLineNum]
,D.[RoundedFlagBit]
,D.[CLPreValErrCd]
,D.[GLPreValErrCd]
,D.[SuspenseErrCd]
,D.GLProfitCenter
,D.GLTradingPartner
,D.GLInternalOrder
,D.GLSubAcct
,D.GLAcctActivity
,D.GLDataSrc
,D.GLId
,D.GLProdGrp
,D.HeaderId
from MyDetail D
)
SELECT * FROM TempResult
ORDER BY TempResult.HeaderId
OFFSET 0 ROWS
FETCH NEXT 20 ROWS ONLY
OPTION(RECOMPILE)
There is a nonclustered index that contains HeaderId, as shown below:
CREATE NONCLUSTERED INDEX [FCSAcctJournalDetail_idx] ON [dbo].[MyDetail]
(
[FCSAcct] ASC,
[FCSCompany] ASC,
[JournalEntryEffDt] ASC,
[DataDt] ASC,
[HeaderId] ASC,
[JournalDetailId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Add an index on HeaderId:
CREATE NONCLUSTERED INDEX [FCSAcctJournalDetail_HeaderId_idx] ON [dbo].[MyDetail]
(
[HeaderId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
As David Browne wrote in his comment - the index you currently have is irrelevant to this query.
If the HeaderId was the first column in the index it would be relevant, but since it's not the first (and not even close to being the first), it's simply irrelevant in the context of this query.
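To check which column leads each index on the table, the key columns can be listed in key order from the catalog views. This is a small sketch that assumes the table lives in the dbo schema, as in the index script above.

```sql
-- List the key columns of every index on dbo.MyDetail in key order.
-- Included (non-key) columns have key_ordinal = 0 and are filtered out.
SELECT i.name        AS index_name,
       ic.key_ordinal,
       c.name        AS column_name
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id
 AND ic.index_id  = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id
 AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.MyDetail')
  AND ic.key_ordinal > 0
ORDER BY i.name, ic.key_ordinal;
```

An index can support ORDER BY HeaderId without a sort only when HeaderId shows up with key_ordinal = 1.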
I have a table with 24 million rows.
I want to run this query:
select r1.userID, r2.userID, sum(r1.rate * r2.rate) as sum
from dbo.Ratings as r1
join dbo.Ratings as r2
on r1.movieID = r2.movieID
where r1.userID <= r2.userID
group by r1.userID, r2.userID
As I tested, it took 24 hours to produce 0.02 percent of the final result.
How can I speed it up?
Here is the definition of the table:
CREATE TABLE [dbo].[Ratings](
[userID] [int] NOT NULL,
[movieID] [int] NOT NULL,
[rate] [real] NOT NULL,
PRIMARY KEY CLUSTERED
(
[userID] ASC,
[movieID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_RatingsMovies] ON [dbo].[Ratings]
(
[movieID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_RatingsUsers] ON [dbo].[Ratings]
(
[userID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
Here is the execution plan:
The workaround I suggested was to create a "reverse" index:
CREATE INDEX IX_Ratings_Reverse on Ratings(movieid, userid) include(rate);
and then force SQL Server to use it:
select r1.userID, r2.userID, sum(r1.rate * r2.rate) as sum
from dbo.Ratings as r1 join dbo.Ratings as r2
with (index(IX_Ratings_Reverse))
on r1.movieID = r2.movieID
where r1.userID <= r2.userID group by r1.userID, r2.userID
There are two things that might help.
1) Change the order of columns in your clustered index to movieID, userID. This would group all rows with the same movieID together, which might change your Hash Match to a Nested Loops join and improve the performance of the JOIN.
2) Change the [IX_RatingsMovies] index to INCLUDE userID and rate. The more I think about it, the less likely this seems to help compared with my first suggestion. But it's possible.
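The second suggestion could be sketched as follows, rebuilding the existing index in place with DROP_EXISTING. Whether it changes the plan is not something this sketch can promise.

```sql
-- Rebuild the existing index in place, adding the included columns.
-- userID is already carried in every nonclustered index because it is part
-- of the clustered key; listing it explicitly is harmless and documents intent.
CREATE NONCLUSTERED INDEX [IX_RatingsMovies] ON [dbo].[Ratings]
(
    [movieID] ASC
)
INCLUDE ([userID], [rate])
WITH (DROP_EXISTING = ON);
```

With rate included, the self-join on movieID can be answered from this index alone, without touching the clustered index.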
I am currently exploring the SQL Server XML column type and selective XML indexes for our needs. To do so, I created a table called Incidents and created a selective XML index and secondary selective XML indexes (scripts below).
When I run the following query, it does use the selective index, but the query plan applies an IS NOT NULL predicate to the Severity column data and then sorts on it. This degrades the performance of the query significantly when the table is large. With 4 million rows in the table, the following query takes ~20 sec to complete.
Am I missing anything here?
select TOP 100 Data.value('(/Incident/Severity)[1]', 'int') AS Severity,
Data.value('(/Incident/OwningTenantId)[1]', 'VARCHAR(800)') AS OwningTenantId,
Data.value('(/Incident/OwningTeamId)[1]', 'NVARCHAR(800)') AS OwningTeamId
FROM Incidents
WHERE Data.value('(/Incident/Severity)[1]', 'int') = 1
ORDER BY Data.value('(/Incident/OwningTenantId)[1]', 'NVARCHAR(800)')
Index:
CREATE TABLE [dbo].[Incidents](
[id] [uniqueidentifier] NOT NULL,
[Data] [xml] NOT NULL,
CONSTRAINT [PK_Incidents] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
CREATE SELECTIVE XML INDEX sxi_Incident_Data ON Incidents(Data)
FOR
(
Severity = '/Incident/Severity' AS SQL int SINGLETON,
OwningTeamId = '/Incident/OwningTeamId' AS SQL NVARCHAR(400) SINGLETON,
OwningTenantId = '/Incident/OwningTenantId' AS SQL NVARCHAR(400) SINGLETON,
id = '/Incident/_id' AS SQL BIGINT SINGLETON
)
GO
create xml index sxi_secondary_severity on Incidents(Data)
using xml index sxi_Incident_Data
for (Severity);
GO
create xml index sxi_secondary_OwningTeamId on Incidents(Data)
using xml index sxi_Incident_Data
for (OwningTeamId);
GO
create xml index sxi_secondary_OwningTenantId on Incidents(Data)
using xml index sxi_Incident_Data
for (OwningTenantId);
GO
create xml index sxi_secondary_Id on Incidents(Data)
using xml index sxi_Incident_Data
for (id);
GO
Sample XML:
<Incident>
<_id>123</_id>
<Severity>3</Severity>
<IncidentStatus>RESOLVED</IncidentStatus>
<CreateDate>2014-05-04 05:43:58.317</CreateDate>
<LastUpdateDate>2014-05-06 18:47:39.037</LastUpdateDate>
<AlertSourceLocalId>20070</AlertSourceLocalId>
<SourceIncidentId>35d0bfe4-ccb9-491f-a30c-ea7685ffe8c0</SourceIncidentId>
<SourceCreateDate>2014-05-04 02:51:14.000</SourceCreateDate>
<SourceCreatedBy>Someone</SourceCreatedBy>
<SourceModifiedDate>2014-05-04 05:43:57.797</SourceModifiedDate>
<SourceOrigin>Some Origin</SourceOrigin>
<CorrelationId>correlatioid</CorrelationId>
<RoutingId>Route123</RoutingId>
<Datacenter>Unknown</Datacenter>
<Environment>INT</Environment>
<DeviceGroup>Devicegroup</DeviceGroup>
<DeviceName>DeviceName</DeviceName>
<RaisingEnvironment>PROD</RaisingEnvironment>
<RaisingDatacenter>Unknown</RaisingDatacenter>
<RaisingDeviceGroup>DEviceGroup</RaisingDeviceGroup>
<RaisingDeviceName>FakeDevice</RaisingDeviceName>
<PrimaryIncidentId>1234</PrimaryIncidentId>
<RelatedLinksCount>0</RelatedLinksCount>
<ExternalLinksCount>0</ExternalLinksCount>
<HitCount>0</HitCount>
<ChildCount>0</ChildCount>
<Title>Some Title</Title>
<ReproSteps></ReproSteps>
<OwningTenantId>564</OwningTenantId>
<OwningTeamId>123</OwningTeamId>
<ResolveDate>2014-05-06 18:47:39.037</ResolveDate>
<ResolvedBy>SomeOne</ResolvedBy>
<MitigateDate>2014-05-06 18:45:55.403</MitigateDate>
<MitigatedBy>Someone</MitigatedBy>
<Mitigation>N/A</Mitigation>
<IsNoise>0</IsNoise>
<IsSecurityRisk>0</IsSecurityRisk>
<IsCustomerImpacting>0</IsCustomerImpacting>
<OriginatingTenantId>10066</OriginatingTenantId>
<ImpactStartDate>2014-05-01 23:31:22.000</ImpactStartDate>
<RootCauseNeedsInvestigation>0</RootCauseNeedsInvestigation>
<ConnectorTenantId>10066</ConnectorTenantId>
<RelationshipId>1852546</RelationshipId>
<SuppressAutoUpdate>0</SuppressAutoUpdate>
</Incident>
Repro:
Create Table indices
-- Create Table
IF(EXISTS(SELECT * FROM sys.tables WHERE [Name] = 'XmlTable' AND [Type] = 'U'))
BEGIN
DROP TABLE XmlTable
END
CREATE TABLE [dbo].[XmlTable](
[id] [uniqueidentifier] NOT NULL,
[Data] [xml] NULL
PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
-- Populate Data
DECLARE @i INT = 0
DECLARE @XML NVARCHAR(MAX),
@Severity INT,
@OwningTeamId VARCHAR(400),
@OwningTenantId VARCHAR(400),
@IncidentStatus varchar(100),
@Mod SMALLINT
WHILE @i < 500
BEGIN
SET @i = @i + 1
SET @Mod = @i % 3
SELECT @Severity = @Mod + 1,
@OwningTeamId = 'OwningTeam' + CAST(@Mod AS VARCHAR),
@OwningTenantId = 'OwningTenantId' + CAST(@Mod AS VARCHAR),
@IncidentStatus = CASE @Mod
WHEN 0 THEN 'Active'
WHEN 1 THEN 'Resolved'
WHEN 2 THEN 'Closed'
END
SET @XML =
'<Incident>' +
'<_id>' + CAST(@i AS VARCHAR) + '</_id>' +
'<Severity>' + CAST(@Severity AS VARCHAR) + '</Severity>' +
'<OwningTeamId>' + @OwningTeamId + '</OwningTeamId>' +
'<OwningTenantId>' + @OwningTenantId + '</OwningTenantId>' +
'<IncidentStatus>' + @IncidentStatus + '</IncidentStatus>' +
'</Incident>'
INSERT INTO XmlTable
SELECT NEWID(), @XML
END
-- Create Indices
CREATE SELECTIVE XML INDEX [sxi_Data] ON [dbo].[XmlTable]
(
[Data]
)
FOR
(
[Severity] = '/Incident/Severity' as SQL [int] SINGLETON ,
[OwningTeamId] = '/Incident/OwningTeamId' as SQL [nvarchar](400) SINGLETON ,
[OwningTenantId] = '/Incident/OwningTenantId' as SQL [nvarchar](400) SINGLETON ,
[id] = '/Incident/_id' as SQL [bigint] SINGLETON ,
[TicketStatus] = '/Incident/IncidentStatus' as SQL [nvarchar](100) SINGLETON
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
CREATE XML INDEX [sxi_secondary_Id] ON [dbo].[XmlTable]
(
[Data]
)USING XML INDEX [sxi_Data] FOR (
[id]
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
CREATE XML INDEX [sxi_secondary_OwningTeamId] ON [dbo].[XmlTable]
(
[Data]
)USING XML INDEX [sxi_Data] FOR (
[OwningTeamId]
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
USE [XMLDocuemntStore]
GO
CREATE XML INDEX [sxi_secondary_OwningTenantId] ON [dbo].[XmlTable]
(
[Data]
)USING XML INDEX [sxi_Data] FOR (
[OwningTenantId]
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
USE [XMLDocuemntStore]
GO
CREATE XML INDEX [sxi_secondary_severity] ON [dbo].[XmlTable]
(
[Data]
)USING XML INDEX [sxi_Data] FOR (
[Severity]
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Sample Query: Check the query plan on right sides.
select TOP 100 Data.value('(/Incident/Severity)[1]', 'int') AS Severity
FROM XmlTable
WHERE Data.value('(/Incident/Severity)[1]', 'int') = 1
ORDER BY Data.value('(/Incident/OwningTenantId)[1]', 'NVARCHAR(800)')
The SORT TOP N is needed because of the [1] in your XPath query. To get rid of it, you need to assure SQL Server that the required XML element occurs only once within an Incident element. For that, you need to strongly type your XML using an XSD schema. You can create one like so:
CREATE XML SCHEMA COLLECTION Incident_XSD AS
N'<?xml version="1.0" encoding="UTF-16"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Incident">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:int" name="_id" />
<xs:element type="xs:int" name="Severity" />
<xs:element type="xs:string" name="OwningTeamId" />
<xs:element type="xs:string" name="OwningTenantId" />
<xs:element type="xs:string" name="IncidentStatus"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>' ;
GO
Use it in your table definition like so
[Data] [xml](Incident_XSD) NULL
Now the following query is valid
select TOP 100 Data.value('/Incident[1]/Severity', 'int') AS Severity
FROM XmlTable
WHERE Data.value('/Incident[1]/Severity', 'int') = 1
ORDER BY Data.value('/Incident[1]/OwningTenantId', 'NVARCHAR(800)')
This returns within a second or two with a million rows in the table.
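For completeness, a typed version of the repro table might look like the sketch below. The table name XmlTableTyped is hypothetical, and the script assumes the Incident_XSD schema collection from above has already been created in the same database. The sample row follows the element order that the schema's sequence requires.

```sql
-- Hypothetical table name; assumes Incident_XSD already exists.
CREATE TABLE dbo.XmlTableTyped (
    [id]   uniqueidentifier NOT NULL PRIMARY KEY CLUSTERED,
    [Data] xml(Incident_XSD) NULL   -- typed XML: validated against the schema on insert
);
GO
-- Elements must appear in the order declared in the xs:sequence.
INSERT INTO dbo.XmlTableTyped ([id], [Data])
VALUES (NEWID(),
        N'<Incident><_id>1</_id><Severity>1</Severity>' +
        N'<OwningTeamId>OwningTeam0</OwningTeamId>' +
        N'<OwningTenantId>OwningTenantId0</OwningTenantId>' +
        N'<IncidentStatus>Active</IncidentStatus></Incident>');
GO
```

A document with a missing or repeated element is rejected at insert time, which is what lets the optimizer treat each path as a singleton.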
PS: You might want to reconsider using GUIDs as primary key
This is my query:
exec sp_executesql N'set arithabort off;set statistics time on; set transaction isolation level read uncommitted;With cte as (Select peta_rn = ROW_NUMBER() OVER (ORDER BY d.LastStatusChangedDateTime desc )
, d.DocumentID,
d.IsReEfiled, d.IGroupID, d.ITypeID, d.RecordingDateTime, d.CreatedByAccountID, d.JurisdictionID,
d.LastStatusChangedDateTime as LastStatusChangedDateTime
, d.IDate, d.InstrumentID, d.DocumentStatusID
, u.Username
, it.Abbreviation AS ITypeAbbreviation
, ig.Abbreviation AS IGroupAbbreviation,
d.DocumentDate
From Documents d
Inner Join ITypes it on it.ITypeID = d.ITypeID
Inner Join Users u on d.UserID = u.UserID Inner Join IGroupes ig on ig.IGroupID = d.IGroupID
Where 1=1 And ( d.DocumentStatusID = 9 ) ) Select cte.DocumentID,
cte.IsReEfiled, cte.IGroupID, cte.ITypeID, cte.RecordingDateTime, cte.CreatedByAccountID, cte.JurisdictionID,
cte.LastStatusChangedDateTime as LastStatusChangedDateTime
, cte.IDate, cte.InstrumentID, cte.DocumentStatusID,cte.IGroupAbbreviation, cte.Username, j.JDAbbreviation, inf.DocumentName,
cte.ITypeAbbreviation, cte.DocumentDate, ds.Abbreviation as DocumentStatusAbbreviation, ds.Name as DocumentStatusName,
( SELECT CAST(CASE WHEN cte.DocumentID = (
SELECT TOP 1 doc.DocumentID
FROM Documents doc
WHERE doc.JurisdictionID = cte.JurisdictionID
AND doc.DocumentStatusID = cte.DocumentStatusID
ORDER BY LastStatusChangedDateTime)
THEN 1
ELSE 0
END AS BIT)
) AS CanChangeStatus ,
Upper((Select Top 1 Stuff( (Select ''='' + dbo.GetDocumentNameFromParamsWithPartyType(Business, FirstName, MiddleName, LastName, t.Abbreviation, NameTypeID, pt.Abbreviation, IsGrantor, IsGrantee) From DocumentNames dn
Left Join Titles t
on dn.TitleID = t.TitleID
Left Join PartyTypes pt
On pt.PartyTypeID = dn.PartyTypeID
Where DocumentID = cte.DocumentID
For XML PATH('''')),1,1,''''))) as FlatDocumentName, (SELECT COUNT(*) FROM CTE) AS TotalRecords
FROM cte Left Join DocumentStatuses ds On
cte.DocumentStatusID = ds.DocumentStatusID Left Join InstrumentFiles inf On cte.DocumentID = inf.DocumentID
Left Join Jurisdictions j on j.JurisdictionID = cte.JurisdictionID Where 1=1 And
peta_rn>@7 AND peta_rn<=@8 Order by peta_rn set statistics time off; ',N'@0 int,@1 int,@2 int,@3 int,@4 int,@5 int,@6 int,@7 int,@8 int',
@0=1,@1=5,@2=9,@3=1,@4=5,@5=9,@6=1,@7=97500,@8=97550
And this is my IGroupes table definition:
CREATE TABLE [dbo].[IGroupes](
[IGroupID] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](64) NOT NULL,
[JurisdictionID] [int] NOT NULL,
[Abbreviation] [varchar](12) NOT NULL,
CONSTRAINT [PK_IGroupes] PRIMARY KEY NONCLUSTERED
(
[IGroupID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
SET ANSI_PADDING ON
GO
/****** Object: Index [IX_IGroupes_Abbreviation] Script Date: 10/11/2013 4:21:46 AM ******/
CREATE NONCLUSTERED INDEX [IX_IGroupes_Abbreviation] ON [dbo].[IGroupes]
(
[Abbreviation] ASC
)
INCLUDE ( [IGroupID],
[Name],
[JurisdictionID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
SET ANSI_PADDING ON
GO
/****** Object: Index [IX_IGroupes_JurisdictionID] Script Date: 10/11/2013 4:21:46 AM ******/
CREATE NONCLUSTERED INDEX [IX_IGroupes_JurisdictionID] ON [dbo].[IGroupes]
(
[JurisdictionID] ASC
)
INCLUDE ( [IGroupID],
[Name],
[Abbreviation]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
SET ANSI_PADDING ON
GO
/****** Object: Index [IX_IGroupes_Name] Script Date: 10/11/2013 4:21:46 AM ******/
CREATE NONCLUSTERED INDEX [IX_IGroupes_Name] ON [dbo].[IGroupes]
(
[Name] ASC
)
INCLUDE ( [IGroupID],
[JurisdictionID],
[Abbreviation]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
Yet, as you can see, it is using a table scan. This operation is costing me too much. The IGroupes table has just 7 rows and the Documents table has approximately 98K records. Yet when I join on d.IGroupID = ig.IGroupID, the plan shows an actual number of rows above 600K! That is the problem. Please see the attached screenshot:
In case anybody is interested in the full query plan xml, here it is:
https://www.dropbox.com/s/kldx24x3j8vndpe/plan.xml
Any help is appreciated. Thanks!
None of the 3 indexes (other than the PK) you have on IGroupes is going to help this query, because you are not using any of those columns in a WHERE or JOIN clause. Unless you need those indexes for other queries, I would delete them. They are just going to give the query optimizer more choices to test (and reject).
The index on the primary key PK_IGroupes should be clustered. That will allow it to do an index seek (or bookmark lookup). If it can't be clustered for some other reason, try creating an index on IGroupID and Abbreviation, in that order (or including the Abbreviation column in the existing PK index).
If it still doesn't pick the right index, you can use a hint such as WITH (INDEX(0)) or WITH (INDEX(index_name)).
The 600k rows come from the fact that it is doing a nested loop join on 98k rows multiplied by the 7 rows. If the index above doesn't work, you can try replacing the INNER JOIN IGroupes with INNER HASH JOIN IGroupes.
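The two main suggestions above might be sketched like this. Rebuilding the primary key assumes that no foreign keys reference PK_IGroupes; if any do, they would have to be dropped first and recreated afterwards. The SELECT is a cut-down stand-in for the real query, kept short to show only where the join hint goes.

```sql
-- Option A: rebuild the primary key as a clustered index.
-- Assumption: no foreign keys currently reference PK_IGroupes.
ALTER TABLE dbo.IGroupes DROP CONSTRAINT PK_IGroupes;
ALTER TABLE dbo.IGroupes
    ADD CONSTRAINT PK_IGroupes PRIMARY KEY CLUSTERED (IGroupID);
GO
-- Option B: ask for a hash join on this one join.
-- Simplified stand-in for the full query from the question.
SELECT d.DocumentID, ig.Abbreviation AS IGroupAbbreviation
FROM Documents AS d
INNER HASH JOIN IGroupes AS ig
    ON ig.IGroupID = d.IGroupID
WHERE d.DocumentStatusID = 9;
```

Keep in mind that a join hint also fixes the join order for the whole query, so it is worth comparing plans before and after adding it.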
In this case a table scan is probably more efficient than using any of the indexes you have on the IGroupes table.
If you think the table scan is the bottleneck in this query (though at 3% cost I'm not sure it is), you can either try changing PK_IGroupes to a clustered index or try an index like
CREATE UNIQUE NONCLUSTERED INDEX [IX_IGroupes_IGroupID]
ON [dbo].[IGroupes] ([IGroupID]) INCLUDE ([Abbreviation])