Key Lookup Operator SQL Server - sql-server

I have a problem with the same query on two on-premises SQL Server instances (dev and prod). Both have the same index and partition configuration.
I don't know why the query runs much faster on dev than on prod. I noticed that the dev execution plan has a Key Lookup operator under a Nested Loops join, but I can't get prod to choose a plan that uses the key lookup. How can I force the same plan on prod?
Query:
WITH CTE AS
(
SELECT
B.CELL_VALUE_NET, B.CELL_VALUE_NET_NEGATIVE,
ROW_NUMBER() OVER (PARTITION BY B.CHASSI_ID, B.LOG_ID, B.CELL_NO ORDER BY B.CHASSI_ID, B.LOG_ID, B.CELL_NO, B.READING_DATE) ROW_ID,
B.CELL_VALUE - LAG(B.CELL_VALUE, 1) OVER (PARTITION BY B.CHASSI_ID ORDER BY B.CHASSI_ID, B.LOG_ID, B.CELL_NO, B.READING_DATE) CELL_VALUE_NET_NEW,
b.log_id, b.reading_date, b.cell_no
FROM
REL.TEMP_CHASSI_LAST_LOAD A
JOIN
REL.MACHINE_READING_CELL B WITH (NOLOCK) ON A.CHASSI_ID = B.CHASSI_ID
AND B.ROW_CREATION_DATE BETWEEN A.MIN_ROW_CREATION_DATE AND A.MAX_ROW_CREATION_DATE
WHERE
1 = 1
AND A.CHASSI_ID IN ('A30F012437', 'A30F012546', 'A30F012545', 'A30F012558', 'A30F012657', 'A30F082351', 'A30F082332', 'A30F082325', 'A30F082290')
)
SELECT
*
-- CELL_VALUE_NET = IIF(CELL_VALUE_NET_NEW < 0,0,CELL_VALUE_NET_NEW),
--CELL_VALUE_NET_NEGATIVE = IIF(CELL_VALUE_NET_NEW < 0, CELL_VALUE_NET_NEW,NULL)
FROM
CTE
WHERE
1 = 1
AND ROW_ID > 1
All indexes are the same in both environments:
-- additional for later processes update index
CREATE NONCLUSTERED INDEX [REL.MACHINE_READING_CELL_NCI_CHASSI_ID_CELL_VALUE_CELL_VALUE_NET]
ON [REL].[MACHINE_READING_CELL] ([CHASSI_ID] ASC, [LOG_ID] ASC, [CELL_NO] ASC, [READING_DATE] ASC)
INCLUDE ([CELL_VALUE], [CELL_VALUE_NET])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
-- partitioned indexes :
ALTER TABLE [REL].[MACHINE_READING_CELL]
ADD CONSTRAINT [REL.MACHINE_READING_CELL_PK]
PRIMARY KEY CLUSTERED ([ROW_CREATION_DATE] ASC, [CHASSI_ID] ASC, [READING_DATE] ASC, [LOG_TYPE] ASC, [LOG_ID] ASC, [CELL_NO] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
-- foreign keys:
ALTER TABLE [REL].[MACHINE_READING_CELL] WITH CHECK
ADD CONSTRAINT [REL.LOG_REL.MACHINE_READING_CELL_FK1]
FOREIGN KEY([LOG_ID]) REFERENCES [REL].[LOG] ([LOG_ID])
GO
ALTER TABLE [REL].[MACHINE_READING_CELL] CHECK CONSTRAINT [REL.LOG_REL.MACHINE_READING_CELL_FK1]
GO
-- [REL].[TEMP_CHASSI_LAST_LOAD]
CREATE CLUSTERED INDEX [IDX_MR_CELL]
ON [REL].[TEMP_CHASSI_LAST_LOAD] ([CHASSI_ID] ASC, [MIN_ROW_CREATION_DATE] ASC, [MAX_ROW_CREATION_DATE] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Query plans :
PROD : https://www.brentozar.com/pastetheplan/?id=Hyo3bf6ac
DEV : https://www.brentozar.com/pastetheplan/?id=H1qUMG6a9
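One way to test whether prod can produce the dev-style plan is an index hint. This is only a sketch: it assumes the dev plan seeks the nonclustered index on CHASSI_ID and then does a Key Lookup into the clustered primary key to fetch CELL_VALUE_NET_NEGATIVE, which that index does not include. The shortened SELECT list and IN list are for illustration only.

```sql
-- Experiment, not a fix: ask the optimizer to use the nonclustered index on B.
-- Assumption: this is the index that the dev plan pairs with the Key Lookup.
SELECT
    B.CELL_VALUE_NET, B.CELL_VALUE_NET_NEGATIVE,
    B.LOG_ID, B.READING_DATE, B.CELL_NO
FROM REL.TEMP_CHASSI_LAST_LOAD AS A
JOIN REL.MACHINE_READING_CELL AS B
    WITH (NOLOCK, INDEX([REL.MACHINE_READING_CELL_NCI_CHASSI_ID_CELL_VALUE_CELL_VALUE_NET]))
    ON A.CHASSI_ID = B.CHASSI_ID
   AND B.ROW_CREATION_DATE BETWEEN A.MIN_ROW_CREATION_DATE AND A.MAX_ROW_CREATION_DATE
WHERE A.CHASSI_ID IN ('A30F012437', 'A30F012546');
```

If the hinted plan is fast on prod, the difference between the servers is more likely in statistics or row estimates than in the index definitions, so comparing the estimated row counts in the two plans would be a reasonable next step.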

Related

SQL Server Bad query performance: Bad Rows Estimate, but Statistic is accurate?

Trying to determine why a particular query is notably slow. This query shares features with other notably slow queries, so speeding this one up will give me the tools to speed other queries up.
It's a bit of a complicated query, with 5 INNER JOINs, but the first part that seems odd to me is that my first join is underestimating the memory grant, which is causing a spill during the Hash Match.
The first part of the query is estimating the number of rows returned for the query as 98,706, which is off by a factor of 10 (the true number of rows for that query is 874,215.)
I ran sp_BlitzIndex on the IRItemAnswer_Info table, and confirmed that for the index being used by the query (IRItemAnswerInfo_DGItemID_AnswerBoolean) and the parameter in question (Leading Column Name DGItemID = 1907) the statistic is correct (874,215 rows). This is because we run an UPDATE STATISTICS IRItemAnswer_Info WITH FULLSCAN weekly during off-hours, to ensure that statistics are as correct as can reasonably be.
What else can I look to as the possible cause of the bad rows estimate? The Index exactly covers the query, so I'm confused why the estimate is off, causing an insufficient memory grant?
IRItemAnswer_Info does have one significantly large column: AnswerValue, designed to hold rich-text content. The other columns are much more reasonably sized.
CREATE TABLE [dbo].[IRItemAnswer_Info](
[ItemAnswerSID] [dbo].[T_SidDom] IDENTITY(1,1) NOT NULL,
[DGItemID] [dbo].[T_SidDom] NULL,
[IncidentID] [dbo].[T_SidDom] NULL,
[IRPhaseID] [dbo].[T_SidDom] NULL,
[AnswerTypeID] [dbo].[T_SidDom] NULL,
[AnswerSourceID] [dbo].[T_SidDom] NULL,
[AnswerValue] [varchar](8000) NULL,
[AnswerCode] [dbo].[T_CodeDom] NULL,
[AnswerLabel] [dbo].[T_LabelDom] NULL,
[IndividualReviewFlag] [dbo].[T_BooleanDom] NULL,
[ModifiedDate] [datetime] NULL,
[ModifiedBy] [dbo].[T_UseridDom] NULL,
[AnswerBoolean] [dbo].[T_BooleanDom] NULL,
[CallingItemID] [dbo].[T_SidDom] NULL,
[ClearOnDeident] [dbo].[T_BooleanDom] NULL,
[Deidentified] [dbo].[T_BooleanDom] NULL,
[Answer_AdditionalInfo] [dbo].[T_DescDom] NULL,
[SubRowPosition] [tinyint] NULL
)
The Actual Execution Plan for the full query that I'm troubleshooting
DBCC SHOW_STATISTICS (IRItemAnswer_Info,IRItemAnswerInfo_DGItemID_AnswerBoolean) WITH DENSITY_VECTOR
returns the following:
It's a large query with lots of indexes, so I'll do my best here:
CREATE NONCLUSTERED INDEX [IRItemAnswerInfo_DGItemID_AnswerBoolean] ON [dbo].[IRItemAnswer_Info]
(
[DGItemID] ASC,
[AnswerBoolean] ASC
)
INCLUDE([IncidentID],[AnswerSourceID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [Incident_Info_Date_IDPhaseIDTimeFacDeptCreated] ON [dbo].[Incident_Info]
(
[IncidentDate] ASC
)
INCLUDE([IncidentSid],[IRPhaseId],[IncidentTime],[FacilityId],[DepartmentId],[CreatedByUserID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [indIncidentType_XRef_IncidentTypeNodeID] ON [dbo].[IncidentType_XRef]
(
[IncidentTypeNodeID] ASC
)
INCLUDE([IncidentID]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [indIRAlternative_Info_AltSID] ON [dbo].[IRAlternative_Info]
(
[AltSID] ASC
)
INCLUDE([AltLabel]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [indIncidentTypeHierarchy_Code_NodeLeftNodeRightNodeLevel] ON [dbo].[IncidentTypeHierarchy_Code]
(
[NodeLeft] ASC,
[NodeRight] ASC,
[NodeLevel] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
ALTER TABLE [dbo].[IncidentTypeHierarchy_Code] ADD CONSTRAINT [PK_IncidentTypeHierarchy_Code] PRIMARY KEY CLUSTERED
(
[IncidentTypeNodeSID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
ALTER TABLE [dbo].[IRDGroupItem_Info] ADD CONSTRAINT [PK_IRDGroupItem_Info] PRIMARY KEY CLUSTERED
(
[DGItemSID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
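One thing that can be checked from the details above is whether the histogram step for DGItemID = 1907 really holds about 874,215 rows. If it does but the plan still estimates far fewer, one possibility is that the value is not known at compile time, for example when it comes from a local variable, so the estimate comes from the density vector instead of the histogram. The sketch below only reads statistics metadata. It assumes the table is in the dbo schema, as in the DDL above, and that sys.dm_db_stats_histogram is available on this version.

```sql
-- When the statistics were last updated, and how many rows were sampled.
DBCC SHOW_STATISTICS ('dbo.IRItemAnswer_Info', IRItemAnswerInfo_DGItemID_AnswerBoolean)
    WITH STAT_HEADER;

-- Histogram steps on the leading column (DGItemID).
-- Look for the step whose range_high_key is 1907 and compare equal_rows
-- with the actual row count from the execution plan.
SELECT
    s.name AS stats_name,
    h.step_number,
    h.range_high_key,
    h.range_rows,
    h.equal_rows,
    h.distinct_range_rows,
    h.average_range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID(N'dbo.IRItemAnswer_Info')
  AND s.name = N'IRItemAnswerInfo_DGItemID_AnswerBoolean'
ORDER BY h.step_number;
```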

Delete all indexes of a table and rebuild them

I need to delete all indexes (other than clustered index) of a table and rebuild them all.
For example, I have the following structure:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[SP6](
[SP6_REGCLI] [char](10) NOT NULL,
[SP6_REGEMP] [char](4) NOT NULL,
[SP6_FILIAL] [char](2) NOT NULL,
[SP6_CODDEP] [char](9) NOT NULL,
[SP6_DESCRI] [char](40) NULL,
[SP6_FOLGA] [char](1) NULL,
[SP6_HESOAU] [char](1) NULL,
[SR_RECNO] [numeric](15, 0) IDENTITY(1,1) NOT NULL,
CONSTRAINT [PK__SP6] PRIMARY KEY NONCLUSTERED
(
[SR_RECNO] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [SP6_SR] ON [dbo].[SP6]
(
[SR_RECNO] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
SET ANSI_PADDING ON
GO
CREATE NONCLUSTERED INDEX [SP601_1_REG] ON [dbo].[SP6]
(
[SP6_REGCLI] ASC,
[SP6_REGEMP] ASC,
[SP6_FILIAL] ASC,
[SP6_CODDEP] ASC,
[SR_RECNO] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
SET ANSI_PADDING ON
GO
CREATE NONCLUSTERED INDEX [SP602_2_REG] ON [dbo].[SP6]
(
[SP6_REGCLI] ASC,
[SP6_REGEMP] ASC,
[SP6_FILIAL] ASC,
[SP6_DESCRI] ASC,
[SR_RECNO] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
In this structure I have to delete all the NONCLUSTERED indexes (i.e. the index SP6_SR won't be deleted).
I need to do this because I have to change the size of some columns that are part of some indexes. For example, if I try to change the size of the column SP6_DESCRI to Char(50), an error occurs because the column is part of index SP602_2_REG.
I think that if I drop all nonclustered indexes on the table, change the size of the desired columns and then recreate the indexes, everything will be fine.
Can anyone help me please?
Thank you so much!
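A minimal sketch of the drop, alter, recreate sequence described above, for the SP6_DESCRI example only. It assumes that SP602_2_REG is the only index containing that column, which matches the definitions shown. The first statement only generates DROP INDEX commands for review and does not run them. It skips the clustered index and constraint-backed indexes such as PK__SP6, which would need ALTER TABLE ... DROP CONSTRAINT instead.

```sql
-- Generate (but do not execute) DROP statements for plain nonclustered indexes.
SELECT 'DROP INDEX ' + QUOTENAME(i.name) + ' ON '
       + QUOTENAME(SCHEMA_NAME(o.schema_id)) + '.' + QUOTENAME(o.name) + ';' AS drop_cmd
FROM sys.indexes AS i
JOIN sys.objects AS o ON o.object_id = i.object_id
WHERE i.object_id = OBJECT_ID(N'dbo.SP6')
  AND i.type = 2                  -- nonclustered
  AND i.is_primary_key = 0
  AND i.is_unique_constraint = 0;
GO
-- Drop the index that blocks the column change.
DROP INDEX [SP602_2_REG] ON [dbo].[SP6];
GO
-- Widen the column, keeping its original nullability.
ALTER TABLE [dbo].[SP6] ALTER COLUMN [SP6_DESCRI] [char](50) NULL;
GO
-- Recreate the index with its original definition.
CREATE NONCLUSTERED INDEX [SP602_2_REG] ON [dbo].[SP6]
(
    [SP6_REGCLI] ASC,
    [SP6_REGEMP] ASC,
    [SP6_FILIAL] ASC,
    [SP6_DESCRI] ASC,
    [SR_RECNO] ASC
) ON [PRIMARY];
GO
```

Scripting the existing index definitions before dropping anything is a sensible precaution, since the recreate step relies on having them.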

SQL Paging (Offset, Fetch) query is very slow

I don't understand what is happening here. I am querying a single table, as shown in the query below. I am only fetching the first 20 records, yet the query takes 24 seconds to complete.
Is there any way to speed up this paging query?
;WITH TempResult AS(
SELECT distinct
D.GLCompany
,D.GLAcct
,D.GLProdNum
,D.GLCostCenter
,D.FCSCompany
,D.FCSAcct
,D.FCSCostCenter
,D.JournalDetailId
,D.[EffDt]
,D.[JournalLineAmt]
,D.[JournalLineDesc]
,D.[ManagedByCd]
,D.[LegalOwnerId]
,D.[JournalLineNum]
,D.[RoundedFlagBit]
,D.[CLPreValErrCd]
,D.[GLPreValErrCd]
,D.[SuspenseErrCd]
,D.GLProfitCenter
,D.GLTradingPartner
,D.GLInternalOrder
,D.GLSubAcct
,D.GLAcctActivity
,D.GLDataSrc
,D.GLId
,D.GLProdGrp
,D.HeaderId
from MyDetail D
)
SELECT * FROM TempResult
ORDER BY TempResult.HeaderId
OFFSET 0 ROWS
FETCH NEXT 20 ROWS ONLY
OPTION(RECOMPILE)
There is a nonclustered index that contains HeaderId, as shown below:
CREATE NONCLUSTERED INDEX [FCSAcctJournalDetail_idx] ON [dbo].[MyDetail]
(
[FCSAcct] ASC,
[FCSCompany] ASC,
[JournalEntryEffDt] ASC,
[DataDt] ASC,
[HeaderId] ASC,
[JournalDetailId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Add an index on HeaderId:
CREATE NONCLUSTERED INDEX [FCSAcctJournalDetail_HeaderId_idx] ON [dbo].[MyDetail]
(
[HeaderId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
As David Browne wrote in his comment - the index you currently have is irrelevant to this query.
If the HeaderId was the first column in the index it would be relevant, but since it's not the first (and not even close to being the first), it's simply irrelevant in the context of this query.

How can I optimize this query which works with 24M row table?

I have a table with 24 million rows.
I want to run this query:
select r1.userID, r2.userID, sum(r1.rate * r2.rate) as sum
from dbo.Ratings as r1
join dbo.Ratings as r2
on r1.movieID = r2.movieID
where r1.userID <= r2.userID
group by r1.userID, r2.userID
As I tested, it took 24 hours to produce 0.02 percent of the final result.
How can I speed it up?
Here is the definition of the table:
CREATE TABLE [dbo].[Ratings](
[userID] [int] NOT NULL,
[movieID] [int] NOT NULL,
[rate] [real] NOT NULL,
PRIMARY KEY CLUSTERED
(
[userID] ASC,
[movieID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_RatingsMovies] ON [dbo].[Ratings]
(
[movieID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_RatingsUsers] ON [dbo].[Ratings]
(
[userID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
Here is the execution plan:
The workaround I suggested was to create a "reverse" index:
CREATE INDEX IX_Ratings_Reverse on Ratings(movieid, userid) include(rate);
and then force SQL Server to use it:
select r1.userID, r2.userID, sum(r1.rate * r2.rate) as sum
from dbo.Ratings as r1 join dbo.Ratings as r2
with (index(IX_Ratings_Reverse))
on r1.movieID = r2.movieID
where r1.userID <= r2.userID group by r1.userID, r2.userID
There are two things that might help.
1) Change the order of columns in your clustered index to MovieID, UserID. This would group all rows with the same MovieID together, which might change your Hash Match to a Nested Loops join and improve the performance of the JOIN.
2) Change the [IX_RatingsMovies] index to INCLUDE UserID and Rate. The more I think about it, the less likely this seems to help compared with my first suggestion, but it's possible.
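The second suggestion could be sketched as follows, rebuilding the existing index in place with DROP_EXISTING. Whether it helps this query is not established here. Note that userID is already carried in the nonclustered index as part of the clustering key, so rate is the column this actually adds.

```sql
-- Rebuild IX_RatingsMovies so it covers the self-join without touching the clustered index.
CREATE NONCLUSTERED INDEX [IX_RatingsMovies] ON [dbo].[Ratings]
(
    [movieID] ASC
)
INCLUDE ([userID], [rate])
WITH (DROP_EXISTING = ON) ON [PRIMARY];
```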

sql performance by complex queries

I have a table like below:
CREATE TABLE MetalTemprature(
idMetalTemprature int NOT NULL,
rawTime bigint NOT NULL,
metal nchar(7) NOT NULL,
color nchar(5) NOT NULL,
Temp float NOT NULL,
PRIMARY KEY CLUSTERED
(
idMetalTemprature ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
and the index below:
CREATE NONCLUSTERED INDEX NonClusteredIndex1112 ON MetalTemprature
(
rawTime DESC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
When I run this query, it takes about 0 seconds:
SELECT count(*)
FROM MetalTemprature
where rawTime < 4449449575 and rawTime > (4449449575 -10000000) and metal = 'iron';
But when I nest this query inside another SELECT, like below:
SELECT
(SELECT count(*)
FROM MetalTemprature
where rawTime < other.rawTime and rawTime > (other.rawTime -10000000) and metal = 'iron') as cnt
from other_table_only_one_row as other;
it takes about 60 seconds, even though other.rawTime is just 4449449575 and both queries return the same result. Why?
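SELECT *
from other_table_only_one_row as other
CROSS APPLY (SELECT count(*) as cnt
FROM MetalTemprature
where rawTime < other.rawTime and rawTime > (other.rawTime -10000000) and metal = 'iron') as c
Place the subquery in the FROM clause so that it executes just once. In SQL Server, a derived table that references other has to be joined with CROSS APPLY rather than a comma.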
