Slow query in SQL Server 2016 - sql-server

I have the following query that is taking more than 1 hour to run.
SELECT
RES.NUM_PROCESS,
RES.ID_SYSTEM
FROM
RESTRICTED_PRECESS RES -- 16'000 records
WHERE
RES.ID_SYSTEM <> 'CYFV'
AND RES.NUM_PROCESS NOT IN (SELECT PR.NUM_PROCESS
FROM PRECESS PR -- 8.000.000 records
WHERE PR.ID_SYSTEM = RES.ID_SYSTEM)
The indexes for the tables are already ok.
CREATE NONCLUSTERED INDEX [IX1_PROCESS] ON [dbo].[PRECESS]
(
ID_SYSTEM ASC
)
INCLUDE(NUM_PROCESS)
Here's the execution plan.
Is there any way to make this SELECT return records faster?
Thank you.

I will just go ahead and suggest what might be helpful indices here for the two tables:
CREATE INDEX idx1 ON RESTRICTED_PRECESS (ID_SYSTEM, NUM_PROCESS);
CREATE INDEX idx2 ON PRECESS (ID_SYSTEM, NUM_PROCESS);
The index on the outer table RESTRICTED_PRECESS should speed up the WHERE clause, and it also completely covers the SELECT clause. The index on the table PRECESS in the subquery should speed it up as well.
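Separately from the indexes, the NOT IN subquery can be written as NOT EXISTS, which lets both columns of idx2 be used as seek keys. This is only a sketch and not part of the answer above. It assumes NUM_PROCESS is declared NOT NULL in both tables; if NULLs are possible, NOT IN and NOT EXISTS can return different rows, so the results should be compared before switching.

```sql
-- Sketch only: NOT EXISTS form of the query from the question.
-- Assumption: NUM_PROCESS is NOT NULL in both tables; with NULLs present,
-- NOT IN and NOT EXISTS are not guaranteed to return the same rows.
SELECT
    RES.NUM_PROCESS,
    RES.ID_SYSTEM
FROM RESTRICTED_PRECESS AS RES
WHERE RES.ID_SYSTEM <> 'CYFV'
  AND NOT EXISTS (
        SELECT 1
        FROM PRECESS AS PR
        WHERE PR.ID_SYSTEM = RES.ID_SYSTEM
          AND PR.NUM_PROCESS = RES.NUM_PROCESS
      );
```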

Related

SQL Server: Performance of INNER JOIN on small table vs subquery in IN clause

Let's say I have the following two tables:
CREATE TABLE [dbo].[ActionTable]
(
[ActionID] [int] IDENTITY(1, 1) NOT FOR REPLICATION NOT NULL
,[ActionName] [varchar](80) NOT NULL
,[Description] [varchar](120) NOT NULL
,CONSTRAINT [PK_ActionTable] PRIMARY KEY CLUSTERED ([ActionID] ASC)
,CONSTRAINT [IX_ActionName] UNIQUE NONCLUSTERED ([ActionName] ASC)
)
GO
CREATE TABLE [dbo].[BigTimeSeriesTable]
(
[ID] [bigint] IDENTITY(1, 1) NOT FOR REPLICATION NOT NULL
,[TimeStamp] [datetime] NOT NULL
,[ActionID] [int] NOT NULL
,[Details] [varchar](max) NULL
,CONSTRAINT [PK_BigTimeSeriesTable] PRIMARY KEY NONCLUSTERED ([ID] ASC)
)
GO
ALTER TABLE [dbo].[BigTimeSeriesTable]
WITH CHECK ADD CONSTRAINT [FK_BigTimeSeriesTable_ActionTable] FOREIGN KEY ([ActionID]) REFERENCES [dbo].[ActionTable]([ActionID])
GO
CREATE CLUSTERED INDEX [IX_BigTimeSeriesTable] ON [dbo].[BigTimeSeriesTable] ([TimeStamp] ASC)
GO
CREATE NONCLUSTERED INDEX [IX_BigTimeSeriesTable_ActionID] ON [dbo].[BigTimeSeriesTable] ([ActionID] ASC)
GO
ActionTable has 1000 rows and BigTimeSeriesTable has millions of rows.
Now consider the following two queries:
Query A
SELECT *
FROM BigTimeSeriesTable
WHERE TimeStamp > DATEADD(DAY, -3, GETDATE())
AND ActionID IN (
SELECT ActionID
FROM ActionTable
WHERE ActionName LIKE '%action%'
)
Execution plan for query A
Query B
SELECT bts.*
FROM BigTimeSeriesTable bts
INNER JOIN ActionTable act ON act.ActionID = bts.ActionID
WHERE bts.TimeStamp > DATEADD(DAY, -3, GETDATE())
AND act.ActionName LIKE '%action%'
Execution plan for query B
Question: Why does query A have better performance than query B (sometimes 10 times better)? Shouldn't the query optimizer recognize that the two queries are exactly the same? Is there any way to provide hints that would improve the performance of the INNER JOIN?
Update: I changed the join to INNER MERGE JOIN and the performance greatly improved. See execution plan here. Interestingly when I try the merge join in the actual query I'm trying to run (which I cannot show here, confidential) it totally messes up the query optimizer and the query is super slow, not just relatively slow.
The execution plans you have supplied both have exactly the same basic strategy.
Join
There is a seek on ActionTable to find rows where ActionName starts with "generate" with a residual predicate on the ActionName LIKE '%action%'. The 7 matching rows are then used to build a hash table.
On the probe side there is a seek on TimeStamp > Scalar Operator(dateadd(day,(-3),getdate())) and matching rows are tested against the hash table to see if the rows should join.
There are two main differences which explain why the IN version executes quicker
IN
The IN version is executing in parallel. There are 4 concurrent threads working on the query execution - not just one.
Related to the parallelism this plan has a bitmap filter. It is able to use this bitmap to eliminate rows early. In the inner join plan 25,959,124 rows are passed to the probe side of the hash join, in the semi join plan the seek still reads 25.9 million rows but only 313 rows are passed out to be evaluated by the join. The remainder are eliminated early by applying the bitmap inside the seek.
It is not readily apparent why the INNER JOIN version does not execute in parallel. You could try adding the hint OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')) to see if you now get a plan which executes in parallel and contains the bitmap filter.
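As a sketch of what that looks like, here is Query B from the question with the hint appended. Whether the optimizer then chooses a parallel plan with a bitmap filter depends on the server version and settings, so the resulting plan still needs to be checked.

```sql
-- Query B from the question with the parallel-plan hint appended.
-- The hint only expresses a preference; verify the actual plan afterwards.
SELECT bts.*
FROM BigTimeSeriesTable bts
INNER JOIN ActionTable act ON act.ActionID = bts.ActionID
WHERE bts.TimeStamp > DATEADD(DAY, -3, GETDATE())
  AND act.ActionName LIKE '%action%'
OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));
```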
If you are able to change indexes then, given that the query only returns 309 rows for 7 distinct actions, you may well find that replacing IX_BigTimeSeriesTable_ActionID with a covering index with leading columns [ActionID], [TimeStamp] and then getting a nested loops plan with 7 seeks performs much better than your current queries.
CREATE NONCLUSTERED INDEX [IX_BigTimeSeriesTable_ActionID_TimeStamp]
ON [dbo].[BigTimeSeriesTable] ([ActionID], [TimeStamp])
INCLUDE ([Details], [ID])
Hopefully with that index in place your existing queries will just use it and you will see 7 seeks, each returning an average of 44 rows, to read and return only the exact 309 total required. If not you can try the below
SELECT CA.*
FROM ActionTable A
CROSS APPLY
(
SELECT *
FROM BigTimeSeriesTable B
WHERE B.ActionID = A.ActionID AND B.TimeStamp > DATEADD(DAY, -3, GETDATE())
) CA
WHERE A.ActionName LIKE '%action%'
I had some success using an index hint: WITH (INDEX(IX_BigTimeSeriesTable_ActionID))
However as the query changes, even slightly, this can totally hamstring the optimizer's ability to get the best query.
Therefore if you want to "materialize" a subquery in order to force it to execute earlier, your best bet as of February 2020 is to use a temp table.
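A minimal sketch of the temp table approach, using the tables from the question. The name #MatchingActions is hypothetical, and whether this beats the IN version has to be measured on the real data.

```sql
-- Step 1: materialize the small set of matching actions first.
-- #MatchingActions is a hypothetical temp table name.
SELECT ActionID
INTO #MatchingActions
FROM ActionTable
WHERE ActionName LIKE '%action%';

-- Step 2: join the large table against the materialized rows.
SELECT bts.*
FROM BigTimeSeriesTable bts
INNER JOIN #MatchingActions ma ON ma.ActionID = bts.ActionID
WHERE bts.TimeStamp > DATEADD(DAY, -3, GETDATE());

-- Step 3: clean up.
DROP TABLE #MatchingActions;
```

Because the temp table has its own row count and statistics, the optimizer sees how few actions matched before it plans the second statement.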
For an inner join, filtering in the JOIN condition and filtering in the WHERE clause give the same result (see "Difference between filtering queries in JOIN and WHERE?").
Here, however, the two queries are written differently:
Query A: the subquery only filters against the 1,000 rows of ActionTable.
Query B: as written, the millions of rows are joined first and the result is then filtered against the 1,000 rows.
So Query A takes less time than Query B.

Slow performance of SQL Server when adding count

I have an issue in SQL Server where I can't figure out how to solve it. I have a large product table (25m records) with a single full text search column.
Running the following query takes about 1s
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
SELECT TOP 15
[ProductID], [EAN], [BrandID], [ShopID],
[CategoryID], [DeliveryID], [ProductPrice],
[ShippingCosts]
-- ,count(ProductID) over()
FROM
product WITH (nolock)
WHERE
CONTAINS(Search, 'Samsung AND Galaxy')
To know the total of records I tried different solutions with subqueries etc., but adding count(ProductID) over() should be a good solution.
Adding the total count part to the query makes the query very slow. Now it takes about 1m30. Changing to containstable instead of contains or using freetext makes no difference.
I included the execution plan. There are some strange values (868% Table Spool?)
But repopulating the full text index and rebuilding statistics made no difference.
Does anyone have an idea how to speed up the count?
Execution plan
It's taking a long time with the count because it has to evaluate every matching row in the table to give you the count. The plan looks like it's finding one of your top 15, seeking the table with the count, then repeating. Without the count, it's just looking at the top 15 from the table after matching the criteria.
This may be faster but still slower than your select without an aggregate function.
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
;WITH cte AS (
select
[ProductID]
,[EAN]
,[BrandID]
,[ShopID]
,[CategoryID]
,[DeliveryID]
,[ProductPrice]
,[ShippingCosts]
from product with (nolock)
where
contains(Search,'Samsung AND Galaxy'))
select top 15
[ProductID]
,[EAN]
,[BrandID]
,[ShopID]
,[CategoryID]
,[DeliveryID]
,ProductPrice
,ShippingCosts
,count(ProductID) over()
FROM cte
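Another option, shown here only as a sketch and not measured against the data in the question, is to drop the window aggregate and compute the total in a statement of its own, so that the TOP 15 query keeps its fast plan. The column alias TotalMatches is a made-up name.

```sql
-- Page of results, exactly as in the question (no window aggregate).
SELECT TOP 15
    [ProductID], [EAN], [BrandID], [ShopID],
    [CategoryID], [DeliveryID], [ProductPrice],
    [ShippingCosts]
FROM product WITH (NOLOCK)
WHERE CONTAINS(Search, 'Samsung AND Galaxy');

-- Total number of matches, computed once in a separate statement.
-- TotalMatches is a hypothetical alias.
SELECT COUNT(*) AS TotalMatches
FROM product WITH (NOLOCK)
WHERE CONTAINS(Search, 'Samsung AND Galaxy');
```

The count still has to visit every full-text match, so it will not be free, but it no longer sits inside the plan that fetches the 15 rows.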

Slow Running Query - Will Indexes Help? Not sure what to do with Execution Plan

I have this slow running query below that returns 3,023 rows in SQL Server 2014 in a full minute and a half. Is there anything I can do to speed it up?
I have indexes on all the fields it's joining on. ArticleAuthor has 99 million rows and #ArticleAuthorTemp gets filled very quickly beforehand with all the IDs I need from ArticleAuthor (3,023 rows) with 0% cost of execution plan. I filled the temp table only for that purpose to limit what it's doing in the query you see here.
The execution plan for the query below is saying it's spending the most time on 2 key lookups and an index seek, each of these things at about 30%. I'm not sure how to create the needed indexes from these or if that would even help? Kind of new to index stuff. I hate to just throw indexes on everything. Even without the 2 LEFT JOINS or outer query, it's very slow so I'm thinking the real issue is with ArticleAuthor table. You'll see the indexes I have on this table below too... :)
I can provide any info you need on the execution plan if that helps.
SELECT tot.*,pu.LastName+', '+ ISNULL(pu.FirstName,'') CreatedByPerson,COALESCE(pf.updateddate,pf.CreatedDate) CreatedDatePerson
from (
SELECT CONVERT(VARCHAR(12), AA.Id) ArticleId
, 0 Citations
, AA.FullName
, AA.LastName
, AA.FirstInitial
, AA.FirstName GivenName
, AA.Affiliations
FROM ArticleAuthor AA WITH (NOLOCK)
INNER JOIN #ArticleAuthorTemp AAT ON AAT.Id = AA.Id
)tot LEFT JOIN AcademicAnalytics..pub_articlefaculty pf WITH (NOLOCK) ON tot.ArticleId = pf.SourceId
LEFT JOIN AAPortal..portal_user pu on pu.id = COALESCE(pf.updatedby,pf.CreatedBy)
Indexes:
CREATE CLUSTERED INDEX [IX_Name] ON [dbo].[ArticleAuthor]
(
[LastName] ASC,
[FirstName] ASC,
[FirstInitial] ASC
)
CREATE NONCLUSTERED INDEX [IX_ID] ON [dbo].[ArticleAuthor]
(
[Id] ASC
)
CREATE NONCLUSTERED INDEX [IX_ArticleID] ON [dbo].[ArticleAuthor]
(
[ArticleId] ASC
)
Look up the CREATE INDEX statement and read about the INCLUDE clause. Use INCLUDE to eliminate Key Lookups by adding all the columns that your query needs to return to the index.
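As a rough sketch for the ArticleAuthor query in the question: the index name IX_ID_Covering is hypothetical, and the column data types are not shown in the question, so this assumes every listed column is of a type that can be included in an index.

```sql
-- Sketch only: a covering index for the join on Id.
-- IX_ID_Covering is a hypothetical name.
-- Assumption: all included columns have types allowed in INCLUDE
-- (for example, not text, ntext or image).
CREATE NONCLUSTERED INDEX [IX_ID_Covering] ON [dbo].[ArticleAuthor]
(
    [Id] ASC
)
INCLUDE
(
    [FullName],
    [Affiliations],
    -- The three columns below are the clustered index key, so they are
    -- stored in every nonclustered index anyway; they are listed here
    -- only to make the coverage explicit.
    [LastName],
    [FirstName],
    [FirstInitial]
);
```

With every selected column available in the index, the seek on Id no longer needs a key lookup into the clustered index. Building an index on a 99 million row table takes time and space, so it is worth trying on a test copy first.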

What goes wrong when I add the Where clause?

I have a simple query:
Select Distinct BOLTYPENAME, BOLTYPE.BOLTYPE From BOLTYPE
Inner Join WORKORDER on WORKORDER.BOLTYPE=BOLTYPE.BOLTYPE
Inner Join BOLMAIN On BOLMAIN.BOLID=WORKORDER.BOLID
Where BOLMAIN.CORID=156
When I run this query without the "Where" clause, it takes 0.1 secs. But adding the where clause causes it to take 1 minute to return. All tables have relevant indexes and they have been de-fragmented. The number of rows in the three tables is:
BOLTYPE: 11 rows
BOLMAIN: 71,455 rows
WORKORDER: 197,500 rows
Here are the execution plans:
Without the Where Clause (0.1 sec):
With the Where Clause (60 sec):
Any idea as to what could be the issue?
Update: Here are the relevant Index definitions:
CREATE NONCLUSTERED INDEX [BOLIDX] ON [dbo].[WORKORDER]
([BOLID] ASC)
GO
CREATE NONCLUSTERED INDEX [CORIDX] ON [dbo].[BOLMAIN]
([CORID] ASC)
INCLUDE ([BOLID])
GO
CREATE NONCLUSTERED INDEX [BOLTYPEIDX] ON [dbo].[WORKORDER]
([BOLTYPE] ASC)
GO
Recreate the CORIDX index with BOLID as a key column. You're joining on BOLID, so you want it to be part of the index key, not just one of the included columns.
In other words:
CREATE NONCLUSTERED INDEX [CORIDX] ON [dbo].[BOLMAIN]
([CORID] ASC, [BOLID] ASC)
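Since an index named CORIDX already exists on BOLMAIN, a plain CREATE INDEX with the same name would fail. One way to replace it in a single statement, sketched here, is the DROP_EXISTING option:

```sql
-- Replace the existing CORIDX definition in one step.
-- DROP_EXISTING = ON requires that an index with this name already exists.
CREATE NONCLUSTERED INDEX [CORIDX] ON [dbo].[BOLMAIN]
([CORID] ASC, [BOLID] ASC)
WITH (DROP_EXISTING = ON);
```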

Workarounds for massive performance penalty for DISTINCT on SQL Server?

When I send the following query to our db, it returns 4636 rows in < 2 seconds:
select
company3_.Un_ID as col_0_0_
from
MNT_Equipments equip
inner join
DynamicProperties dprops
on equip.propertiesId=dprops.id
inner join
DynamicPropertiesValue dvalues
on dprops.id=dvalues.dynamicPropertiesId
inner join
Companies company3_
on dvalues.companyId=company3_.Un_ID
where
equip.discriminator='9000'
and equip.active=1
and dvalues.propertyName='Eigentuemer'
But when I add DISTINCT to the SELECT clause, it takes almost 4.5 minutes to return the remaining 40 entries. This seems out of proportion. What can I do to improve this, work around it, or at least find out what exactly is happening here?
Execution plans
No Distinct
With Distinct
Your help is very much appreciated!
The clustered index scans indicate that there are no good indexes on the queried tables.
If you create the following indexes the execution times should improve.
CREATE NONCLUSTERED INDEX [IX_MNT_Equipments_Active] ON [MNT_Equipments]
(
[propertiesId] ASC,
[discriminator] ASC,
[active] ASC
)
GO
CREATE NONCLUSTERED INDEX [IX_DynamicPropertiesValue_Name] ON [DynamicPropertiesValue]
(
[propertyName] ASC
)
GO
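As a possible workaround on the query side, the DISTINCT can be avoided by starting from Companies and testing for a matching row with EXISTS. This is a sketch that is not part of the answer above, and it assumes Un_ID is unique and not NULL in Companies; otherwise the result may differ from the DISTINCT version.

```sql
-- Sketch only: semi-join form that returns each company at most once.
-- Assumption: Un_ID is unique and NOT NULL in Companies.
SELECT company3_.Un_ID AS col_0_0_
FROM Companies company3_
WHERE EXISTS (
    SELECT 1
    FROM DynamicPropertiesValue dvalues
    INNER JOIN DynamicProperties dprops
        ON dprops.id = dvalues.dynamicPropertiesId
    INNER JOIN MNT_Equipments equip
        ON equip.propertiesId = dprops.id
    WHERE dvalues.companyId = company3_.Un_ID
      AND dvalues.propertyName = 'Eigentuemer'
      AND equip.discriminator = '9000'
      AND equip.active = 1
);
```

Because EXISTS can stop at the first matching row per company, no duplicate rows are produced and nothing has to be de-duplicated afterwards.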

Resources