Query optimisation in SQL Server

The following query takes a long time to execute. Each table has close to 50 million to 100 million rows. Is there anything I can do on the query side to optimise it? It is a source table with huge data movements (inserts in the millions every hour), so it is better off without indexes. I don't have the access required to get an execution plan from this server.
DECLARE @var INT
SELECT TOP 1 @var = s_type
FROM s_shot
WHERE s_type < 10000
ORDER BY s_type DESC;
WITH cte AS
(
SELECT
sps_id,
SUM(xin) xin,
SUM(xout) xout
FROM
s_lock
GROUP BY
sps_id
)
SELECT
A.[s_id], a.acc_id,
B.[o_type],
B.[o_id], B.[sec_id], B.[c_id],
B.[style_type_id], [s_type],
A.[end_date],
B.[mv],
b.xin + ISNULL(c.xin, 0) [xin],
b.xout + ISNULL(c.[xout], 0) xout,
b.accr_in + b.accr_inter [acc],
b.accr_in, b.accr_inter,
b.units
FROM
s_shot a WITH (NOLOCK)
JOIN
s_pos3 b WITH (NOLOCK) ON a.s_id = b.s_id
JOIN
cte c ON b.sps_id = c.sps_id
WHERE
b.is_sl = 1
AND a.end_date > DATEADD(mm, -24, GETDATE()) -- 24 months
As you can see, I want to fetch the data for the last 24 months. Is there any optimisation possible in this query that would bring its execution time down to manageable levels?
The create table scripts are provided below.
CREATE TABLE [dbo].[s_shot]
(
[s_id] [int] IDENTITY(1,1) NOT NULL,
[acc_id] [int] NULL,
[s_type] [int] NULL,
[end_date] [datetime] NULL,
[recon] [tinyint] NULL,
[is_pers] [tinyint] NOT NULL,
CONSTRAINT [PK_s_shot_1] PRIMARY KEY NONCLUSTERED
(
[s_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[s_lock]
(
[sps_id] [int] NOT NULL,
[units] [decimal](18, 5) NULL,
[accr_in] [decimal](18, 2) NULL,
[xin] [decimal](18, 2) NULL,
[xout] [decimal](18, 2) NULL,
[lock_date] [date] NOT NULL,
[accr_inter] [decimal](18, 2) NULL,
[mv] [decimal](18, 2) NULL,
CONSTRAINT [PK_s_lock] PRIMARY KEY CLUSTERED
(
[sps_id] ASC,
[lock_date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[s_pos3]
(
[sps_id] [int] IDENTITY(1,1) NOT NULL,
[s_id] [int] NOT NULL,
[o_type] [tinyint] NOT NULL,
[o_id] [int] NOT NULL,
[sec_id] [int] NOT NULL,
[c_id] [int] NOT NULL,
[s_type_id] [int] NOT NULL,
[units] [decimal](18, 5) NULL,
[accr_income] [decimal](18, 2) NULL,
[distr] [decimal](18, 2) NULL,
[mv] [decimal](18, 2) NULL,
[perf_stat] [smallint] NULL,
[is_slv] [bit] NULL,
[accr_inter] [decimal](18, 2) NULL,
[perf_mtd] [decimal](18, 10) NULL,
[xin] [decimal](18, 2) NULL,
[xout] [decimal](18, 2) NULL,
[a_s_type_id] [int] NULL,
CONSTRAINT [PK_s_pos3] PRIMARY KEY NONCLUSTERED
(
[sps_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
Any help is appreciated.
Edit: Updated indexing details

Related

How to speed up query from 2 tables with 400 million rows each

I have 2 tables, trans_details_sell and trans_details_buy, with 400 million rows each. The two tables are linked by a unique column called cdr_id.
A third table, deal_master, is the master table and has only about 300 records.
My query uses deal_master to link to trans_details_sell to get the revenue, and at the same time uses trans_details_sell to link to trans_details_buy (via cdr_id) to get the cost.
deal_master joins to trans_details_sell on its lcr_zone and customer_interconnect columns, which match lcr_zone and sig_netgroup in trans_details_sell.
trans_details_sell has clustered primary key (lcr_zone, sig_netgroup, cdr_id)
trans_details_buy has clustered primary key (lcr_zone, sig_netgroup, cdr_id)
Both tables have the same structure, one for sell records and one for buy records, and both have a non-clustered unique index on cdr_id.
My main query is fast enough when it involves only deal_master and trans_details_sell (to get the revenue), but it becomes extremely slow once trans_details_buy is added to get the cost.
My SQL looks like this:
SELECT
m.agreement_no, m.status, m.sales_person, m.swap_carrier,
m.swap_commitment, m.zone, m.lcr_zone, m.customer_interconnect,
SUBSTRING(CAST(m.start_pos AS nvarchar), 1, 4) + '-' +
SUBSTRING(CAST(m.start_pos AS nvarchar), 5, 2) + '-' +
SUBSTRING(CAST(m.start_pos AS nvarchar), 7, 2) start_date,
SUBSTRING(CAST(m.end_pos AS nvarchar), 1, 4) + '-' +
SUBSTRING(CAST(m.end_pos AS nvarchar), 5, 2) + '-' +
SUBSTRING(CAST(m.end_pos AS nvarchar), 7, 2) end_date,
m.target_minutes, m.target_sell_rate, m.target_buy_rate,
m.target_sales, m.target_cost, m.target_profit,
SUM(s.quantized_duration) / 60 DG_minute,
SUM(s.charge) DG_sales, SUM(b.charge) DG_cost
FROM
deal_master m, trans_details_sell s, trans_details_buy b
WHERE
m.lcr_zone = s.lcr_zone
AND m.customer_interconnect = s.sig_netgroup
AND m.swap_commitment = 'Sell'
AND s.cdr_id = b.cdr_id
AND s.start_position BETWEEN m.start_pos AND m.end_pos
GROUP BY
m.agreement_no, m.status, m.sales_person, m.swap_carrier,
m.swap_commitment, m.zone, m.lcr_zone, m.customer_interconnect,
m.start_pos, m.end_pos, m.target_minutes, m.target_sell_rate,
m.target_buy_rate, m.target_sales, m.target_cost, m.target_profit
ORDER BY
1
deal_master :
CREATE TABLE [dbo].[deal_master]
(
[agreement_no] [nchar](10) NOT NULL,
[status] [nvarchar](20) NOT NULL,
[sales_person] [nvarchar](50) NOT NULL,
[swap_carrier] [nvarchar](100) NOT NULL,
[start_pos] [numeric](18, 0) NOT NULL,
[end_pos] [numeric](18, 0) NOT NULL,
[swap_commitment] [nvarchar](10) NOT NULL,
[zone] [nvarchar](200) NOT NULL,
[target_minutes] [numeric](10, 0) NULL,
[target_sell_rate] [decimal](13, 11) NULL,
[target_buy_rate] [decimal](13, 11) NULL,
[supplier_interconnect] [nvarchar](200) NOT NULL,
[customer_interconnect] [nvarchar](200) NOT NULL,
[target_sales] [numeric](10, 2) NULL,
[target_cost] [numeric](10, 2) NULL,
[target_profit] [numeric](10, 2) NULL,
[partner] [nvarchar](50) NULL,
[lcr_zone] [nvarchar](100) NOT NULL,
CONSTRAINT [pk_deal_master] PRIMARY KEY CLUSTERED
(
[lcr_zone] ASC,
[customer_interconnect] ASC,
[supplier_interconnect] ASC,
[start_pos] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
trans_details_sell:
CREATE TABLE [dbo].[trans_details_sell]
(
[cdr_id] [nchar](12) NOT NULL,
[rate] [nvarchar](50) NOT NULL,
[zone] [nvarchar](50) NOT NULL,
[charge] [decimal](13, 11) NOT NULL,
[quantized_duration] [numeric](8, 0) NOT NULL,
[sig_carrier_group] [nvarchar](50) NOT NULL,
[sig_netgroup] [nvarchar](50) NOT NULL,
[lcr] [nvarchar](100) NOT NULL,
[lcr_zone] [nvarchar](50) NOT NULL,
[per_min_chg] [decimal](13, 11) NOT NULL,
[trans_type] [nvarchar](10) NOT NULL,
[start_position] [numeric](18, 0) NOT NULL,
[end_position] [numeric](18, 0) NOT NULL,
[filename] [nvarchar](50) NOT NULL,
CONSTRAINT [pk_trans_details_sell] PRIMARY KEY CLUSTERED
(
[lcr_zone] ASC,
[sig_netgroup] ASC,
[cdr_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
trans_details_buy:
CREATE TABLE [dbo].[trans_details_buy]
(
[cdr_id] [nchar](12) NOT NULL,
[rate] [nvarchar](50) NOT NULL,
[zone] [nvarchar](50) NOT NULL,
[charge] [decimal](13, 11) NOT NULL,
[quantized_duration] [numeric](8, 0) NOT NULL,
[sig_carrier_group] [nvarchar](50) NOT NULL,
[sig_netgroup] [nvarchar](50) NOT NULL,
[lcr] [nvarchar](100) NOT NULL,
[lcr_zone] [nvarchar](50) NOT NULL,
[per_min_chg] [decimal](13, 11) NOT NULL,
[trans_type] [nvarchar](10) NOT NULL,
[start_position] [numeric](18, 0) NOT NULL,
[end_position] [numeric](18, 0) NOT NULL,
[filename] [nvarchar](50) NOT NULL,
CONSTRAINT [pk_trans_details_buy] PRIMARY KEY CLUSTERED
(
[lcr_zone] ASC,
[sig_netgroup] ASC,
[cdr_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
The query only needs charge from trans_details_buy, so add it as an included column on the unique index on cdr_id:
create unique index ix_cdr_id
on trans_details_buy(cdr_id)
include (charge)
Otherwise the index on cdr_id is only used to look up the row locator (lcr_zone, sig_netgroup, cdr_id), which is then used to seek the clustered index to find the charge.
Or, if _buy and _sell always have the same lcr_zone and sig_netgroup for each cdr_id, join trans_details_buy on all three columns to bypass the index on cdr_id.
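As a rough sketch of that second option, the join below matches trans_details_buy on the full clustered key. It assumes that a given cdr_id always carries the same lcr_zone and sig_netgroup in both tables, which would need to be verified against the data. The column list is shortened for illustration; the remaining deal_master columns would go back into both the SELECT and the GROUP BY.

```sql
-- Sketch only: valid only if lcr_zone and sig_netgroup are identical
-- for the same cdr_id in both tables (an unverified assumption).
SELECT
    m.agreement_no,
    SUM(s.quantized_duration) / 60 AS DG_minute,
    SUM(s.charge)                  AS DG_sales,
    SUM(b.charge)                  AS DG_cost
FROM deal_master AS m
JOIN trans_details_sell AS s
    ON  s.lcr_zone     = m.lcr_zone
    AND s.sig_netgroup = m.customer_interconnect
    AND s.start_position BETWEEN m.start_pos AND m.end_pos
JOIN trans_details_buy AS b
    -- All three clustered-key columns, so the clustered index can be
    -- used for the join instead of the non-clustered index on cdr_id.
    ON  b.lcr_zone     = s.lcr_zone
    AND b.sig_netgroup = s.sig_netgroup
    AND b.cdr_id       = s.cdr_id
WHERE m.swap_commitment = 'Sell'
GROUP BY m.agreement_no
ORDER BY m.agreement_no;
```

Whether this beats the covering index depends on the plan the optimizer picks, so both variants are worth timing.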

Getting the first occurrence of rows

I have a table created by the following t-sql statement:
CREATE TABLE [Core].[PriceHistory](
[PriceHistoryId] [bigint] IDENTITY(1,1) NOT NULL,
[SourceId] [tinyint] NOT NULL,
[SymbolId] [smallint] NOT NULL,
[Ask] [real] NOT NULL,
[Bid] [real] NOT NULL,
[TickTime] [bigint] NOT NULL,
[ModifiedDate] [datetime2](3) NOT NULL,
[Direction] [tinyint] NULL,
CONSTRAINT [PK_PriceHistory] PRIMARY KEY CLUSTERED
(
[PriceHistoryId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Let's say I have a list of SymbolIds, for example (1, 2, 3).
I want to get the first row for each SymbolId where ModifiedDate is later than '2016-04-01 00:00:00'.
SELECT *
FROM (
SELECT DENSE_RANK() OVER (
PARTITION BY SymbolId ORDER BY ModifiedDate
) RNK
,*
FROM PriceHistory
WHERE ModifiedDate > '2014-04-01 00:00:00'
) T
WHERE RNK = 1
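DENSE_RANK returns every row that ties on the earliest ModifiedDate, and the query above does not restrict the SymbolId list. A possible variant, sketched here under the assumption that exactly one row per symbol is wanted and that PriceHistoryId is an acceptable tie-breaker, uses ROW_NUMBER instead. The date literal follows the question text ('2016-04-01'), not the query above.

```sql
-- Sketch: one row per SymbolId, the earliest after the cut-off date.
SELECT T.*
FROM (
    SELECT ph.*,
           ROW_NUMBER() OVER (
               PARTITION BY ph.SymbolId
               ORDER BY ph.ModifiedDate, ph.PriceHistoryId  -- tie-breaker is an assumption
           ) AS rn
    FROM Core.PriceHistory AS ph
    WHERE ph.SymbolId IN (1, 2, 3)                 -- example list from the question
      AND ph.ModifiedDate > '2016-04-01 00:00:00'
) AS T
WHERE T.rn = 1;
```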

SQL Server DELETE performance issues

I have a table with approximately 1 million rows. Part of our maintenance involves deleting old rows each day, but this is taking about 40 minutes.
The delete statement is:
DELETE
FROM [dbGlobalPricingMatrix].[dbo].[tblPricing]
WHERE type = 'car'
AND capid NOT IN
(SELECT cder_id FROM PUB_CAR.dbo.CapDer WHERE cder_discontinued
IS NULL OR DATEDIFF(dd,cder_discontinued,GETDATE()) <= 7)
AND source = @source
Is there anything I can do to improve the performance?
Thanks
As Requested:
CREATE TABLE [dbo].[tblPricing](
[id] [int] IDENTITY(1,1) NOT NULL,
[type] [varchar](50) NULL,
[capid] [int] NULL,
[source] [varchar](50) NULL,
[product] [varchar](50) NULL,
[term] [int] NULL,
[milespa] [int] NULL,
[maintained] [bit] NULL,
[price] [money] NULL,
[created] [datetime] NULL,
[updated] [datetime] NULL,
[notes] [varchar](1000) NULL,
[painttype] [char](1) NULL,
[activeflag] [bit] NULL,
[DealerId] [int] NULL,
[FunderId] [int] NULL,
[IsSpecial] [bit] NULL,
[username] [varchar](50) NULL,
[expiry] [datetime] NULL,
CONSTRAINT [PK_tblPricing] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[CAPDer](
[cder_ID] [int] NOT NULL,
[cder_capcode] [char](20) NULL,
[cder_mancode] [int] NULL,
[cder_rancode] [int] NULL,
[cder_modcode] [int] NULL,
[cder_trimcode] [int] NULL,
[cder_name] [varchar](50) NULL,
[cder_introduced] [datetime] NULL,
[cder_discontinued] [datetime] NULL,
[cder_orderno] [int] NULL,
[cder_vehiclesector] [tinyint] NULL,
[cder_doors] [tinyint] NULL,
[cder_drivetrain] [char](1) NULL,
[cder_fueldelivery] [char](1) NULL,
[cder_transmission] [char](1) NULL,
[cder_fueltype] [char](1) NULL,
CONSTRAINT [PK_CapDer] PRIMARY KEY CLUSTERED
(
[cder_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
Okay, as I suspected, you do not seem to have indexes on the fields that you reference in your DELETE query. So, add indexes for type, capid, cder_discontinued, and source.
Additionally, you might want to try AND capid IN (SELECT cder_id FROM PUB_CAR.dbo.CapDer WHERE cder_discontinued IS NOT NULL AND DATEDIFF(dd,cder_discontinued,GETDATE()) > 7). The SQL Server optimizer should actually be doing this for you, but you never know, so it is worth trying. Note that this form is only equivalent to the NOT IN version if every capid in tblPricing also exists in CapDer.
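Another variant that may be worth testing is NOT EXISTS with the date arithmetic moved out of the column expression, so that an index on cder_discontinued could be used. This is a sketch, not a measured fix: the @source value is a placeholder, the date type needs SQL Server 2008 or later, and the capid IS NOT NULL predicate is added so that rows with a NULL capid are left alone, as they are with NOT IN.

```sql
DECLARE @source varchar(50) = 'example';   -- hypothetical value
-- DATEDIFF(dd, x, GETDATE()) <= 7 is the same as x >= midnight seven days ago.
DECLARE @cutoff datetime = DATEADD(dd, -7, CAST(CAST(GETDATE() AS date) AS datetime));

DELETE p
FROM [dbGlobalPricingMatrix].[dbo].[tblPricing] AS p
WHERE p.type = 'car'
  AND p.source = @source
  AND p.capid IS NOT NULL
  AND NOT EXISTS (
        SELECT 1
        FROM PUB_CAR.dbo.CapDer AS c
        WHERE c.cder_ID = p.capid
          AND (c.cder_discontinued IS NULL
               OR c.cder_discontinued >= @cutoff)
      );
```

If the delete still touches a large share of the table, running it in smaller batches is a common way to limit lock and log pressure.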

Querying 3 tables where I'm looking for non-matches

I have three tables: LitHold, LitHoldDetails and EmailTemplate. The definitions are as follows.
CREATE TABLE [dbo].[LitHold](
[LitholdID] [int] IDENTITY(1,1) NOT NULL,
[LitHoldStatusID] [tinyint] NOT NULL,
[EmailReminderID] [tinyint] NULL,
[ApprovedDate] [datetime] NULL,
[TerminatedDate] [datetime] NULL,
CONSTRAINT [PK_Lithold] PRIMARY KEY CLUSTERED
(
[LitholdID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[LitHoldDetails](
[LitHoldDetailsID] [int] IDENTITY(1,1) NOT NULL,
[LitholdID] [int] NOT NULL,
[VersionID] [int] NOT NULL,
[Description] [varchar](300) NULL,
[ResAttorneyID] [varchar](10) NOT NULL,
[Comments] [varchar](1000) NULL,
[HoldStartDate] [datetime] NULL,
[HoldEndDate] [datetime] NULL,
[CreatedDate] [datetime] NOT NULL,
[CreatedByLogin] [varchar](10) NULL,
CONSTRAINT [PK_LitholdDetails] PRIMARY KEY CLUSTERED
(
[LitHoldDetailsID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[EmailTemplate](
[TemplateID] [int] IDENTITY(1,1) NOT NULL,
[LitHoldDetailsID] [int] NOT NULL,
[From] [varchar](50) NULL,
[To] [varchar](2000) NULL,
[CC] [varchar](500) NULL,
[BCC] [varchar](500) NULL,
[Subject] [nvarchar](200) NULL,
[MessageBody] [nvarchar](max) NULL,
[SendDate] [datetime] NULL,
[IsDefault] [bit] NOT NULL,
CONSTRAINT [PK_EmailTemplate] PRIMARY KEY CLUSTERED
(
[TemplateID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
For each LitHold, there can be multiple LitHoldDetails. For each LitHoldDetail, there should be one EmailTemplate. I recently found that some LitHoldDetails do NOT have EmailTemplates. We're still working in development on this project, so this isn't a big deal. However, we want to get the EmailTemplate data into the database. The situation is that for each LitHold, there is at least one LitHoldDetail that has an EmailTemplate. I would like to duplicate this EmailTemplate data for all the LitHoldDetails that a) have the same LitHoldID and b) do not have an EmailTemplate. One of the approaches I've tried is:
insert into EmailTemplate
(LitHoldDetailsID, [From], [To], CC, BCC, Subject, MessageBody, SendDate, IsDefault)
(select (select top 1 LitHoldDetailsID from LitHoldDetails where LitholdID = d.LitholdID and LitHoldDetailsID <> e.LitHoldDetailsID), [To], CC, BCC, Subject, MessageBody, SendDate, IsDefault from
EmailTemplate e inner join LitHoldDetails d on e.LitHoldDetailsID = d.LitHoldDetailsID)
but this gets me multiple rows for some LitHoldDetails, with different EmailTemplate data, and some rows where LitHoldDetails is NULL. How can I accomplish this? I'm using SQL Server 2008.
Try inserting this:
SELECT lhd.LitHoldDetailsID, CloneEmailTemplate.[From], ...
FROM LitHoldDetails lhd
CROSS APPLY (SELECT TOP 1 et.*
FROM EmailTemplate et
JOIN LitHoldDetails lhd2 ON lhd2.LitHoldDetailsID = et.LitHoldDetailsID
WHERE lhd2.LitHoldID = lhd.LitHoldID
) AS CloneEmailTemplate
WHERE NOT EXISTS (SELECT 1
FROM EmailTemplate et2
WHERE et2.LitHoldDetailsID = lhd.LitHoldDetailsID
)
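Written out in full, the insert might look like the sketch below. The column list is taken from the EmailTemplate definition in the question. The ORDER BY et.TemplateID inside the CROSS APPLY is an addition so that TOP 1 picks a predictable template; which template should be cloned is an assumption the asker would need to confirm.

```sql
INSERT INTO EmailTemplate
    (LitHoldDetailsID, [From], [To], CC, BCC, Subject, MessageBody, SendDate, IsDefault)
SELECT
    lhd.LitHoldDetailsID,
    src.[From], src.[To], src.CC, src.BCC,
    src.Subject, src.MessageBody, src.SendDate, src.IsDefault
FROM LitHoldDetails AS lhd
CROSS APPLY (
    -- One existing template from the same LitHold (lowest TemplateID: an assumption)
    SELECT TOP 1 et.*
    FROM EmailTemplate AS et
    JOIN LitHoldDetails AS lhd2
        ON lhd2.LitHoldDetailsID = et.LitHoldDetailsID
    WHERE lhd2.LitholdID = lhd.LitholdID
    ORDER BY et.TemplateID
) AS src
WHERE NOT EXISTS (
    -- Only details rows that have no template yet
    SELECT 1
    FROM EmailTemplate AS et2
    WHERE et2.LitHoldDetailsID = lhd.LitHoldDetailsID
);
```

Running the SELECT part on its own first shows which rows would be inserted.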

Why is this join taking so long?

I have the following query that I am running on my database server but it takes about 30 seconds to run and I can't work out why this is.
SELECT *
FROM [dbo].[PackageInstance] AS packInst
INNER JOIN [dbo].[PackageDefinition] AS packageDef
ON packInst.[PackageDefinitionID] = packageDef.[PackageDefinitionID]
LEFT OUTER JOIN [dbo].[PackageInstanceContextDef] AS contextDef
ON packInst.[PackageInstanceID] = contextDef.[PackageInstanceID]
This produced the following execution plan, which looks good to me, so I can't understand why it takes so long to execute when the result is only 100,000 records (which should be a walk in the park for SQL Server).
Any ideas what could be causing this long execution time?
I have looked at the query in Profiler to see what the stats were, and they are as follows:
CPU - 4711
Reads - 744453
Writes - 9
Duration - 26329
The following are the table definitions:
CREATE TABLE [dbo].[PackageDefinition](
[PackageDefinitionID] [int] IDENTITY(1,1) NOT NULL,
[ts] [timestamp] NOT NULL,
[ProgramID] [int] NULL,
[VendorID] [int] NULL,
[PackageExecutionTypeID] [int] NULL,
[PackageDefinitionStatusID] [int] NOT NULL,
[IsInternal] [bit] NOT NULL,
[Name] [dbo].[D_Name] NOT NULL,
[Description] [dbo].[D_Description] NOT NULL,
[CreatedDate] [datetime] NOT NULL,
[PublishedDate] [datetime] NULL,
[OwnerUserGuid] [uniqueidentifier] NOT NULL,
[ProcessDefinitionMainID] [int] NULL,
[KeyInfoHtml] [nvarchar](max) NULL,
[DescriptionHtml] [nvarchar](max) NULL,
[WhatToExpectHtml] [nvarchar](max) NULL,
[BestPracticesHtml] [nvarchar](max) NULL,
[RecommendedJourneysHtml] [nvarchar](max) NULL,
[RequiresSLAAgreement] [bit] NOT NULL,
[SLAFileAssetID] [int] NULL,
[ImageDataID] [int] NULL,
[VideoHtml] [nvarchar](max) NULL,
[VideoAssetID] [int] NULL,
[UseMapCosts] [bit] NOT NULL,
[CostMin] [money] NOT NULL,
[CostMax] [money] NOT NULL,
[LandingPageVisitCount] [int] NOT NULL,
[IsDeleted] [dbo].[D_IsDeleted] NOT NULL,
[CreatedByUserGuid] [uniqueidentifier] NOT NULL,
[OrderHtml] [nvarchar](max) NULL,
CONSTRAINT [PK_PackageDefinition] PRIMARY KEY CLUSTERED
(
[PackageDefinitionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[PackageInstance](
[PackageInstanceID] [int] IDENTITY(1,1) NOT NULL,
[ts] [timestamp] NOT NULL,
[PackageDefinitionID] [int] NOT NULL,
[PackageStatusID] [int] NOT NULL,
[Name] [dbo].[D_Description] NOT NULL,
[CampaignID] [int] NULL,
[MarketingPlanID] [int] NULL,
[CountryID] [int] NULL,
[DateEntered] [datetime] NULL,
[DateExecuted] [datetime] NULL,
[ProcessID] [int] NULL,
[OrderedByUserGuid] [uniqueidentifier] NULL,
[RequestedByUserGuid] [uniqueidentifier] NULL,
[SLAEndDate] [datetime] NULL,
CONSTRAINT [PK_PackageInstance] PRIMARY KEY CLUSTERED
(
[PackageInstanceID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[PackageInstanceContextDef](
[PackageInstanceContextDefID] [int] IDENTITY(1,1) NOT NULL,
[ts] [timestamp] NOT NULL,
[PackageInstanceID] [int] NOT NULL,
[ContextObjectDefID] [int] NOT NULL,
[EnteredFieldValue] [varchar](max) NULL,
[SelectedListValueID] [int] NULL,
[AssetIdsString] [nvarchar](max) NULL,
[SelectedListValueIdsString] [nvarchar](max) NULL,
[ContextObjectFieldName] [nvarchar](30) NOT NULL,
CONSTRAINT [PK_PackageInstanceContextDef] PRIMARY KEY CLUSTERED
(
[PackageInstanceContextDefID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Remove the * in SELECT *
It will always scan because you ask for all columns. And do you have clustered indexes?
The answer turned out to be what @MartinSmith suggested. The PackageDefinition table contains about 8 NVARCHAR(MAX) columns, and because the join produces over 100k rows, the NVARCHAR(MAX) values, which are stored in out-of-row pages, were being re-read over and over. Hence the large number of logical reads.
Thanks all for your support. I just have to figure out how to make Entity Framework produce the query that I want.
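For illustration, a query that names only narrow columns avoids touching the out-of-row NVARCHAR(MAX) data at all. The columns chosen below are an arbitrary selection from the table definitions above; the ones actually needed depend on the application.

```sql
-- Sketch: explicit column list that leaves out the (MAX) columns.
SELECT
    packInst.PackageInstanceID,
    packInst.Name            AS PackageInstanceName,
    packInst.PackageStatusID,
    packageDef.PackageDefinitionID,
    packageDef.Name          AS PackageDefinitionName,
    contextDef.ContextObjectDefID,
    contextDef.ContextObjectFieldName,
    contextDef.SelectedListValueID
FROM [dbo].[PackageInstance] AS packInst
INNER JOIN [dbo].[PackageDefinition] AS packageDef
    ON packInst.[PackageDefinitionID] = packageDef.[PackageDefinitionID]
LEFT OUTER JOIN [dbo].[PackageInstanceContextDef] AS contextDef
    ON packInst.[PackageInstanceID] = contextDef.[PackageInstanceID];
```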
What happens if you add the following index...
CREATE NONCLUSTERED INDEX ix ON PackageDefinition(PackageDefinitionID)
...and try the following to reduce the width of the data going into the sort?
SELECT packInst.*,
packageDef2.*,
contextDef.*
FROM [dbo].[PackageInstance] AS packInst
INNER MERGE JOIN [dbo].[PackageDefinition] AS packageDef
ON packInst.[PackageDefinitionID] = packageDef.[PackageDefinitionID]
LEFT OUTER MERGE JOIN [dbo].[PackageInstanceContextDef] AS contextDef
ON packInst.[PackageInstanceID] = contextDef.[PackageInstanceID]
INNER MERGE JOIN [dbo].[PackageDefinition] AS packageDef2
ON packageDef.[PackageDefinitionID] = packageDef2.[PackageDefinitionID]
Of course, * should not be used: even if you need all columns, you definitely won't need the same columns twice as the result of the JOIN. It is kept here only to maintain the semantics of your original query.
