Clustered Index Scan vs Index Seek - sql-server

Will including more columns in the clustered index for a table increase the chance of turning index scans into index seeks? I'm trying to improve the responsiveness of operations on a heavily read table. It has 199 columns and currently 65k rows. Is it even worth it?
Here's the query:
SELECT T1.SALESID,T1.SALESNAME,T1.RESERVATION,T1.CUSTACCOUNT,T1.INVOICEACCOUNT,T1.DELIVERYDATE,T1.URL,T1.PURCHORDERFORMNUM,T1.SALESGROUP,T1.FREIGHTSLIPTYPE,T1.DOCUMENTSTATUS,T1.INTERCOMPANYORIGINALSALESID,T1.CURRENCYCODE,T1.PAYMENT,T1.CASHDISC,T1.TAXGROUP,T1.LINEDISC,T1.CUSTGROUP,T1.DISCPERCENT,T1.INTERCOMPANYORIGINALCUSTACCOUNT,T1.PRICEGROUPID,T1.MULTILINEDISC,T1.ENDDISC,T1.CUSTOMERREF,T1.LISTCODE,T1.DLVTERM,T1.DLVMODE,T1.PURCHID,T1.SALESSTATUS,T1.MARKUPGROUP,T1.SALESTYPE,T1.SALESPOOLID,T1.POSTINGPROFILE,T1.TRANSACTIONCODE,T1.INTERCOMPANYAUTOCREATEORDERS,T1.INTERCOMPANYDIRECTDELIVERY,T1.INTERCOMPANYDIRECTDELIVERYORIG,T1.SETTLEVOUCHER,T1.INTERCOMPANYALLOWINDIRECTCREATION,T1.INTERCOMPANYALLOWINDIRECTCREATIONORIG,T1.DELIVERYNAME,T1.ONETIMECUSTOMER,T1.COVSTATUS,T1.COMMISSIONGROUP,T1.PAYMENTSCHED,T1.INTERCOMPANYORIGIN,T1.EMAIL,T1.FREIGHTZONE,T1.RETURNITEMNUM,T1.CASHDISCPERCENT,T1.CONTACTPERSONID,T1.DEADLINE,T1.PROJID,T1.INVENTLOCATIONID,T1.ADDRESSREFTABLEID,T1.VATNUM,T1.PORT,T1.INCLTAX,T1.NUMBERSEQUENCEGROUP,T1.FIXEDEXCHRATE,T1.LANGUAGEID,T1.AUTOSUMMARYMODULETYPE,T1.SALESORIGINID,T1.ESTIMATE,T1.TRANSPORT,T1.PAYMMODE,T1.PAYMSPEC,T1.FIXEDDUEDATE,T1.EXPORTREASON,T1.STATPROCID,T1.INTERCOMPANYCOMPANYID,T1.INTERCOMPANYPURCHID,T1.INTERCOMPANYORDER,T1.DLVREASON,T1.QUOTATIONID,T1.RECEIPTDATEREQUESTED,T1.RECEIPTDATECONFIRMED,T1.SHIPPINGDATEREQUESTED,T1.SHIPPINGDATECONFIRMED,T1.ITEMTAGGING,T1.CASETAGGING,T1.PALLETTAGGING,T1.ADDRESSREFRECID,T1.CUSTINVOICEID,T1.INVENTSITEID,T1.DEFAULTDIMENSION,T1.CREDITCARDCUSTREFID,T1.SHIPCARRIERACCOUNT,T1.SHIPCARRIERID,T1.SHIPCARRIERFUELSURCHARGE,T1.SHIPCARRIERBLINDSHIPMENT,T1.SHIPCARRIERDELIVERYCONTACT,T1.CREDITCARDAPPROVALAMOUNT,T1.CREDITCARDAUTHORIZATION,T1.RETURNDEADLINE,T1.RETURNREPLACEMENTID,T1.RETURNSTATUS,T1.RETURNREASONCODEID,T1.CREDITCARDAUTHORIZATIONERROR,T1.SHIPCARRIERACCOUNTCODE,T1.RETURNREPLACEMENTCREATED,T1.SHIPCARRIERDLVTYPE,T1.DELIVERYDATECONTROLTYPE,T1.SHIPCARRIEREXPEDITEDSHIPMENT,T1.SHIPCARRIERRESIDENTIAL,T1.MATCHINGAGREEMENT,T1
.SYSTEMENTRYSOURCE,T1.SYSTEMENTRYCHANGEPOLICY,T1.MANUALENTRYCHANGEPOLICY,T1.DELIVERYPOSTALADDRESS,T1.SHIPCARRIERPOSTALADDRESS,T1.SHIPCARRIERNAME,T1.WORKERSALESTAKER,T1.SOURCEDOCUMENTHEADER,T1.BANKDOCUMENTTYPE,T1.SALESUNITID,T1.SMMSALESAMOUNTTOTAL,T1.SMMCAMPAIGNID,T1.CASHDISCBASEDATE,T1.CASHDISCBASEDAYS,T1.PDSBATCHATTRIBAUTORES,T1.PDSCUSTREBATEGROUPID,T1.PDSREBATEPROGRAMTMAGROUP,T1.WORKERSALESRESPONSIBLE,T1.AA_INITIALCONDITION,T1.AA_MATERIAL,T1.AA_PONUMBER,T1.AA_REFERENCENUMBER,T1.AA_FINALCONDITION,T1.AA_TEMPLATEID,T1.AA_PRIME,T1.AA_CONTAINER,T1.AA_ENTEREDDATE,T1.AA_ENTEREDDATETZID,T1.AA_EXPEDITE,T1.AA_PROMISEDDATE,T1.AA_PROMISEDDATETZID,T1.AA_GROSSWEIGHT,T1.AA_ALLOY,T1.AA_ITAR,T1.AA_EAR,T1.AA_GSI,T1.AA_SOLUTIONTREAT,T1.AA_BASICFORMID,T1.AA_PLANNEDBY,T1.AA_CLASSIFICATION,T1.INVENTPACKINGMATERIALCODE,T1.AA_CSI,T1.AA_QTY,T1.AA_BILLINGADDRESS,T1.AA_PROCESSMASTER,T1.AA_BILLINGNAME,T1.AA_PACKAGING,T1.AA_HIDECERTMEASURE,T1.AA_INVOICEAPPROVED,T1.AA_PROSHIPMEMOPRINTED,T1.AA_CERTIFIEDDATE,T1.TRI_OPSTATUS,T1.MODIFIEDDATETIME,T1.MODIFIEDBY,T1.CREATEDDATETIME,T1.CREATEDBY,T1.RECVERSION,T1.PARTITION,T1.RECID,T2.PERSON,T2.RECVERSION,T2.RECID,T3.LOCATION,T3.ADDRESS,T3.VALIDFROM,T3.VALIDFROMTZID,T3.VALIDTO,T3.VALIDTOTZID,T3.COUNTRYREGIONID,T3.RECVERSION,T3.RECID,T4.ADDRESS,T4.COUNTRYREGIONID,T4.LOCATION,T4.VALIDFROM,T4.VALIDFROMTZID,T4.RECVERSION,T4.RECID,T5.LOCATION,T5.ADDRESS,T5.VALIDFROM,T5.VALIDFROMTZID,T5.RECVERSION,T5.RECID,T6.NAME,T6.RECID,T6.RECVERSION,T6.INSTANCERELATIONTYPE,T6.NAMESEQUENCE,T6.RECVERSION,T6.RECID,T7.DESCRIPTION,T7.LOCATIONID,T7.RECVERSION,T7.RECID,T8.DESCRIPTION,T8.LOCATIONID,T8.RECVERSION,T8.RECID,T1.AA_COMMENTS FROM SALESTABLE T1 LEFT OUTER JOIN HCMWORKER T2 ON ((T2.PARTITION=#P1) AND ((T1.WORKERSALESTAKER=T2.RECID) AND (T1.WORKERSALESTAKER=T2.RECID))) LEFT OUTER JOIN LOGISTICSPOSTALADDRESS T3 ON ((T3.PARTITION=#P2) AND ((T1.DELIVERYPOSTALADDRESS=T3.RECID) AND ((1=#P3 OR (T3.ISPRIVATE=#P4)) OR (T3.PRIVATEFORPARTY=#P5)))) LEFT OUTER JOIN 
LOGISTICSPOSTALADDRESS T4 ON ((T4.PARTITION=#P6) AND ((T1.SHIPCARRIERPOSTALADDRESS=T4.RECID) AND ((1=#P7 OR (T4.ISPRIVATE=#P8)) OR (T4.PRIVATEFORPARTY=#P9)))) LEFT OUTER JOIN LOGISTICSPOSTALADDRESS T5 ON ((T5.PARTITION=#P10) AND ((T1.AA_BILLINGADDRESS=T5.RECID) AND ((1=#P11 OR (T5.ISPRIVATE=#P12)) OR (T5.PRIVATEFORPARTY=#P13)))) LEFT OUTER JOIN DIRPARTYTABLE T6 ON ((((T6.PARTITION=#P14) AND (T6.INSTANCERELATIONTYPE IN (#P15,#P16,#P17,#P18,#P19,#P20,#P21) )) AND ((T2.PERSON=T6.RECID) AND (T2.PERSON=T6.RECID))) AND (T6.INSTANCERELATIONTYPE IN (2975) )) LEFT OUTER JOIN LOGISTICSLOCATION T7 ON ((T7.PARTITION=#P22) AND (T3.LOCATION=T7.RECID)) LEFT OUTER JOIN LOGISTICSLOCATION T8 ON ((T8.PARTITION=#P23) AND (T5.LOCATION=T8.RECID)) WHERE (((T1.PARTITION=#P24) AND (T1.DATAAREAID=#P25)) AND (( NOT ((T1.RETURNSTATUS=#P26)) AND NOT ((T1.RETURNSTATUS=#P27))) AND (((((((((((((((((((((((((T1.SALESNAME=#P28) AND (T1.SALESID=#P29)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P30)) AND (T3.VALIDFROM=#P31)) AND (T4.LOCATION IS NULL)) AND (T5.LOCATION=#P32)) AND (T5.VALIDFROM=#P33)) AND (T6.NAME IS NULL)) AND (T6.NAMESEQUENCE IS NULL)) AND (T7.LOCATIONID=#P34)) AND (T8.LOCATIONID>=#P35)) OR (((((((((((T1.SALESNAME=#P36) AND (T1.SALESID=#P37)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P38)) AND (T3.VALIDFROM=#P39)) AND (T4.LOCATION IS NULL)) AND (T5.LOCATION=#P40)) AND (T5.VALIDFROM=#P41)) AND (T6.NAME IS NULL)) AND (T6.NAMESEQUENCE IS NULL)) AND (T7.LOCATIONID>#P42))) OR ((((((((((T1.SALESNAME=#P43) AND (T1.SALESID=#P44)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P45)) AND (T3.VALIDFROM=#P46)) AND (T4.LOCATION IS NULL)) AND (T5.LOCATION=#P47)) AND (T5.VALIDFROM=#P48)) AND (T6.NAME IS NULL)) AND NOT ((T6.NAMESEQUENCE IS NULL)))) OR ((((((((((T1.SALESNAME=#P49) AND (T1.SALESID=#P50)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P51)) AND (T3.VALIDFROM=#P52)) AND (T4.LOCATION IS NULL)) AND (T5.LOCATION=#P53)) AND (T5.VALIDFROM=#P54)) AND (T6.NAME IS NULL)) AND NOT ((T6.RECID IS NULL)))) 
OR (((((((((T1.SALESNAME=#P55) AND (T1.SALESID=#P56)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P57)) AND (T3.VALIDFROM=#P58)) AND (T4.LOCATION IS NULL)) AND (T5.LOCATION=#P59)) AND (T5.VALIDFROM=#P60)) AND NOT ((T6.NAME IS NULL)))) OR ((((((((T1.SALESNAME=#P61) AND (T1.SALESID=#P62)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P63)) AND (T3.VALIDFROM=#P64)) AND (T4.LOCATION IS NULL)) AND (T5.LOCATION=#P65)) AND (T5.VALIDFROM>#P66))) OR (((((((T1.SALESNAME=#P67) AND (T1.SALESID=#P68)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P69)) AND (T3.VALIDFROM=#P70)) AND (T4.LOCATION IS NULL)) AND (T5.LOCATION>#P71))) OR (((((((T1.SALESNAME=#P72) AND (T1.SALESID=#P73)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P74)) AND (T3.VALIDFROM=#P75)) AND (T4.LOCATION IS NULL)) AND NOT ((T4.VALIDFROM IS NULL)))) OR ((((((T1.SALESNAME=#P76) AND (T1.SALESID=#P77)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P78)) AND (T3.VALIDFROM=#P79)) AND NOT ((T4.LOCATION IS NULL)))) OR (((((T1.SALESNAME=#P80) AND (T1.SALESID=#P81)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION=#P82)) AND (T3.VALIDFROM>#P83))) OR ((((T1.SALESNAME=#P84) AND (T1.SALESID=#P85)) AND (T2.PERSON IS NULL)) AND (T3.LOCATION>#P86))) OR (((T1.SALESNAME=#P87) AND (T1.SALESID=#P88)) AND NOT ((T2.PERSON IS NULL)))) OR ((T1.SALESNAME=#P89) AND (T1.SALESID>#P90))) OR (T1.SALESNAME>#P91)))) ORDER BY T1.SALESNAME,T1.SALESID,T2.PERSON,T3.LOCATION,T3.VALIDFROM,T4.LOCATION,T4.VALIDFROM,T5.LOCATION,T5.VALIDFROM,T6.NAME,T6.RECID,T6.NAMESEQUENCE,T7.LOCATIONID,T8.LOCATIONID OPTION(FAST 2)

It is not possible to pick the columns in a Clustered Index. It always includes all columns.
In case you meant key columns: There is no "chance" here. It depends on the query whether an index is a good fit or not. Without schema and query nothing can be said. Refer to pretty much any indexing tutorial to answer this yourself.

Will adding more columns to the clustered index key (which is often, but not necessarily, the primary key) possibly turn index scans into index seeks? Yes, but tuning queries by changing the clustered key can create other complications. It may be less painful to just add a nonclustered index on the columns that you want to seek on.
Can you provide the problem query?
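A minimal sketch of the nonclustered index suggested above, based only on the predicates visible in the posted query: equality filters on PARTITION and DATAAREAID, then range comparisons and an ORDER BY on SALESNAME and SALESID. The index name and the dbo schema are assumptions, and whether the optimizer actually picks a seek would have to be checked against the real execution plan.

```sql
-- Hypothetical index name; dbo schema assumed.
-- Equality columns come first, followed by the columns the query
-- compares with > and sorts by.
CREATE NONCLUSTERED INDEX IX_SALESTABLE_Partition_DataArea_Name_Id
    ON dbo.SALESTABLE ([PARTITION], DATAAREAID, SALESNAME, SALESID);
```

Because the query selects nearly every column of the table, each matching row would still need a key lookup into the clustered index. With OPTION(FAST 2) that may be acceptable, but it is worth measuring.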

Related

Join from a table(CategoryStores) of 24k records taking a longer time

select distinct StoreUID,
STOCode,
STODescription,
STODateOfBirth,
STOLevel,
RGNDescription as [Region],
RGNOrder,
STOActive,
STOAreaSqFeet,
STOTotalSqFeet,
STOModifiedDate,
STOModifiedBy,
case when
isnull(STCCompM1,'N')='Y' or
isnull(STCCompM2,'N')='Y' or
isnull(STCCompM3,'N')='Y' or
isnull(STCCompM4,'N')='Y' or
isnull(STCCompM5,'N')='Y' or
isnull(STCCompM6,'N')='Y' or
isnull(STCCompM7,'N')='Y' or
isnull(STCCompM8,'N')='Y' or
isnull(STCCompM9,'N')='Y' or
isnull(STCCompM10,'N')='Y' or
isnull(STCCompM11,'N')='Y' or
isnull(STCCompM12,'N')='Y' or
isnull(STCCompM13,'N')='Y' then 1 else 0 end as CompStore
from [Store]
cross join #CategoryStores plcat
inner join [Region] on RegionUID=STORegionUID
inner join UserStores us on us.USTStoreUID = StoreUID and us.USTUserUID=@UserUID
inner join CategoryStores cs on plcat.CategoryUID = cs.CSTCategoryUID and StoreUID = cs.CSTStoreUID
left outer join StoreComp on STCStoreUID=StoreUID and STCYear=@BudgetYear - 1
where STOActive = 1
order by STOCode
There are only two foreign key columns in CategoryStores (CSTCategoryUID bigint, CSTStoreUID bigint). The query takes 40 seconds with the join to CategoryStores and only 2 seconds without it. How can I improve the performance?
Make sure the foreign keys used in the inner join are indexed properly.
Moreover, try using CROSS APPLY to improve the performance of your query. For example:
instead of
inner join [Region] on RegionUID = STORegionUID
you can use:
CROSS APPLY
(SELECT TOP (1) RGNDescription, RGNOrder FROM [Region]
WHERE [Region].RegionUID = [Store].STORegionUID) ca
Read more about cross apply in the attached link.
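As a sketch of what "indexed properly" could look like for this join, the index below covers both columns used in the CategoryStores join condition. The index name is hypothetical and the column names are taken from the question; it assumes no equivalent index or key already exists on the table.

```sql
-- Hypothetical index name; both join columns from the question are in the key,
-- so the join can be resolved from the index alone.
CREATE NONCLUSTERED INDEX IX_CategoryStores_Category_Store
    ON CategoryStores (CSTCategoryUID, CSTStoreUID);
```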

Ridding this query of Hash Match Join?

I have the following query that takes around 26 seconds to return 8700 rows of data.
SELECT
R.ClientReferralID,
R.ClientID,
C.FirstName,
C.LastName,
C.FullName,
dbo.fnGetLocalDate(R.ReferralDate) as ReferralDate,
RT.ReferralTypeName,
R.ReferralTypeOther,
RT2.ReferredToName,
R.ReferredToOther,
R.ReferredByID,
U.FullName as ReferredBy,
TS.TimeSpentName,
R.Notes,
L.ReferralLocationID,
L.ReferralLocationName as Location,
R.ReferralLetterSentID,
R.ReferralLetterOnFileID,
dbo.fnGetLocalDate(R.DateCreated) as DateCreated,
U2.FullName as UserCreated,
dbo.fnGetLocalDate(R.DateModified) as DateModified,
U3.FullName as UserModified
FROM
ClientReferral R
INNER JOIN Client C on
R.ClientID = C.ClientID
INNER JOIN LookUp.ReferralType RT on
R.ReferralTypeID = RT.ReferralTypeID
INNER JOIN LookUp.ReferredTo RT2 on
R.ReferredToID = RT2.ReferredToID
INNER JOIN UserAccount U on
R.ReferredByID = U.UserAccountID
INNER JOIN LookUp.TimeSpent TS on
R.TimeSpentID = TS.TimeSpentID
INNER JOIN LookUp.ReferralLocation L on
R.ReferralLocationID = L.ReferralLocationID
INNER JOIN UserAccount U2 on
R.UserCreated = U2.UserAccountID
LEFT JOIN UserAccount U3 on
R.UserModified = U3.UserAccountID
WHERE
(R.ReferralDate >= @StartDate or @StartDate is null) and
(R.ReferralDate <= @EndDate or @EndDate is null)
ORDER BY
R.DateCreated DESC
The execution plan can be viewed here:
https://www.brentozar.com/pastetheplan/?id=B1A5ji7tf
I see the most costly operation is 65% on a Hash Match Join. I was expecting the following index to improve that but no:
CREATE NONCLUSTERED INDEX [Name] ON [dbo].[ClientReferral]
(
[ClientID] ASC
)
Anyone see off hand what I can do here? Please let me know if some sample data is needed.
Try adding an index on one of the following:
ClientReferral (ReferralDate)
ClientReferral (ReferralDate, ClientID)
ClientReferral (ClientID, ReferralDate)
First change
WHERE (R.ReferralDate >= @StartDate or @StartDate is null) and (R.ReferralDate <= @EndDate or @EndDate is null)
To
WHERE R.ReferralDate BETWEEN ISNULL(@StartDate,CAST(0 AS datetime)) AND ISNULL(@EndDate,CAST(999999 AS datetime))
After that create an index on
ClientReferral(ReferralDate, ClientID, ReferralTypeID, ReferredToID, ReferredByID, TimeSpentID, ReferralLocationID, UserCreated, UserModified, DateCreated, DateModified) INCLUDE(ClientReferralID, ReferralTypeOther, ReferredToOther, Notes, ReferralLetterSentID, ReferralLetterOnFileID)
Also show the code for
fnGetLocalDate
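A rough sketch of how the rewritten predicate and a narrower index might fit together. It assumes ReferralDate and both parameters are datetime (an int cannot be cast to datetime2 directly, so datetime is used for the default bounds), and the index name is hypothetical.

```sql
-- Assumed types: ReferralDate and both parameters are datetime.
DECLARE @StartDate datetime = '20180101';
DECLARE @EndDate   datetime = NULL;   -- NULL means "no upper bound"

-- Hypothetical index name; a narrower alternative to the wide index above.
CREATE NONCLUSTERED INDEX IX_ClientReferral_ReferralDate
    ON dbo.ClientReferral (ReferralDate, ClientID);

-- Each parameter appears once, on the right-hand side of a comparison
-- against the bare column, so the predicate can be used for a range seek.
SELECT R.ClientID, R.ReferralDate
FROM dbo.ClientReferral AS R
WHERE R.ReferralDate >= ISNULL(@StartDate, CAST(0 AS datetime))          -- 1900-01-01
  AND R.ReferralDate <= ISNULL(@EndDate, CAST('99991231' AS datetime));  -- latest date
```

The full query reads many more columns than this index carries, so the optimizer may still prefer a scan when the date range covers most of the table.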

How can I improve query performance that blocks other users?

I have a complex sub-select statement that slows down my query and also blocks other users.
select
(Case when (Select COUNT(*) from tblQuoteDetails QD where QD.QuoteGUID = a.QuoteGUID) > 1 then
(SELECT Round(Sum(dbo.tblQuoteOptions.Premium),2)
FROM dbo.tblQuotes AS Q
INNER JOIN dbo.lstQuoteStatus ON Q.QuoteStatusID = dbo.lstQuoteStatus.QuoteStatusID
INNER JOIN dbo.tblQuoteOptions ON Q.QuoteGUID = dbo.tblQuoteOptions.QuoteGUID
--INNER JOIN dbo.tblQuoteOptionPremiums ON dbo.tblQuoteOptionPremiums.QuoteOptionGuid = dbo.tblQuoteOptions.QuoteOptionGUID
WHERE (Q.ControlNo = a.ControlNo)
AND (Q.OriginalQuoteGUID IS NULL)
AND (dbo.tblQuoteOptions.Premium <> 0)
AND (DATEDIFF(d,ISNULL(null, dbo.GetEffectiveDate(Q.QuoteGUID)), dbo.GetEffectiveDate(Q.QuoteGUID)) <= 0))
Else
(SELECT Round(Avg(dbo.tblQuoteOptions.Premium),2)
FROM dbo.tblQuotes AS Q
INNER JOIN dbo.lstQuoteStatus ON Q.QuoteStatusID = dbo.lstQuoteStatus.QuoteStatusID
INNER JOIN dbo.tblQuoteOptions ON Q.QuoteGUID = dbo.tblQuoteOptions.QuoteGUID
--INNER JOIN dbo.tblQuoteOptionPremiums ON dbo.tblQuoteOptionPremiums.QuoteOptionGuid = dbo.tblQuoteOptions.QuoteOptionGUID
WHERE (Q.ControlNo = a.ControlNo)
AND (Q.OriginalQuoteGUID IS NULL)
AND (dbo.tblQuoteOptions.Premium <> 0)
AND (DATEDIFF(d,ISNULL(null, dbo.GetEffectiveDate(Q.QuoteGUID)), dbo.GetEffectiveDate(Q.QuoteGUID)) <= 0))
--GROUP BY dbo.tblQuoteOptions.QuoteOptionID
End) As QuotedPremium
FROM tblQuotes a
Not sure if I'm reading execution plan correctly but that's what I see:
Any idea what approach should I take here?
Thanks
Looking into the query without access to your environment won't be terribly efficient, but I can safely say that Key Lookups are expensive and can often be eliminated by ensuring that the columns you're reading from a joined table are INCLUDEd in the index being used. Considering that the two key lookups amount to almost 80% of the query cost, I'd start there.
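For illustration, a covering index of the kind described might look like the sketch below. The plan is not shown, so which table the lookups hit is a guess: this assumes one of them is against tblQuoteOptions, which the subqueries join on QuoteGUID and read Premium from. The index name is hypothetical.

```sql
-- Hypothetical index name. Keyed on the join column, with Premium carried
-- in the leaf level so the subqueries need no key lookup to read it.
CREATE NONCLUSTERED INDEX IX_tblQuoteOptions_QuoteGUID
    ON dbo.tblQuoteOptions (QuoteGUID)
    INCLUDE (Premium);
```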
Also, part of the issue is the use of DATEDIFF inside a WHERE clause.
AND (DATEDIFF(d,ISNULL(null, dbo.GetEffectiveDate(Q.QuoteGUID)), dbo.GetEffectiveDate(Q.QuoteGUID)) <= 0)
This will severely hamper the optimizer in doing its job. Simplifying this particular comparison could make a big difference.
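One way to see how far the comparison can be simplified: ISNULL(NULL, x) always returns x, so the expression reduces to DATEDIFF(d, x, x), which is 0 for any non-NULL x and NULL otherwise. The sketch below uses a local variable in place of dbo.GetEffectiveDate(Q.QuoteGUID); the datetime type is an assumption, as is the function returning the same value on both calls.

```sql
-- @eff stands in for dbo.GetEffectiveDate(Q.QuoteGUID); datetime is an assumed type.
DECLARE @eff datetime = '20200115';

SELECT DATEDIFF(d, ISNULL(NULL, @eff), @eff) AS DayDiff;  -- 0 for any non-NULL @eff

-- So the original predicate passes exactly when the date is not NULL,
-- and could be written in the subqueries as:
--   AND dbo.GetEffectiveDate(Q.QuoteGUID) IS NOT NULL
```

This still calls a scalar function once per row, so it removes the redundant work rather than the function call itself.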

FULL OUTER JOIN is not working

I need to return all vendors regardless of whether there has been a purchase from that vendor. The query is currently only returning records where the vendor had a purchase.
SELECT vendors.NAME,
Iif([fundingsourceid] = 10, [amount], 0) AS Credit,
Iif(( [fundingsourceid] = 2 )
OR ( [fundingsourceid] = 3 ), [amount], 0) AS EBT,
Iif([fundingsourceid] = 4, [amount], 0) AS [Match],
cardpurchases.updateddate
FROM vendors
FULL OUTER JOIN cardpurchases
ON cardpurchases.vendorid = vendors.vendorid
LEFT JOIN cardfundings
ON cardpurchases.cardfundingid = cardfundings.cardfundingid
INNER JOIN marketevents
ON cardpurchases.marketeventid = marketevents.marketeventid
INNER JOIN markets
ON marketevents.marketid = markets.marketid
WHERE (cardpurchases.updateddate >= '10/22/2014' OR cardpurchases.updateddate IS NULL)
AND (cardpurchases.updateddate < '10/23/2014' OR cardpurchases.updateddate IS NULL)
AND (markets.marketid = 47 OR markets.marketid IS NULL)
ORDER BY vendors.NAME
Although you have specified a FULL OUTER JOIN, later in your query you are restricting the result set based on columns in the cardpurchases table, which causes vendors that have no cardpurchases to disappear.
You can do either of the following:
WHERE
((cardpurchases.updateddate >= '10/22/2014'
AND cardpurchases.updateddate < '10/23/2014')
OR cardpurchases.updateddate IS NULL)
AND markets.marketid = 47
Or
FROM vendors
LEFT JOIN cardpurchases
ON cardpurchases.vendorid = vendors.vendorid
AND cardpurchases.updateddate >= '10/22/2014'
AND cardpurchases.updateddate < '10/23/2014'
You need to account for NULLs in your WHERE clause:
WHERE (cardpurchases.updateddate >= '10/22/2014' OR cardpurchases.updateddate IS NULL)
AND (cardpurchases.updateddate < '10/23/2014' OR cardpurchases.updateddate IS NULL)
AND (markets.marketid = 47 OR markets.marketid IS NULL)
You also should use parentheses to control the join so that the later INNER JOINs don't poison it:
FROM vendors
FULL OUTER JOIN (cardpurchases
LEFT JOIN cardfundings
ON cardpurchases.cardfundingid = cardfundings.cardfundingid
INNER JOIN marketevents
ON cardpurchases.marketeventid = marketevents.marketeventid
INNER JOIN markets
ON marketevents.marketid = markets.marketid)
ON cardpurchases.vendorid = vendors.vendorid
That's only partial, since you'll experience the same issue with the LEFT JOIN as you do with the FULL.
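Putting the two suggestions together, a sketch of the whole query could look like this: the purchase-side tables are joined inside parentheses, and every filter on them moves into the ON clause so that vendors without a matching purchase survive. The date literals are rewritten in the unambiguous yyyymmdd form, and fundingsourceid and amount are left unqualified as in the question, since it does not say which table owns them.

```sql
SELECT vendors.NAME,
       IIF(fundingsourceid = 10, amount, 0)     AS Credit,
       IIF(fundingsourceid IN (2, 3), amount, 0) AS EBT,
       IIF(fundingsourceid = 4, amount, 0)      AS [Match],
       cardpurchases.updateddate
FROM vendors
LEFT JOIN (cardpurchases
           INNER JOIN marketevents
               ON cardpurchases.marketeventid = marketevents.marketeventid
           INNER JOIN markets
               ON marketevents.marketid = markets.marketid
           LEFT JOIN cardfundings
               ON cardpurchases.cardfundingid = cardfundings.cardfundingid)
    ON cardpurchases.vendorid = vendors.vendorid
   -- Filters on the purchase side live in ON, not WHERE,
   -- so unmatched vendors are kept with NULLs.
   AND cardpurchases.updateddate >= '20141022'
   AND cardpurchases.updateddate <  '20141023'
   AND markets.marketid = 47
ORDER BY vendors.NAME;
```

For a vendor with no purchase in the range, the purchase columns are NULL, so each IIF falls through to 0 and updateddate is NULL.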

Optimize Entity Framework Generated SQL Server Execution Plan

I have a data structure that is basically a document with a dictionary of tags. I am attempting to bring back all documents of a given formtype that have a tag named 'Last Name' and a tag value of 'Smith'. There may be 0..N 'Last Name' tags associated with the document.
I am using the following linq query to try to match a source document to children with matching tags:
DB.Documents
.Where(doc => doc.FormID == pd.IndexForm.FormID)
.Where(doc => doc.Document_StringIndex_ReadOnly
.Join(Fields,
dsi => new { FieldName = dsi.FieldName, FieldValue = dsi.StringValue },
dsi2 => new { FieldName = dsi2.FieldName, FieldValue = dsi2.StringValue },
(dsi, dsi2) => dsi.Document).Count() > 0);
Which generates the following query when output using .ToTraceString()
SELECT
[Project1].*
FROM ( SELECT
[Extent1].*,
(SELECT
COUNT(cast(1 as bit)) AS [A1]
FROM [dbo].[Document_StringIndex_ReadOnly] AS [Extent2]
INNER JOIN (SELECT [Extent3].*
FROM [dbo].[Document] AS [Extent3]
INNER JOIN [dbo].[Document_StringIndex_ReadOnly] AS [Extent4] ON [Extent3].[DocumentID] = [Extent4].[DocumentID] ) AS [Join1] ON (([Extent2].[FieldName] = [Join1].[FieldName]) OR (([Extent2].[FieldName] IS NULL) AND ([Join1].[FieldName] IS NULL))) AND (([Extent2].[StringValue] = [Join1].[StringValue]) OR (([Extent2].[StringValue] IS NULL) AND ([Join1].[StringValue] IS NULL)))
LEFT OUTER JOIN [dbo].[Document] AS [Extent5] ON [Extent2].[DocumentID] = [Extent5].[DocumentID]
WHERE ([Extent1].[DocumentID] = [Extent2].[DocumentID]) AND ([Join1].[DocumentID1] = @p__linq__7) AND ([Join1].[FieldName] = @p__linq__8)) AS [C1]
FROM [dbo].[Document] AS [Extent1]
WHERE [Extent1].[FormID] = @p__linq__5
) AS [Project1]
WHERE [Project1].[C1] > 0
If I do a direct substitution of constants for my parameters (as shown below) the query executes very quickly. However, if I leave the parameters in place the query takes several minutes.
SELECT
[Project1].*
FROM ( SELECT
[Extent1].*,
(SELECT
COUNT(cast(1 as bit)) AS [A1]
FROM [dbo].[Document_StringIndex_ReadOnly] AS [Extent2]
INNER JOIN (SELECT [Extent3].*
FROM [dbo].[Document] AS [Extent3]
INNER JOIN [dbo].[Document_StringIndex_ReadOnly] AS [Extent4] ON [Extent3].[DocumentID] = [Extent4].[DocumentID] ) AS [Join1] ON (([Extent2].[FieldName] = [Join1].[FieldName]) OR (([Extent2].[FieldName] IS NULL) AND ([Join1].[FieldName] IS NULL))) AND (([Extent2].[StringValue] = [Join1].[StringValue]) OR (([Extent2].[StringValue] IS NULL) AND ([Join1].[StringValue] IS NULL)))
LEFT OUTER JOIN [dbo].[Document] AS [Extent5] ON [Extent2].[DocumentID] = [Extent5].[DocumentID]
WHERE ([Extent1].[DocumentID] = [Extent2].[DocumentID]) AND ([Join1].[DocumentID1] = 1015) AND ([Join1].[FieldName] = 'DDKey')) AS [C1]
FROM [dbo].[Document] AS [Extent1]
WHERE [Extent1].[FormID] = 22
) AS [Project1]
WHERE [Project1].[C1] > 0
After generating an execution plan, I learned that if I directly substitute the parameter values, SQL Server performs an index seek and my query is fast. As soon as I leave the parameters in place, SQL Server performs an index scan and my query times out. Is there any way to prod SQL Server to always seek? Can I force Entity Framework not to use parameterized queries?
In the generated SQL, this line
[Join1].[FieldName] = @p__linq__8
may be the problem.
If FieldName is varchar(...) and @p__linq__8 is nvarchar(...), then this clause will cause a table scan, since the parameter type doesn't match the type of the indexed column.
When you directly substitute 'DDKey', the types match, so you get an index seek. Try your query with N'DDKey' and see if you get a table scan.
This is an issue with various versions of LINQ to SQL and LINQ to Entities, but may be fixed in later releases.
One way to work around the problem, if you can't update to the latest version, would be to change FieldName to be nvarchar(...).
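The type mismatch can be reproduced directly in T-SQL without Entity Framework. This sketch assumes FieldName is an indexed varchar(50) column; the length and the variable names are illustrative, not taken from the question. Comparing the plans of the first two statements should show whether the nvarchar parameter is what turns the seek into a scan.

```sql
-- Assumption: FieldName is an indexed varchar(50) column.
DECLARE @asVarchar  varchar(50)  = 'DDKey';
DECLARE @asNvarchar nvarchar(50) = N'DDKey';

-- Parameter type matches the column: eligible for an index seek.
SELECT DocumentID
FROM dbo.Document_StringIndex_ReadOnly
WHERE FieldName = @asVarchar;

-- nvarchar has higher type precedence, so the column side is implicitly
-- converted (CONVERT_IMPLICIT in the plan), which can prevent a plain seek.
SELECT DocumentID
FROM dbo.Document_StringIndex_ReadOnly
WHERE FieldName = @asNvarchar;

-- Possible workaround when the column type cannot change: convert the
-- parameter instead, so the column is compared as-is.
SELECT DocumentID
FROM dbo.Document_StringIndex_ReadOnly
WHERE FieldName = CAST(@asNvarchar AS varchar(50));
```

Casting the parameter to varchar loses any characters that the column's code page cannot represent, which is harmless only if such values could never be stored in the column in the first place.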

Resources