Merge Join Operation in SQL Query costing 60%+ of query - sql-server

I am trying to optimize a query that looks pretty simple, but the merge join operation is taking the lion's share of the query time. I am only receiving about 1600 results per run and it's taking 25 seconds. I am trying to get the cost and time of this down by 50%. I have tried the TOP 2000 trick to narrow down the results, and I made sure every table is read via an index scan. I know that seeks would be better, but the scans account for under 1% of the query cost.
Here is the plan and SQL.
https://www.brentozar.com/pastetheplan/?id=Ske3IIOQZ
OrderApprovals_View
SQL for the view:
SELECT dbo.SVC_OrderApprovals.ApprovalID ,
dbo.SVC_OrderApprovals.OrderID ,
dbo.SVC_OrderApprovals.CLSItem ,
dbo.SVC_OrderApprovals.CustNum ,
dbo.SVC_OrderApprovals.ReqQuantity ,
dbo.SVC_OrderApprovals.ReqShipDate ,
dbo.SVC_OrderApprovals.UM ,
dbo.SVC_OrderApprovals.CSR ,
dbo.SVC_OrderApprovals.ShipLoc ,
dbo.SVC_OrderApprovals.ApprovalType ,
dbo.SVC_OrderApprovals.ApprovalGroup ,
dbo.SVC_OrderApprovals.CreatedBy ,
dbo.SVC_OrderApprovals.CreateDate ,
dbo.SVC_OrderApprovals.CreatedPC ,
dbo.SVC_OrderApprovals_SC_Notes.SC_Notes ,
dbo.SVC_OrderApprovals_Status.Status ,
CASE WHEN [Status] <> 'NEW' THEN SVC_OrderApprovals_Status.CreatedBy
ELSE NULL
END AS ReviewedBy ,
CASE WHEN [Status] <> 'NEW' THEN SVC_OrderApprovals_Status.CreateDate
ELSE NULL
END AS ReviewDate ,
CASE WHEN [Status] <> 'NEW' THEN SVC_OrderApprovals_Status.CreatedPC
ELSE NULL
END AS ReviewedPC ,
dm_Core.dbo.core_CLSItemData.ItemDescription ,
dm_Core.dbo.core_CustMaster.CustName ,
dbo.SVC_OrderApprovals.SCTeam ,
dbo.SVC_OrderApprovals_Status.CreateDate AS StatusCreateDate ,
dbo.SVC_OrderApprovals_SC_Notes.CreateDate AS SCNoteCreateDate ,
dbo.SVC_OrderApprovals_Approval_Notes.ApprovalNote ,
dbo.SVC_OrderApprovals_Approval_Notes.CreateDate AS ApprovalNoteCreateDate
FROM dbo.SVC_OrderApprovals_Current
INNER JOIN dbo.SVC_OrderApprovals ON dbo.SVC_OrderApprovals_Current.RecordID = dbo.SVC_OrderApprovals.RecordID
LEFT OUTER JOIN dbo.SVC_OrderApprovals_Approval_Notes ON dbo.SVC_OrderApprovals.ApprovalID = dbo.SVC_OrderApprovals_Approval_Notes.ApprovalID
LEFT OUTER JOIN dm_Core.dbo.core_CustMaster ON dbo.SVC_OrderApprovals.CustNum = dm_Core.dbo.core_CustMaster.CustNum
LEFT OUTER JOIN dbo.SVC_OrderApprovals_Status ON dbo.SVC_OrderApprovals.ApprovalID = dbo.SVC_OrderApprovals_Status.ApprovalID
LEFT OUTER JOIN dbo.SVC_OrderApprovals_SC_Notes ON dbo.SVC_OrderApprovals.ApprovalID = dbo.SVC_OrderApprovals_SC_Notes.ApprovalID
LEFT OUTER JOIN dm_Core.dbo.core_CLSItemData ON dbo.SVC_OrderApprovals.CLSItem = dm_Core.dbo.core_CLSItemData.CLSItem;

Related

Filter by date on view is very slow

I have an inkling of what's causing the slowdown. The issue is that I'm not sure what the correct solution is.
I have the following underlying query in my view (named vMemberListByDate):
select
mpi.membermpi
, mpi.accountnumber
, mpi.MemberKey
, membergroupkey = m.groupkey
, mpi.accountkey
, cr.CreditScore
, m.HashedSSN
, OpenDate = mpi.AccountOpenDate
, Closedate = mpi.AccountCloseDate
, Tenure = DATEDIFF(MONTH, mpi.AccountOpenDate, mpi.SnapshotDate)
, AgeIndex = db.Sort
, mpi.SnapshotDate
, ShareBalanceAmt = fmad.TotalShareBalance
, cr.TotalUnsecuredBalance
, CreditLineBalance = fmad.CreditLineLoanBalance
, MortgageBalance = fmad.FirstMortgageLoanBalance + fmad.SecondMortgageLoanBalance
, cr.RevolvingBalance
, IsActive
, fmad.LastChargeOffDate
, fmad.TotalChargeOffCnt
into
#Results
from
EDW.Global.vFactMemberMPIDaily as mpi (nolock)
inner join
EDW.Global.DimMember as m (nolock) on m.MemberKey = mpi.MemberKey
inner join
EDW.Global.FactMemberAccountDaily as fmad (nolock) on fmad.AccountKey = mpi.AccountKey
and fmad.SnapshotDate = mpi.SnapshotDate
left join
EDW.Global.DimBand as db (nolock) on isnull(datediff(year, BirthDate, mpi.SnapshotDate), -1) between db.LowValue and db.HighValue
and db.GroupDescription = 'Age 2'
left join
EDW.Global.DimCredit cr (nolock) on cr.HashedSSN = mpi.HashedSSN
and mpi.SnapshotDate between cr.StartDate and cr.EndDate
I select from the view as follows:
select count(*)
from ML.vMemberListByDate
where SnapshotDate = '11/1/2022'
It runs excruciatingly slowly, taking 26 minutes in total.
However, if I apply this date filter directly in the view's query code, it runs in 10 seconds. My assumption is that the view returns every possible row before filtering, and that this causes the slowdown.
Is there any way around this?
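One way to keep the filter inside the definition, as in the 10-second variant described above, is to wrap the query in an inline table-valued function that takes the date as a parameter. The following is only a sketch: the function name ML.fnMemberListByDate is hypothetical, the column list and joins are trimmed for brevity, and the `into #Results` clause is left out because it is not allowed inside a function. Whether the optimizer produces the fast plan would still need to be checked against the actual execution plan.

```sql
-- Hypothetical inline table-valued function; the name and the trimmed
-- column list are assumptions for illustration, not the original view.
CREATE FUNCTION ML.fnMemberListByDate (@SnapshotDate date)
RETURNS TABLE
AS
RETURN
(
    select
          mpi.membermpi
        , mpi.accountnumber
        , mpi.MemberKey
        , mpi.SnapshotDate
        , ShareBalanceAmt = fmad.TotalShareBalance
    from
        EDW.Global.vFactMemberMPIDaily as mpi
    inner join
        EDW.Global.FactMemberAccountDaily as fmad
            on  fmad.AccountKey   = mpi.AccountKey
            and fmad.SnapshotDate = mpi.SnapshotDate
    where
        mpi.SnapshotDate = @SnapshotDate   -- filter applied inside the definition
);
GO

-- 'YYYYMMDD' is read the same way under every language/DATEFORMAT setting,
-- unlike '11/1/2022'.
select count(*)
from ML.fnMemberListByDate('20221101');
```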

T-SQL multiple subqueries with JOIN and MAX(date)

I have this query with 2 LEFT JOINS each with a sub-query.
The sub-queries should select the row with the latest date PointsChangeDate__c for a specific campaign e.g: PointsTypeCode__c = '1'.
Problem is that it is choosing what appears to be just random dates.
If I run just one JOIN / sub-query, then the result is correct, but when I add the 2nd JOIN / sub-query, then the results are incorrect and it seems to be pulling random dates.
I suspect my issue is with the type of JOIN I am using, but I cannot see why, because if I LEFT JOIN, then I am including all results (a._ContactKey) pulled from the first query and carrying that all the way through.
SELECT
a._ContactKey ContactId,
a.Full_Name_1__c Full_Name,
a.MC_Phone__c Mobile,
a.POSCustomerNumber__c POS_Customer_Number,
a.POSCustomerStatus__c POSCustomerStatus,
'IL' Locale,
a.Mobile_1__c Mobile_for_text,
g.TotalPoints__c TotalPoints_SourceMethod_1,
g.ValidUntil__c ValidUntil_SourceMethod_1,
g.date_1 PointsChangeDate__c_1,
h.TotalPoints__c TotalPoints_SourceMethod_2,
h.ValidUntil__c ValidUntil_SourceMethod_2,
h.date_2 PointsChangeDate__c_2
FROM
Contact_Salesforce AS a
LEFT JOIN
(SELECT
Contact__c,
TotalPoints__c,
ValidUntil__c,
MAX(PointsChangeDate__c) date_1
FROM
Member_points__c_Salesforce
WHERE
PointsTypeCode__c = '1'
GROUP BY Contact__c , TotalPoints__c , ValidUntil__c, PointsChangeDate__c) AS g ON a._ContactKey = g.Contact__c
LEFT JOIN
(SELECT Contact__c,
TotalPoints__c,
ValidUntil__c,
MAX(PointsChangeDate__c) date_2
FROM
Member_points__c_Salesforce
WHERE
PointsTypeCode__c = '2'
GROUP BY Contact__c , TotalPoints__c , ValidUntil__c, PointsChangeDate__c) AS h ON h.Contact__c = g.Contact__c
LEFT JOIN
SMS_Unsubscribe AS c ON REPLACE(CONCAT('972',
RIGHT(RTRIM(LTRIM(a.Mobile_1__c)), 9)),
'-',
'') = c.Mobile
LEFT JOIN
Member_Segments__c_Salesforce AS b ON b.Contact__c = a._ContactKey
WHERE
a.Mobile_1__c IS NOT NULL
AND c.Mobile IS NULL
AND a.POSCustomerStatus__c = '0'
AND b.IsActive__c = 'true'
GROUP BY a._ContactKey , a.Full_Name_1__c , a.MC_Phone__c , a.POSCustomerNumber__c , a.POSCustomerStatus__c , Locale , a.Mobile_1__c , a.Mobile_1__c , b.SegmentTypeID__c, b.SegmentTypeDescription__c , b.ToDate__c , g.TotalPoints__c , g.ValidUntil__c, g.date_1
,h.TotalPoints__c , h.ValidUntil__c, h.date_2
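Two things in the query above look suspicious: each sub-query groups by PointsChangeDate__c itself, so MAX(PointsChangeDate__c) is computed per row rather than per contact, and the second sub-query is joined to g.Contact__c instead of a._ContactKey, so it only matches when g found a row. A minimal sketch of one common alternative, ROW_NUMBER() to keep only the latest row per contact, is shown below. It is trimmed to the points columns and is untested against the real data; the alias rn is an assumption introduced for illustration.

```sql
SELECT
    a._ContactKey         AS ContactId,
    g.TotalPoints__c      AS TotalPoints_SourceMethod_1,
    g.ValidUntil__c       AS ValidUntil_SourceMethod_1,
    g.PointsChangeDate__c AS PointsChangeDate__c_1,
    h.TotalPoints__c      AS TotalPoints_SourceMethod_2,
    h.ValidUntil__c       AS ValidUntil_SourceMethod_2,
    h.PointsChangeDate__c AS PointsChangeDate__c_2
FROM
    Contact_Salesforce AS a
LEFT JOIN
    (SELECT Contact__c, TotalPoints__c, ValidUntil__c, PointsChangeDate__c,
            -- rn = 1 marks the most recent row for each contact
            ROW_NUMBER() OVER (PARTITION BY Contact__c
                               ORDER BY PointsChangeDate__c DESC) AS rn
     FROM Member_points__c_Salesforce
     WHERE PointsTypeCode__c = '1') AS g
        ON g.Contact__c = a._ContactKey AND g.rn = 1
LEFT JOIN
    (SELECT Contact__c, TotalPoints__c, ValidUntil__c, PointsChangeDate__c,
            ROW_NUMBER() OVER (PARTITION BY Contact__c
                               ORDER BY PointsChangeDate__c DESC) AS rn
     FROM Member_points__c_Salesforce
     WHERE PointsTypeCode__c = '2') AS h
        -- joined to a, not to g, so it does not depend on g having a match
        ON h.Contact__c = a._ContactKey AND h.rn = 1
WHERE
    a.Mobile_1__c IS NOT NULL
```

Because each sub-query now returns at most one row per contact, the outer GROUP BY should no longer be needed to collapse duplicates coming from the points table.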

Resulting nulls on full join not being replaced

I have a set of select queries using full join (required) and would like to replace the resulting nulls with something else (in the following example, it should be "empty").
For the first column (and all others, honestly) I have tried using isnull(), coalesce(), case when and even try_convert, but the result is always null. I'm OK with the null itself, since in this particular case it means that the results from the first query don't exist in the second query, which is my goal.
There are further, identical queries that follow, also full joined, so a line in the first query may not be in the second query but may be in the third or fourth query.
Here is the select statement
SELECT *
FROM (SELECT Isnull(1, 'empty') AS SubGroup
, table2.lineintid AS OrderByThis2nd
, table2.HeaderStamp AS HeaderLink
, table2.linestamp AS LineID
, table2.lprocessname AS LineProcName
, table2.lprocessno AS LineProcNumber
, table2.productid AS ProdId
, table2.prodamount AS QTT
, table2.prodval AS UnitPrice
FROM table2 (nolock)
INNER JOIN table1 ON table2.headerstamp = table1.headerstamp
WHERE table1.lprocessname = 'Phase 1')Proc1L
FULL JOIN (SELECT Isnull(2, 'empty') AS SubGroup
, table2.lineintid AS OrderByThis2nd
, table2.linestamp AS LineID
, table2.prevlstamp AS PrecedingLine
, table2.lprocessname AS LineProcName
, table2.lprocessno AS LineProcNumber
, table2.productid AS ProdId
, table2.prodamount AS QTT
, table2.prodval AS UnitPrice
FROM table2 (nolock)
INNER JOIN table1 ON table2.headerstamp = table1.headerstamp
WHERE table1.lprocessname = 'Phase 2'
AND Year(table2.linedate) = '2018')Proc2L ON Proc1L.LineID = Proc2L.PrecedingLine
ORDER BY 1 DESC
, 2
This database is in MS SQL 2014.
Any ideas are appreciated. Thank you very much!
Try using ISNULL function in the outer query. Instead of
select * from
use
Select isnull(col1, 'x'), etc
from
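A small self-contained sketch of that idea follows. The table variables @Proc1L and @Proc2L are hypothetical stand-ins for the two derived tables in the question, with made-up rows. The point is that the replacement has to happen in the outer SELECT, after the full join has produced the nulls. Note also that SubGroup is an integer, so it is cast to varchar first; without the cast, SQL Server would try to convert 'empty' to int and raise a conversion error.

```sql
-- Hypothetical stand-ins for the derived tables Proc1L and Proc2L.
DECLARE @Proc1L TABLE (SubGroup int, LineID varchar(25));
DECLARE @Proc2L TABLE (SubGroup int, PrecedingLine varchar(25));

INSERT INTO @Proc1L VALUES (1, 'A'), (1, 'B');
INSERT INTO @Proc2L VALUES (2, 'B'), (2, 'C');

-- Replace nulls in the OUTER query, where the full join creates them.
SELECT COALESCE(CAST(p1.SubGroup AS varchar(10)), 'empty') AS SubGroup1
     , COALESCE(p1.LineID, 'empty')                        AS LineID
     , COALESCE(CAST(p2.SubGroup AS varchar(10)), 'empty') AS SubGroup2
     , COALESCE(p2.PrecedingLine, 'empty')                 AS PrecedingLine
FROM @Proc1L AS p1
FULL JOIN @Proc2L AS p2
    ON p1.LineID = p2.PrecedingLine;
```

Wrapping a literal inside the derived table, as in Isnull(1, 'empty'), cannot help, because the literal 1 is never null there; the null only appears once the full join finds no matching row.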

SQL CASE mixes up my grouping

The problem is on the 4th line of the SELECT statement: CASE WHEN ct.TransactionReason=622 THEN ABS(ct.netquantity) ELSE c.RealNetWeight END AS NetWeight
When I add this line to the statement, my grouping changes. Instead of returning one line, it now returns one line per distinct value of c.realnetweight.
The problem is that I only want to return one line: something like a coalesce, so that when there is a ct.transactionreason = 622 it gives me ABS(ct.netquantity), and otherwise c.realnetweight. The code is below; suggestions would be very helpful. Thanks.
SELECT CASE WHEN P.Wrapped = 1 THEN T.[Level]+1 ELSE T.[Level] END AS [Level]
, @CoilId AS CoilId
, c.SupplierCoilID
, CASE WHEN ct.TransactionReason=622 THEN ABS(ct.netquantity) ELSE c.RealNetWeight END AS NetWeight
, C.RealGrossWeight
, p1.Description
, p1.product
, s.StackID
, s.ProductID
, s.Weight
, P.Product
, P.Description AS 'ProductDescription'
, COUNT(t.BlankId) AS 'NumberOfBlanks'
, c1.Description as 'Status'
, pv.ProductionWeight
, pv.BlankWeight
, t.BlankStatus
FROM #Trace T
INNER JOIN SKUTraceability SKUT ON SKUT.SKUID = T.SKUID
INNER JOIN Stack s ON SKUT.StackID = s.StackID
INNER JOIN Product p ON s.ProductID = p.ProductID
INNER JOIN Coil c ON c.CoilID=@CoilId
INNER JOIN CoilTransaction ct on ct.CoilID=@CoilId
INNER JOIN Product p1 ON c.ProductID=p1.ProductID
INNER JOIN Code c1 ON t.BlankStatus=c1.codenumber AND c1.codetypeid=17
INNER JOIN #ProductVersion pv ON pv.ProductID=p.ProductId AND s.ProductVersion = pv.ProductVersion
WHERE t.BlankId IS NOT NULL
GROUP BY T.[Level]
, c.SupplierCoilID
, CASE WHEN ct.TransactionReason=622 THEN ABS(ct.netquantity) ELSE c.RealNetWeight END
, c.RealGrossWeight
, p1.Description
, p1.product
, s.StackID
, s.ProductID
, s.Weight
, p.Product
, p.Description
, c1.Description
, pv.ProductionWeight
, pv.BlankWeight
, p.Wrapped
, t.BlankStatus
It is hard to answer without understanding your table structures; however, it appears that CoilTransaction is some sort of transaction table, i.e. a single product can have many transactions.
In your SELECT query, the line causing the issue is the only line that references your CoilTransaction table, so I believe you're getting multiple rows because you're grouping on a value that is not unique. Also, are transactions individual items? I ask because you seem to have a quantity column on the table.
In short, you can't get the grouping you want while including those columns from your transaction table. You would need to elaborate on what you're trying to accomplish for us to give a more suitable solution. What does that line mean?
For at least one CoilID in table Coil, you will have more than one value in the field netquantity in the table CoilTransaction. This is what is causing the increase in the number of records returned when you include this field in your CASE statement.
I would recommend finding the netquantity value that you want from CoilTransaction in a CTE, and then bringing this in to your CASE statement. For example:
;WITH transaction_summary AS (
SELECT
ct.CoilID,
ct.TransactionReason,
MAX(ct.netquantity) AS netquantity -- choose your aggregate function here; a CTE column needs a name
FROM
CoilTransaction ct
GROUP BY
ct.CoilID,
ct.TransactionReason
)
...
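To make the idea concrete, here is a self-contained sketch of one way the summary could feed the CASE logic. It uses conditional aggregation so that each coil yields at most one reason-622 quantity, then falls back to RealNetWeight. The table variables and sample rows are hypothetical, the column name Reason622Quantity is made up for illustration, and MAX is just one possible choice of aggregate.

```sql
-- Hypothetical stand-ins for Coil and CoilTransaction with made-up rows.
DECLARE @Coil TABLE (CoilID int, RealNetWeight decimal(10,2));
DECLARE @CoilTransaction TABLE (CoilID int, TransactionReason int, netquantity decimal(10,2));

INSERT INTO @Coil VALUES (1, 1000.00), (2, 800.00);
INSERT INTO @CoilTransaction VALUES (1, 622, -950.00), (1, 100, 1000.00), (2, 100, 800.00);

WITH transaction_summary AS (
    SELECT ct.CoilID
         -- Only reason-622 rows contribute; other rows yield NULL and are ignored by MAX.
         , MAX(CASE WHEN ct.TransactionReason = 622 THEN ABS(ct.netquantity) END) AS Reason622Quantity
    FROM @CoilTransaction ct
    GROUP BY ct.CoilID
)
SELECT c.CoilID
     -- One row per coil: the 622 quantity when it exists, otherwise the coil's own weight.
     , COALESCE(ts.Reason622Quantity, c.RealNetWeight) AS NetWeight
FROM @Coil c
LEFT JOIN transaction_summary ts
    ON ts.CoilID = c.CoilID;
-- With the sample rows: coil 1 -> 950.00, coil 2 -> 800.00
```

Because the summary is grouped by CoilID only, joining it in place of CoilTransaction should not multiply the rows of the main query.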

how to join sql tables in my desired way

I have two SQL tables named mtblAttendance and mtblLeave_Data.
I need to get all the dates from mtblLeave_Data on which the user was on leave, based on the days marked absent in mtblAttendance.
In mtblAttendance there is a row for every leave day, but when a user is on leave for a period there is no row per day in mtblLeave_Data; there are just two columns, Leave_From and Leave_To (or it may be a single-day entry where Leave_From = Leave_To).
To get the absent dates of the user I wrote this query:
USE [ILeave]
GO
ALTER procedure [dbo].[Attendance_Report]
@Date1 datetime,
@Date2 datetime,
@User_Id nvarchar(50)
as begin
SELECT distinct
a.Sno,
a.[Login_Date],
a.[Week_Day],
a.[In_Time],
a.[Out_Time],
a.Attendance_Status,
a.Half_Full,
a.Leave_Type,
(convert(varchar(max),floor (abs(cast(datediff(mi, a.Out_Time, a.In_Time) AS int) / 60)))+ '.'+ convert(varchar(max),(abs(cast(datediff(mi, a.Out_Time, a.In_Time) AS int) % 60)))) as Hrs
, l.[Sno]
, l.[Leave_ID]
, l.[User_Id]
, l.[Dept_To]
, l.[Leave_Type]
, l.[Total_Leave_HR]
, l.[Leave_From]
, l.[Leave_To]
, l.[Leave_Half_Full]
, l.[Comments]
, l.[Leave_Status]
FROM
[mtblAttendance] a
LEFT JOIN [mtbl_Leave_Data] l
ON a.[Login_Date] BETWEEN l.[Leave_From] AND l.[Leave_To]
AND l.[User_Id] = a.[User_Id]
where a.Login_Date between @Date1 and @Date2
and a.User_Id = @User_Id
order by Login_Date
end
The following query should return the leave record assigned
SELECT
a.[Login_Date]
, l.[Sno]
, l.[Leave_ID]
, l.[User_Id]
, l.[Dept_To]
, l.[Leave_Type]
, l.[Total_Leave_HR]
, l.[Leave_From]
, l.[Leave_To]
, l.[Leave_Half_Full]
, l.[Comments]
, l.[Leave_Status]
FROM
[mtblAttendance] a
LEFT JOIN [mtbl_Leave_Data] l ON a.[Login_Date] BETWEEN l.[Leave_From] AND l.[Leave_To]
WHERE
a.User_Id = 'sasi'
AND a.Attendance_Status='A'
I put it into a fiddle, but with no data so all I can say is that the query parses.
As someone has previously stated, it is common to keep a calendar table (one row per date), so that queries needing every date in, say, a 2-year period can be answered quickly.
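As a rough illustration of the calendar-table idea, the sketch below expands a single Leave_From/Leave_To range into one row per day. The table variables @Calendar and @Leave and their rows are hypothetical; a real calendar table would be a permanent table covering the whole period of interest.

```sql
-- Hypothetical calendar table: one row per date.
DECLARE @Calendar TABLE (CalendarDate date PRIMARY KEY);
INSERT INTO @Calendar VALUES
    ('20240101'), ('20240102'), ('20240103'), ('20240104'), ('20240105');

-- Hypothetical leave data: one row per leave period.
DECLARE @Leave TABLE ([User_Id] nvarchar(50), Leave_From date, Leave_To date);
INSERT INTO @Leave VALUES (N'sasi', '20240102', '20240104');

-- Each leave period matches every calendar date it covers,
-- so the range becomes one row per day of leave.
SELECT l.[User_Id]
     , c.CalendarDate AS LeaveDate
FROM @Leave AS l
INNER JOIN @Calendar AS c
    ON c.CalendarDate BETWEEN l.Leave_From AND l.Leave_To
ORDER BY c.CalendarDate;
-- With the sample rows this returns 2024-01-02, 2024-01-03 and 2024-01-04.
```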
Updated SQL:
SELECT DISTINCT
a.[Login_Date]
, l.[Sno]
, l.[Leave_ID]
, l.[User_Id]
, l.[Dept_To]
, l.[Leave_Type]
, l.[Total_Leave_HR]
, l.[Leave_From]
, l.[Leave_To]
, l.[Leave_Half_Full]
, l.[Comments]
, l.[Leave_Status]
FROM
[mtblAttendance] a
LEFT JOIN [mtbl_Leave_Data] l
ON a.[Login_Date] BETWEEN l.[Leave_From] AND l.[Leave_To]
AND l.[User_Id] = a.[User_Id] -- Ensure only attendance/leave for the same user being linked
WHERE
a.User_Id = 'sasi'
AND a.Attendance_Status='A'
Join expressions aren't limited to using the equals sign. Use "between" in the join expression. Something along these (untested) lines should work.
select distinct A.Login_Date
from mtblAttendance A
inner join mtbl_Leave_Data L
on A.User_id = L.User_id
and A.Login_date between L.Leave_From and L.Leave_To
where A.User_Id = 'sasi' AND A.Attendance_Status='A'
Depending on what you're trying to do, you might need to change the inner join to a left outer join. A left outer join will preserve all login dates from mtblAttendance, regardless of whether they satisfy the join condition. (Those rows will be filtered by the WHERE clause, of course.)
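The difference between the two join types can be seen in a small self-contained sketch. The table variables and rows below are hypothetical: one absence falls inside a leave period and one does not.

```sql
-- Hypothetical stand-ins for mtblAttendance and mtbl_Leave_Data.
DECLARE @Attendance TABLE ([User_Id] nvarchar(50), Login_Date date, Attendance_Status char(1));
DECLARE @Leave TABLE ([User_Id] nvarchar(50), Leave_From date, Leave_To date);

INSERT INTO @Attendance VALUES
    (N'sasi', '20240102', 'A'),   -- absent, covered by a leave period
    (N'sasi', '20240110', 'A'),   -- absent, no leave recorded
    (N'sasi', '20240111', 'P');   -- present
INSERT INTO @Leave VALUES (N'sasi', '20240101', '20240103');

-- LEFT OUTER JOIN keeps every absent date; leave columns are NULL
-- where no leave period covers that date.
SELECT A.Login_Date, L.Leave_From, L.Leave_To
FROM @Attendance AS A
LEFT OUTER JOIN @Leave AS L
    ON  A.[User_Id] = L.[User_Id]
    AND A.Login_Date BETWEEN L.Leave_From AND L.Leave_To
WHERE A.[User_Id] = N'sasi'
  AND A.Attendance_Status = 'A';
```

With these rows the left join returns both absent dates, the second with NULL leave columns; switching to an inner join would return only the date covered by the leave period.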

Resources