I have a query in SQL Server with 6 JOINs and 1 LEFT JOIN to tables and views. It returns 16k records in about 1 second if the select clause is "SELECT *".
As soon as I specify even one column to display (SELECT ItemID, for example) the query slows down to about 70 seconds.
Query #1 (2s) - SELECT *:
SELECT *
FROM (SELECT LinkedToSet, LinkedToCopy, ',' + STRING_AGG(LocationID,',') + ',' Locs, count(1) OVER (PARTITION BY LinkedToSet) Copies
FROM Inventory.Locations WHERE LinkedToSet is not null AND (State & 4096)>0 GROUP BY LinkedToSet, LinkedToCopy) l
JOIN Bricklink_Set_Query bsq on l.LinkedToSet=bsq.Number
JOIN Bricklink.Set_Parts_Query bsp on l.LinkedToSet=bsp.SetNum AND bsp.Extra=0
JOIN Bricklink.Item_List i on bsp.ItemType=i.ItemType AND bsp.ItemID=i.Number
JOIN Bricklink.Category_List cat on i.Category_ID=cat.CatID
JOIN Bricklink.Color_List col on bsp.ColorID=col.ColorID
LEFT JOIN (SELECT LocationID, ItemType, ItemNum, ColorID, sum(QtyFound) as InvPcs
FROM Inventory.Item_History
GROUP BY LocationID, ItemType, ItemNum, ColorID) as h ON l.Locs like concat('%,',h.locationID,',%') AND h.ItemType=bsp.ItemType AND h.ItemNum=bsp.ItemID AND h.ColorID=bsp.ColorID
Actual Execution Plan: https://www.brentozar.com/pastetheplan/?id=SJD7Qemf_
Query #2 (81s) - SELECT a single column
SELECT bsp.ItemID
FROM (SELECT LinkedToSet, LinkedToCopy, ',' + STRING_AGG(LocationID,',') + ',' Locs, count(1) OVER (PARTITION BY LinkedToSet) Copies
FROM Inventory.Locations WHERE LinkedToSet is not null AND (State & 4096)>0 GROUP BY LinkedToSet, LinkedToCopy) l
JOIN Bricklink_Set_Query bsq on l.LinkedToSet=bsq.Number
JOIN Bricklink.Set_Parts_Query bsp on l.LinkedToSet=bsp.SetNum AND bsp.Extra=0
JOIN Bricklink.Item_List i on bsp.ItemType=i.ItemType AND bsp.ItemID=i.Number
JOIN Bricklink.Category_List cat on i.Category_ID=cat.CatID
JOIN Bricklink.Color_List col on bsp.ColorID=col.ColorID
LEFT JOIN (SELECT LocationID, ItemType, ItemNum, ColorID, sum(QtyFound) as InvPcs
FROM Inventory.Item_History
GROUP BY LocationID, ItemType, ItemNum, ColorID) as h ON l.Locs like concat('%,',h.locationID,',%') AND h.ItemType=bsp.ItemType AND h.ItemNum=bsp.ItemID AND h.ColorID=bsp.ColorID
Actual execution plan: https://www.brentozar.com/pastetheplan/?id=BJTr4x7Gu
The execution plans look totally different from each other and I'm not sure why. I've also tried wrapping the SELECT * and querying that, but some of these tables/views have the exact same field names, especially on the joins, so SQL Server throws an error:
This column 'foo' was specified multiple times.
How do I achieve the performance of SELECT * but limit which columns I display?
P.S. 2 Notes - 1) My desired select statement is obviously more complex than this and 2) Even using the full select statement, if I add a WHERE clause and restrict the query there, it runs in <1 second. If that plan would be useful I can post it as well.
Related
I have a query that yields inconsistent results. I receive a different number of rows almost every time I run the query. I receive no errors. I have tried running the query from all of the servers in the query. I also tried running the query from a fourth unrelated server that is linked to both servers in the query. I tried running each of the CTEs on its own and always get consistent results. Only the last part of the query yields inconsistent results.
Does anyone know what could be causing these inconsistent results? Thanks!
with customerOrderMatches as (
select
SAPOX.docentry
,SAPO.cardcode as 'O CC'
,SAPCMS.CardCode+'-'+SAPCMS.Address as 'OrigShipToID'
,SAPCMB.CardCode+'-'+SAPCMB.Address as 'OrigCustID'
from [Server1].[BowDB].[dbo].ORDR SAPO
join [Server1].[BowDB].[dbo].RDR12 SAPOX
on SAPO.docentry=SAPOX.docentry
left outer join [Server1].[BowDB].[dbo].CRD1 SAPCMS --ship to match
on SAPCMS.cardcode=SAPO.cardcode
and SAPCMS.Address=SAPO.Shiptocode
and SAPCMS.AdresType='S'
and SAPCMS.Street=SAPOX.StreetS
left outer join [Server1].[BowDB].[dbo].CRD1 SAPCMB --bill to match
on SAPCMB.cardcode=SAPO.cardcode
and SAPCMB.Address=SAPO.PayToCode
and SAPCMB.AdresType='B'
and SAPCMB.Street=SAPOX.StreetB
where
SAPO.cardcode NOT IN (
'1001', '1002', '1003'
)
AND SAPO.canceled = 'N'
),
customerRank as (
SELECT
rtrim(C.custid) AS 'custid'
,COUNT(SLShipper.ShipperID) as 'totalShippers'
,Row_number() OVER (ORDER BY COUNT(SLShipper.ShipperID) DESC) AS 'customerRank'
FROM [MLSQL12].[SLapplication15].dbo.customer C
left outer join [MLSQL12].[SLapplication15].dbo.SOShipHeader SLShipper
on SLShipper.custid=C.custid
GROUP BY C.CustID
),
customerShipToRank as (
SELECT
rtrim(SOA.CustID) AS 'custid'
,rtrim(SOA.ShiptoID) as 'shiptoid'
,COUNT(SLShipper.ShipperID) as 'totalShippers'
,cast(Row_number() OVER(Partition by SOA.custid ORDER BY COUNT(SLShipper.ShipperID) DESC) as int) AS 'ShipToRank'
,customerRank
FROM [MLSQL12].[SLapplication15].dbo.soaddress SOA
left outer join [MLSQL12].[SLapplication15].dbo.SOShipHeader SLShipper
on SLShipper.CustID=SOA.CustId
and SLShipper.ShiptoID=SOA.shiptoid
join customerRank CR
on CR.custid=SOA.CustID
GROUP BY
SOA.CustID
,SOA.ShiptoID
,customerRank
),
combinedData as (
select
COM.Docentry
,CXR.*
,CSTR.*
from customerOrderMatches COM
join MLSQL15.HistoricalData.Hist.CustomerXRef CXR
on CXR.OrigShipToID=COM.OrigShipToID collate SQL_Latin1_General_CP850_CI_AS
and CXR.OrigCustID=COM.OrigCustID collate SQL_Latin1_General_CP850_CI_AS
left outer join customerShipToRank CSTR
on CSTR.shiptoid =CXR.BKShiptoId
and CSTR.custid =CXR.BKCustId
)
select
*
from combinedData CD
where CONCAT(customerRank,ShipToRank) in (
select MIN(CONCAT(customerRank,ShipToRank))
from combinedData
group by docentry)
order by docentry
Other random facts about the situation:
- I realize that there are probably inefficiencies in my query that could be optimized. However, this should not result in inconsistent results.
- One database is an SAP DB.
- One DB is a Microsoft Dynamics SL DB.
- One DB is our own DB we created for acquisition data.
Update (12/12/2022)
One of the columns returned is a basic primary key of an order in an order table. I ran the query six times and got the following results:
Query Run Number | Total Rows | Lowest DocEntry | Highest DocEntry
1                | 14509      | 9               | 31412
2                | 14509      | 9               | 31412
3                | 5455       | 105             | 31408
4                | 5448       | 108             | 31411
5                | 14509      | 9               | 31412
6                | 5181       | 105             | 31411
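Two things in the final SELECT may be worth ruling out. Both are assumptions about the cause, not something the post confirms. First, the ROW_NUMBER() calls in customerRank and customerShipToRank order only by COUNT(...), so rows with equal counts can be numbered differently each time the CTE is evaluated, and combinedData is referenced twice in the last statement. Adding a unique tiebreaker (for example the customer ID) to each ORDER BY would make the numbering repeatable. Second, MIN(CONCAT(customerRank, ShipToRank)) compares strings rather than numbers. The sketch below uses made-up rank values in a hypothetical table variable @r to show the string behaviour and one possible alternative.

```sql
-- Hypothetical rank pairs; column names mirror the post, values are invented.
DECLARE @r TABLE (docentry int, customerRank int, ShipToRank int);
INSERT INTO @r VALUES (1, 2, 1), (1, 10, 1), (2, 1, 12), (2, 11, 2);

-- String comparison: for docentry 1, '101' sorts before '21',
-- so customerRank 10 wins over customerRank 2.
-- For docentry 2, (1, 12) and (11, 2) both concatenate to '112'.
SELECT docentry, MIN(CONCAT(customerRank, ShipToRank)) AS min_as_string
FROM @r
GROUP BY docentry;

-- Alternative: rank the pairs numerically and keep the first row per docentry.
WITH ranked AS (
    SELECT docentry, customerRank, ShipToRank,
           ROW_NUMBER() OVER (PARTITION BY docentry
                              ORDER BY customerRank, ShipToRank) AS rn
    FROM @r
)
SELECT docentry, customerRank, ShipToRank
FROM ranked
WHERE rn = 1;
```

The second query orders by both rank columns, so the pair it keeps does not depend on how the digits happen to concatenate.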
SELECT pp.pat_key, MAX(pp.PROV_NPI) [Provider_ID], CONCAT(pp.LAST_NM,' ',pp.FIRST_NM) [Provider_Name]
INTO pat_primary_provider
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP=1
AND pat_key IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY pp.PAT_KEY, pp.last_nm, pp.FIRST_NM;
SELECT ppp.*
INTO ppp1
FROM (SELECT PAT_KEY, MAX(provider_ID) AS maxprov FROM pat_primary_provider GROUP BY PAT_KEY) AS x
INNER JOIN pat_primary_provider AS ppp ON ppp.PAT_KEY = x.PAT_KEY AND ppp.Provider_ID = x.maxprov;
I need to get the results of ppp1 only using one query (no INTO statements) in SQL Server. Please help.
Simply put the first query into a CTE (without the INTO clause). Then select from that.
;WITH pat_primary_provider AS
(
-- The first query goes here
)
-- The second query goes here
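Filled in with the two queries from the question, the skeleton above might look like the following. This is a sketch assembled from the posted statements with the INTO clauses removed; it has not been run against the real tables.

```sql
;WITH pat_primary_provider AS
(
    -- First query from the question, without INTO
    SELECT pp.PAT_KEY,
           MAX(pp.PROV_NPI) AS Provider_ID,
           CONCAT(pp.LAST_NM, ' ', pp.FIRST_NM) AS Provider_Name
    FROM TRDW.dbo.PATIENT_PROVIDER pp
    WHERE pp.IS_PCP = 1
      AND pp.PAT_KEY IN (SELECT Consumer_ID FROM CareWire0521)
    GROUP BY pp.PAT_KEY, pp.LAST_NM, pp.FIRST_NM
)
-- Second query from the question, without INTO
SELECT ppp.*
FROM (SELECT PAT_KEY, MAX(Provider_ID) AS maxprov
      FROM pat_primary_provider
      GROUP BY PAT_KEY) AS x
INNER JOIN pat_primary_provider AS ppp
        ON ppp.PAT_KEY = x.PAT_KEY
       AND ppp.Provider_ID = x.maxprov;
```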
But something like the query below might also return, for each PAT_KEY, the row with the maximum PROV_NPI. The PARTITION BY PAT_KEY matters: without it, ROW_NUMBER() numbers the whole result set and TOP 1 WITH TIES returns a single row.
SELECT TOP 1 WITH TIES
PAT_KEY,
MAX(PROV_NPI) AS [Max_Provider_ID],
CONCAT(LAST_NM,' ',FIRST_NM) AS [Patient_Provider_Full_Name]
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP = 1
AND PAT_KEY IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY PAT_KEY, LAST_NM, FIRST_NM
ORDER BY ROW_NUMBER() OVER (PARTITION BY PAT_KEY ORDER BY MAX(PROV_NPI) DESC);
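The TOP 1 WITH TIES idiom returns one row per group only when ROW_NUMBER() is partitioned by the grouping key. A minimal sketch with invented data follows; the table variable @pp and its values are hypothetical.

```sql
-- Hypothetical sample rows: (pat_key, prov_npi)
DECLARE @pp TABLE (pat_key int, prov_npi int);
INSERT INTO @pp VALUES (1, 100), (1, 300), (2, 200), (2, 150);

-- Each pat_key's highest prov_npi gets row number 1, so those rows all tie
-- for first place and WITH TIES keeps one row per pat_key.
SELECT TOP 1 WITH TIES pat_key, prov_npi
FROM @pp
ORDER BY ROW_NUMBER() OVER (PARTITION BY pat_key ORDER BY prov_npi DESC);
```

Dropping the PARTITION BY clause would give every row a distinct number, leaving nothing to tie with the first row.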
What's wrong with just inserting the first query as subqueries into the second? Each derived table needs its own alias in SQL Server:
SELECT ppp.*
FROM (SELECT PAT_KEY, MAX(provider_ID) AS maxprov FROM (SELECT pp.pat_key, MAX(pp.PROV_NPI) [Provider_ID], CONCAT(pp.LAST_NM,' ',pp.FIRST_NM) [Provider_Name]
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP=1
AND pat_key IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY pp.PAT_KEY, pp.last_nm, pp.FIRST_NM) AS p GROUP BY PAT_KEY) AS x
INNER JOIN (SELECT pp.pat_key, MAX(pp.PROV_NPI) [Provider_ID], CONCAT(pp.LAST_NM,' ',pp.FIRST_NM) [Provider_Name]
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP=1
AND pat_key IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY pp.PAT_KEY, pp.last_nm, pp.FIRST_NM) AS ppp ON ppp.PAT_KEY = x.PAT_KEY AND ppp.Provider_ID = x.maxprov;
I have two queries that I would like to combine. One query left joins columns from the same table; the other left joins columns from a second table. Both queries use the same base table, and I'm unsure how to set up the combined query properly.
1st Query:
SELECT BIZ_GROUP,
ORDER_ID,
STATION,
A.TC_DATE,
WANT_DATE,
TIME_SLOT,
JOB_CODE,
[ADDRESS],
CITY,
A.TECH_ID,
A.PREMISE,
ISNULL(B.LAST_ARRIVED, A.LAST_ARRIVE) AS ARRIVED,
ORDER_CLOSED,
COMP_STATUS,
WORK_STATUS,
REMARKS,
CORRECTION
FROM MET_timecommit A
LEFT JOIN(SELECT premise,
TC_DATE,
TECH_ID,
MIN(last_arrive) AS LAST_ARRIVED
FROM MET_timecommit
WHERE PREMISE IS NOT NULL
GROUP BY premise,
TC_DATE,
TECH_ID) B ON B.TC_DATE = A.TC_DATE
AND B.PREMISE = A.PREMISE
2nd query:
SELECT *
FROM MET_timecommit
LEFT JOIN (SELECT ORDER_ID,
created,
host_creation,
went_to
FROM workload
WHERE went_to >= getdate()-365) C ON C.went_to=MET_timecommit.TC_DATE
AND C.order_id=MET_timecommit.order_id
Evidently I am not used to this forum. You all don't have to be so rude. TDP was able to help me out based on what I provided. All other comments were unnecessary.
This should bring back the rows for both tables B and C for each row of table A:
SELECT A.BIZ_GROUP,
A.ORDER_ID,
A.STATION,
A.TC_DATE,
A.WANT_DATE,
A.TIME_SLOT,
A.JOB_CODE,
A.[ADDRESS],
A.CITY,
A.TECH_ID,
A.PREMISE,
ISNULL(B.LAST_ARRIVED, A.LAST_ARRIVE) AS ARRIVED,
A.ORDER_CLOSED,
A.COMP_STATUS,
A.WORK_STATUS,
A.REMARKS,
A.CORRECTION,
C.*
FROM MET_timecommit A
LEFT JOIN(SELECT premise,
TC_DATE,
TECH_ID,
MIN(last_arrive) AS LAST_ARRIVED
FROM MET_timecommit
WHERE PREMISE IS NOT NULL
GROUP BY premise,
TC_DATE,
TECH_ID) B ON B.TC_DATE = A.TC_DATE
AND B.PREMISE = A.PREMISE
LEFT JOIN (SELECT ORDER_ID,
created,
host_creation,
went_to
FROM workload
WHERE went_to >= getdate()-365) C ON C.went_to=A.TC_DATE
AND C.order_id=A.order_id
I am relatively new at SQL so I apologise if this is obvious but I cannot work out how to use the results of the WITH clause query in the where statement of my main query.
My with query pulls the first record for each customer and gives the sale date for that record:
WITH summary AS(
SELECT ed2.customer,ed2.saledate,
ROW_NUMBER()OVER(PARTITION BY ed2.customer
ORDER BY ed2.saledate)AS rk
FROM Filteredxportdocument ed2)
SELECT s.*
FROM summary s
WHERE s.rk=1
I need to use the date in the above query as the starting point and pull all records for each customer for their first 12 months i.e. where the sale date is between ed2.saledate AND ed2.saledate+12 months.
My main query is:
SELECT ed.totalamountincvat, ed.saledate, ed.name AS SaleRef,
ed.customer, ed.customername, comp.numberofemployees,
comp.companyuid
FROM exportdocument AS ed INNER JOIN
FilteredAccount AS comp ON ed.customer = comp.accountid
WHERE (ed.statecode = 0) AND
ed.saledate BETWEEN ed2.saledate AND DATEADD(M,12,ed2.saledate)
I am sure that I need to add the main query into the WITH clause but I can't work out where. Is anyone able to help, please?
Does this help?
;WITH summary AS(
SELECT ed2.customer,ed2.saledate,
ROW_NUMBER()OVER(PARTITION BY ed2.customer
ORDER BY ed2.saledate)AS rk
FROM Filteredxportdocument ed2)
SELECT ed.totalamountincvat, ed.saledate, ed.name AS SaleRef,
ed.customer, ed.customername, comp.numberofemployees,
comp.companyuid
FROM exportdocument AS ed INNER JOIN
FilteredAccount AS comp ON ed.customer = comp.accountid
OUTER APPLY (SELECT s.* FROM summary s WHERE s.rk=1) ed2
WHERE ed.statecode = 0 AND
ed.saledate BETWEEN ed2.saledate AND DATEADD(M,12,ed2.saledate)
and ed.Customer = ed2.Customer
Results of CTE are not cached or stored, so you can't reuse it.
EDIT:
Based upon your requirement that all the records from CTE should be in final result, this is a new query:
;WITH summary AS(
SELECT ed2.customer,ed2.saledate,
ROW_NUMBER()OVER(PARTITION BY ed2.customer
ORDER BY ed2.saledate)AS rk
FROM Filteredxportdocument ed2)
SELECT
ed.totalamountincvat,
ed.saledate,
ed.name AS SaleRef,
ed.customer,
ed.customername,
comp.numberofemployees,
comp.companyuid
FROM
summary ed2
left join exportdocument ed
on ed.Customer = ed2.Customer
and ed.statecode = 0
AND ed.saledate BETWEEN ed2.saledate AND DATEADD(M,12,ed2.saledate)
LEFT JOIN FilteredAccount comp
ON ed.customer = comp.accountid
WHERE
ed2.rk=1
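A CTE such as summary is visible only to the single statement that follows it, and because its results are not stored it is re-evaluated wherever it is referenced. An alternative is to store the rk = 1 rows in a temp table and use that as many times as you want.
Something like: SELECT * INTO #temp FROM summary s WHERE s.rk = 1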
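The temp table approach could be sketched as follows. The table variable @exportdocument and its rows are invented stand-ins for the real exportdocument table, and #first_sale is a hypothetical name.

```sql
-- Hypothetical sample data standing in for exportdocument
DECLARE @exportdocument TABLE (customer int, saledate date, totalamountincvat money);
INSERT INTO @exportdocument VALUES
    (1, '2020-01-15', 100), (1, '2020-06-01', 50), (1, '2021-03-01', 75),
    (2, '2019-05-10', 20),  (2, '2020-05-10', 30);

-- Store each customer's first sale once.
WITH summary AS (
    SELECT customer, saledate,
           ROW_NUMBER() OVER (PARTITION BY customer ORDER BY saledate) AS rk
    FROM @exportdocument
)
SELECT customer, saledate AS first_saledate
INTO #first_sale
FROM summary
WHERE rk = 1;

-- Reuse the stored rows: all sales within 12 months of the first sale.
-- Customer 1's 2021-03-01 sale falls outside the window and is filtered out.
SELECT ed.customer, ed.saledate, ed.totalamountincvat
FROM @exportdocument AS ed
JOIN #first_sale AS fs ON fs.customer = ed.customer
WHERE ed.saledate BETWEEN fs.first_saledate AND DATEADD(MONTH, 12, fs.first_saledate);

DROP TABLE #first_sale;
```

Unlike the CTE, #first_sale remains available to later statements in the same session until it is dropped.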
I'm pulling my hair out over a subquery that I'm using to avoid about 100 duplicates (out of about 40k records). The records that are duplicated are showing up because they have 2 dates in h2.datecreated for a valid reason, so I can't just scrub the data.
I'm trying to get only the earliest date to return. The first subquery (that starts with "select distinct address_id", with the MIN) works fine on its own...no duplicates are returned. So it would seem that the left join (or just plain join...I've tried that too) couldn't possibly see the second h2.datecreated, since it doesn't even show up in the subquery. But when I run the whole query, it's returning 2 values for some ipc.mfgid values, one with the h2.datecreated that I want, and the other one that I don't want.
I know it's got to be something really simple, or something that just isn't possible. It really seems like it should work! This is MSSQL. Thanks!
select distinct ipc.mfgid as IPC, h2.datecreated,
case when ad.Address is null
then ad.buildingname end as Address, cast(trace.name as varchar)
+ '-' + cast(trace.Number as varchar) as ONT,
c.ACCOUNT_Id,
case when h.datecreated is not null then h.datecreated
else h2.datecreated end as Install
from equipmentjoin as ipc
left join historyjoin as h on ipc.id = h.EQUIPMENT_Id
and h.type like 'add'
left join circuitjoin as c on ipc.ADDRESS_Id = c.ADDRESS_Id
and c.GRADE_Code like '%hpna%'
join (select distinct address_id, equipment_id,
min(datecreated) as datecreated, comment
from history where comment like 'MAC: 5%' group by equipment_id, address_id, comment)
as h2 on c.address_id = h2.address_id
left join (select car.id, infport.name, carport.number, car.PCIRCUITGROUP_Id
from circuit as car (NOLOCK)
join port as carport (NOLOCK) on car.id = carport.CIRCUIT_Id
and carport.name like 'lead%'
and car.GRADE_Id = 29
join circuit as inf (NOLOCK) on car.CCIRCUITGROUP_Id = inf.PCIRCUITGROUP_Id
join port as infport (NOLOCK) on inf.id = infport.CIRCUIT_Id
and infport.name like '%olt%' )
as trace on c.ccircuitgroup_id = trace.pcircuitgroup_id
join addressjoin as ad (NOLOCK) on ipc.address_id = ad.id
The typical approach to only getting the lowest row is one of the following. You didn't bother to specify what version of SQL Server you're using, what you want to do with ties, and I have little interest to try to work this into your complex query, so I'll show you an abstract simplification for different versions.
SQL Server 2000
SELECT x.grouping_column, x.min_column, x.other_columns ...
FROM dbo.foo AS x
INNER JOIN
(
SELECT grouping_column, min_column = MIN(min_column)
FROM dbo.foo GROUP BY grouping_column
) AS y
ON x.grouping_column = y.grouping_column
AND x.min_column = y.min_column;
SQL Server 2005+
;WITH x AS
(
SELECT grouping_column, min_column, other_columns,
rn = ROW_NUMBER() OVER (PARTITION BY grouping_column ORDER BY min_column)
FROM dbo.foo
)
SELECT grouping_column, min_column, other_columns
FROM x
WHERE rn = 1;
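How ties are handled depends on the ranking function. The sketch below uses invented rows in a hypothetical table variable @foo, partitions by grouping_column so each group gets its own numbering, and compares ROW_NUMBER() with RANK().

```sql
-- Hypothetical sample rows; group 'a' has two rows tied on the lowest value.
DECLARE @foo TABLE (grouping_column char(1), min_column int, other_columns varchar(10));
INSERT INTO @foo VALUES
    ('a', 1, 'first'), ('a', 1, 'tied'), ('a', 2, 'later'), ('b', 5, 'only');

WITH x AS
(
    SELECT grouping_column, min_column, other_columns,
           rn = ROW_NUMBER() OVER (PARTITION BY grouping_column ORDER BY min_column),
           rk = RANK()       OVER (PARTITION BY grouping_column ORDER BY min_column)
    FROM @foo
)
SELECT grouping_column, min_column, other_columns, rn, rk
FROM x
WHERE rk = 1;  -- keeps both tied 'a' rows; WHERE rn = 1 would keep only one of them
```

With ROW_NUMBER(), which of the tied rows gets number 1 is arbitrary unless the ORDER BY includes a unique tiebreaker column.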
This subquery:
select distinct address_id, equipment_id,
min(datecreated) as datecreated, comment
from history where comment like 'MAC: 5%' group by equipment_id, address_id, comment
will probably return multiple rows per address, because comment is part of the GROUP BY and is not guaranteed to be the same.
Try this instead:
CROSS APPLY (
SELECT TOP 1 H2.DateCreated, H2.Comment -- H2.Equipment_id wasn't used
FROM History H2
WHERE
H2.Comment LIKE 'MAC: 5%'
AND C.Address_ID = H2.Address_ID
ORDER BY DateCreated
) H2
Switch that to OUTER APPLY in case you want rows that don't have a matching desired history entry.
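The difference between the two APPLY forms can be seen with a small invented data set. The table variables @circuit and @history below are hypothetical stand-ins for the real tables.

```sql
-- Hypothetical sample data: address 1 has two history rows, address 2 has none.
DECLARE @circuit TABLE (address_id int);
DECLARE @history TABLE (address_id int, datecreated date, comment varchar(50));
INSERT INTO @circuit VALUES (1), (2);
INSERT INTO @history VALUES
    (1, '2020-03-01', 'MAC: 5A'), (1, '2020-01-01', 'MAC: 5B');

SELECT c.address_id, h2.datecreated, h2.comment
FROM @circuit AS c
OUTER APPLY (
    SELECT TOP 1 h.datecreated, h.comment
    FROM @history AS h
    WHERE h.comment LIKE 'MAC: 5%'
      AND h.address_id = c.address_id
    ORDER BY h.datecreated
) AS h2;
-- Address 1 appears once, with its earliest history row.
-- Address 2 appears with NULLs; CROSS APPLY would drop it instead.
```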