Optimise query with count and order by function - sql-server

I have a problem with the optimization of this query, I have 3 tables (Products = Catalogo.GTIN, Sales Header = TEDEF.Factura and Sales Detail = TEDEF.Farmacia).
The query tries to find the Mode of the column VPRODEXENIGV_FAR. This query without the ORDER BY executes in less than 3 seconds (the table of details has about 30 million rows).
But when I add the ORDER BY clause, the query now takes more than 30 minutes to run.
I want to know how can I optimize this query or the indexes that I need to optimize this.
SELECT *
FROM Catalogo.GTIN G
CROSS APPLY
(SELECT TOP 1
COUNT(FAR.VPRODEXENIGV_FAR) [ROW],
YEAR(FAC2.VFECEMI_FAC) [AÑO],
MONTH(FAC2.VFECEMI_FAC) [MES],
FAR.VCODPROD_FAR_003,
CASE WHEN FAR.VPRODEXENIGV_FAR = 'A' THEN 1 ELSE 0 END AfectoIGV
FROM
TEDEF.Factura FAC2
INNER JOIN
TEDEF.Farmacia FAR ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
WHERE
G.CODIGO = FAR.VCODPROD_FAR_003
GROUP BY
YEAR(FAC2.VFECEMI_FAC),
MONTH(FAC2.VFECEMI_FAC),
FAR.VCODPROD_FAR_003,
FAR.VPRODEXENIGV_FAR
ORDER BY
1 DESC --- <----- THE PROBLEM IS HERE
) GG

Ouch! You have a hugely expensive dependent subquery. It's expensive because SELECT TOP(n) ... ORDER BY col DESC does a whole lot of work to create a result set only to discard all but one row. And, it's a dependent subquery so it's run for every row of Catalogo.GTIN .
It looks like you want to count the resultset rows in the most recent month and year for each Catalogo.GTIN row. So, let's try to refactor your query to do that.
We'll start with a subquery to grab the month-start date of the latest Factura row for each catalog entry.
SELECT CODIGO,
DATEFROMPARTS(YEAR(maxd), MONTH(maxd),1) maxmes
FROM (
SELECT MAX(FAC2.VFECEMI_FAC) maxd,
G.CODIGO
FROM Catalogo.GTIN G
JOIN TDEF.Farmacia FAR
ON G.CODIGO = FAR.VCODPROD_FAR_003
JOIN TEDEF.Factura FAC2
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
GROUP BY G.CODIGO
) maxd
It's wise to test this and make sure it works correctly and performs tolerably well. If you test it in SSMS, you can use "Show Actual Execution Plan" and see if it recommends an extra index. This subquery need only be run once, rather than once per G.CODIGO row.
Then we'll use it in your larger query.
SELECT G.*,
COUNT(FAR.VPRODEXENIGV_FAR) [ROW],
YEAR(FAC2.VFECEMI_FAC) [AÑO],
MONTH(FAC2.VFECEMI_FAC) [MES],
FAR.VCODPROD_FAR_003,
CASE WHEN FAR.VPRODEXENIGV_FAR = 'A' THEN 1 ELSE 0 END AfectoIGV
FROM Catalogo.GTIN G
JOIN (
SELECT CODIGO,
DATEFROMPARTS(YEAR(maxd), MONTH(maxd),1) maxmes
FROM (
SELECT MAX(FAC2.VFECEMI_FAC) maxd,
G.CODIGO
FROM Catalogo.GTIN G
JOIN TDEF.Farmacia FAR
ON G.CODIGO = FAR.VCODPROD_FAR_003
JOIN TEDEF.Factura FAC2
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
GROUP BY G.CODIGO
) maxd
) maxmes ON G.CODIGO = maxmes.CODIGO
JOIN TEDEF.Farmacia FAR
ON G.CODIGO = FAR.VCODPROD_FAR_003
JOIN TEDEF.Factura FAC2
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
AND FAC2.VFECEMI_FAC >= maxmes.maxmes
GROUP BY maxmes.maxmes,
G.CODIGO,
FAR.VCODPROD_FAR_003,
FAR.VPRODEXENIGV_FAR
Here is the tricky bit:
DATEFROMPARTS(YEAR(maxd), MONTH(maxd),1) maxmes turns any date maxd into the first day of that month.
And, FAC2.VFECEMI_FAC >= maxmes.maxmes filters out rows before the first day of that month (for that CODIGO). It does so in a sargable way: a way that can exploit an index on FAC2.VFECEMI_FAC.
That is an alternative way to do TOP(1) ORDER BY d DESC. And faster.
It's all about sets of rows. Especially when using GROUP BY, it's performance-helpful to limit the number of rows in each set.
Obviously I cannot debug this.

Is me again, Finally i resolve the problem of the optimization, now the query delay is about 20 sec (with the sort instruction and with the count in a table over 30 million rows) i hope this way can help others or could be optimice more by the community.
I resolve the problem applying the sort but with the Row_Number instruction, in that way the server take my index for the sort instruction and make the magic:
WITH x
AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY GG.COD, GG.[AÑO], GG.[MES] ORDER BY GG.[ROW] DESC) [ID]
FROM Catalogo.GTIN G
CROSS APPLY
(
SELECT COUNT(FAR.VPRODEXENIGV_FAR) [ROW]
, YEAR(FAC2.VFECEMI_FAC) [AÑO]
, MONTH(FAC2.VFECEMI_FAC) [MES]
, FAR.VCODPROD_FAR_003 [COD]
, CASE WHEN FAR.VPRODEXENIGV_FAR = 'A' THEN 1 ELSE 0 END AfectoIGV
FROM TEDEF.Factura FAC2
INNER JOIN TEDEF.Farmacia FAR
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
WHERE G.CODIGO = FAR.VCODPROD_FAR_003
GROUP BY YEAR(FAC2.VFECEMI_FAC)
, MONTH(FAC2.VFECEMI_FAC)
, FAR.VCODPROD_FAR_003
, FAR.VPRODEXENIGV_FAR
-- ORDER BY 1 DESC --- <---- this is the bad guy, please, don't do that xD
) GG
) SELECT *
FROM x WHERE ID = 1
In that way i can sort the Count instruction and calculate the Mode for the Column FAR.VPRODEXENIGV_FAR

Related

Replacement for UPDATE statement with ORDER BY clause

I am having a hard time trying to execute an update query that should contain ORDER BY clause, but I'm unable to find a proper solution yet.
UPDATE I
SET RefItemID = AQ.ID,
I.MagParamNum = AQ.MagParamNum
FROM SRO_VT_SHARD.._Items I
JOIN SRO_VT_SHARD.._Inventory INV ON INV.ItemID = I.ID64
JOIN SRO_VT_SHARD.._RefObjCommon ROC ON ROC.ID = I.RefItemID
JOIN _AEQItems AQ ON AQ.TypeID3 = ROC.TypeID3
AND AQ.TypeID4 = ROC.TypeID4
WHERE
INV.Slot BETWEEN 0 AND 13
AND INV.Slot != 8
AND AQ.ReqLevel1 <= #Data2
AND INV.CharID = #CharID
ORDER BY AQ.ReqLevel1 DESC
Basically my query should work this way if ORDER BY clause is usable inside an update statement, but it doesn't. Is there something I can do which should solve this?
Thanks a lot in advance.
You need to determine the exact row to update for each TypeID3 / TypeID4 combination, and you can't do that in the outer query. You may need to add additional ORDER BY clauses here to break ties. You may also want to specify only a subset of columns if you have an index that covers the columns in _AEQItems used to search and the columns you're updating.
;WITH AQ AS
(
SELECT *, rn = ROW_NUMBER() OVER
(PARTITION BY TypeID3, TypeID4 ORDER BY ReqLevel1 DESC)
FROM _AEQItems
)
UPDATE I
SET RefItemID = AQ.ID,
MagParamNum = AQ.MagParamNum
FROM SRO_VT_SHARD.._Items I
JOIN SRO_VT_SHARD.._Inventory INV ON INV.ItemID = I.ID64
JOIN SRO_VT_SHARD.._RefObjCommon ROC ON ROC.ID = I.RefItemID
JOIN AQ ON AQ.TypeID3 = ROC.TypeID3 AND AQ.TypeID4 = ROC.TypeID4
WHERE AQ.rn = 1
AND INV.Slot BETWEEN 0 AND 13
AND INV.Slot!=8
AND AQ.ReqLevel1 <= #Data2
AND INV.CharID = #CharID;
Use a subquery:
UPDATE I
SET RefItemID=AQ.ID,I.MagParamNum=AQ.MagParamNum
FROM SRO_VT_SHARD.._Items I
JOIN SRO_VT_SHARD.._Inventory INV ON INV.ItemID=I.ID64
JOIN SRO_VT_SHARD.._RefObjCommon ROC ON ROC.ID=I.RefItemID
JOIN (SELECT TypeID3, TypeID4, MAX(ReqLevel1) AS ReqLevel1 FROM _AEQItems GROUP BY TypeID3, TypeID4) AQ
ON AQ.TypeID3=ROC.TypeID3
AND AQ.TypeID4=ROC.TypeID4
WHERE INV.Slot BETWEEN 0 AND 13 AND INV.Slot!=8 AND AQ.ReqLevel1<=#Data2 AND INV.CharID=#CharID

How to place subquery in SQL Server 2008

I have a main SQL query which returns stock quantity in dates between and I also need to return the stock as on date along with between dates result.
Main query:
SELECT
TT.TranRId, MAX(TT.TInQty) AS InQty, MAX(TT.TOutQty) AS OutQty
FROM
TReg TR, TTrans TT
WHERE
TR.TRegId = TT.TrRegId
AND TT.Stid = 2
AND TR.TransDate BETWEEN '2018-08-25' AND '2018-08-28'
GROUP BY
TT.TranRId
ORDER BY
TT.TranRId
Sub query:
(SELECT TT.TransRId, (SUM(TT.TInQty) - SUM(TT.TOutQty))
FROM TTrans TT, TReg TR
WHERE TR.TransDate <= '2018-08-24'
AND TR.TRegId = TT.TrRegId
AND TT.Stid = 2 GROUP BY TT.TranRId) --AS Stock
Please help where I should include my sub query in my main query
To get output as follows:
TransRId Stock InQty OutQty
----------------------------------
41 700 1 1000
42 800 5 500
I am not sure I am following your question 100%, if you are just looking to join it as a sub query the below logic should work.
SELECT TT.TranRId
,MAX(TT.TInQty) AS InQty
,MAX(TT.TOutQty) AS OutQty
,Stock.[Sum]
FROM TReg TR
LEFT JOIN TTrans TT
ON TR.TRegId = TT.TrRegId
LEFT JOIN (
SELECT TT.TransRId
,(SUM(TT.TInQty) - SUM(TT.TOutQty)) as Sum
FROM TTrans TT
LEFT JOIN TReg TR
ON TR.TRegId = TT.TrRegId
WHERE TR.TransDate <= '2018-08-24'
AND TT.Stid = 2
GROUP BY TT.TranRId
) AS Stock
ON Stock.TranRId = TT.TranRId
WHERE TT.Stid = 2
AND TR.TransDate BETWEEN '2018-08-25' AND '2018-08-28'
GROUP BY TT.TranRId
ORDER BY TT.TranRId
edit:
I noticed tt.TranRId and tt.Tran***s***RId if this is not a typo it would need to be corrected, if not my answer will not work for you.
If you need specific dates, including the date in the join logic along with the ID would give you the appropriate result...without knowing your data set I am not sure if the TranRId is unique...sorry about that!

SQL mutliple count

Could you explain how I get this particular table result?
My 4 queries to individually get each column separately are also below.
I am not sure on method here do I nest the last 3 queries into the first or do I use a union between the queries.
Bearing in mind that the information in each one doesn't really match I assume Union or Union All isn't going to be useful.
Would a derived table be a better method. Sorry my SQL skills are fairly basic.
I need to also retain the ability to 'tweak' the where clauses as my admin decides to exclude certain records later (you IT folks will be used to that!)
Some the ability to alter the where clauses would be good in a solution.
Just to make it more annoying for ya ;-)
Query table would need to look a little like this
Company Department Total_B Total_R Total_Ret RushJobs
ACME LSD 2 100 24 3
The four queries (that work separately to get each column above are here ( I have left in the respective Group By and where clauses incidentally I_Department does map to just Department in the case of 2nd query.
-- Total B count query from B
Select Company,Department, count(*) as Total_B from B
Group by Company,Department
Order BY Company;
--Select h count from h table
Select count(*) as Total_R, I_Department from H
where L ='re-box'
Group By IDepartment
-- Select r count
Select Company,Department,Count (B_Number) AS Total_Ret
from P Inner Join B ON P.Record_Number = B.B_Number
where P.Request_Date > = 'SOMEDATE' and P.Request_Date < = 'SOMEDATERANGE'
Group By Company,Department
-- Select Rush Jobs
Select Company,Department,Count (*) as RushJobs
from Res
Inner Join B on Res.Item_Number = B.B_Number
where Res.Setup_Date >= 'Somedate' and Res.Setup_Date<= 'somedaterange'
and Res.Res_Priority = '1'
Group By Company,Department
So final table
<table><TBODY>
<TR>
<TH>Company</TH>
<TH>Department</TH>
<TH>Total_B</TH>
<TH>Total_R </TH>
<TH>Total_Ret</TH>
<TH>RushJobs</TH></TR>
<TR>
<TD>ACME</TD>
<TD>LSD</TD>
<TD>100</TD>
<TD>2</TD>
<TD>4</TD>
<TD>1</TD></TR></TBODY></table>
One approach would be to use a Common table expression (CTE) aka with statement..
This allows each query to continue to be independent allowing you to easily twerk (I was going to correct that typo but it was just too funny) the where clauses for each and combines the results in the end returning 1 record with 4 columns.
-- Total B count query from B
With B as (
Select Company,Department, count(*) as Total_B from B
Group by Company,Department
Order BY Company),
H as (
--Select h count from h table
Select count(*) as Total_R, I_Department from H
where L ='re-box'
Group By IDepartment),
R as (-- Select r count
Select Company,Department,Count (B_Number) AS Total_Ret
from P Inner Join B ON P.Record_Number = B.B_Number
where P.Request_Date > = 'SOMEDATE' and P.Request_Date < = 'SOMEDATERANGE'
Group By Company,Department),
RushJobs as (-- Select Rush Jobs
Select Company,Department,Count (*) as RushJobs
from Res
Inner Join B on Res.Item_Number = B.B_Number
where Res.Setup_Date >= 'Somedate' and Res.Setup_Date<= 'somedaterange'
and Res.Res_Priority = '1'
Group By Company,Department)
SELECT coalesce(B.Company, R.Company, RJ.Company)
, coalesce(B.Department,R.Department, Rj.Department)
, B.Total_B, H.Total_R, R.Total_Ret, RJ.RushJobs
FROM
FULL OUTER JOIN H
on B.Company = H.Company
FULL OUTER JOIN R
on B.company = R.Company
and B.Department = R.Department
FULL OUTER JOIN RushJobs RJ
on H.company = RJ.Company
and H.Department = RJ.Department

How to display only the MAX results from a query

I am new to writing MS SQL queries and I am trying to display only the record with the highest field named RecordVersion.
Below is the query that works but displays all records:
SELECT
PriceCalendars.PriceProgramID,
PriceCalendars.EffectiveDateTime,
PriceSchedules.Price,
PriceSchedules.PLU,
items.Descr,
PriceSchedules.LastUpdate,
PriceSchedules.LastUpdatedBy,
PriceSchedules.RecordVersion,
PriceSchedules.PriceScheduleUniqueID
FROM
PriceCalendars
INNER JOIN PriceSchedules ON PriceCalendars.PriceProgramID = PriceSchedules.PriceProgramID
INNER JOIN items ON PriceSchedules.PLU = items.PLU
WHERE
(PriceSchedules.PLU = 'SLS10100103')
AND (PriceCalendars.EffectiveDateTime = '2016-03-22')
Here are the query results:
PriceProgramID EffectiveDateTime Price PLU Descr LastUpdate LastUpdatedBy RecordVersion PriceScheduleUniqueID
1 2016-03-22 00:00:00.000 35.00 SLS10100103 Architecture Adult from NP POS 2015-01-22 07:53:15.000 GX70,83 9 569
1 2016-03-22 00:00:00.000 32.00 SLS10100103 Architecture Adult from NP POS 2014-02-25 16:22:46.000 GX70,83 5 86180
The first line of the results has RecordVersion being 9 and the second line results is 5, I only want the higher record displaying, the one that returned RecordVersion = 9.
Every time I try to use the MAX command I get errors or the group by and I have tried every example I could find on the web but nothing seems to work.
Using MS SQL 2012.
Thanks,
Ken
Try the following query which attempts to solve your problem by ordering the returned rows by RecordVersion DESC and then SELECTs just the first row.
SELECT TOP 1
PriceCalendars.PriceProgramID,
PriceCalendars.EffectiveDateTime,
PriceSchedules.Price,
PriceSchedules.PLU,
items.Descr,
PriceSchedules.LastUpdate,
PriceSchedules.LastUpdatedBy,
PriceSchedules.RecordVersion,
PriceSchedules.PriceScheduleUniqueID
FROM
PriceCalendars
INNER JOIN PriceSchedules ON PriceCalendars.PriceProgramID = PriceSchedules.PriceProgramID
INNER JOIN items ON PriceSchedules.PLU = items.PLU
WHERE
(PriceSchedules.PLU = 'SLS10100103')
AND (PriceCalendars.EffectiveDateTime = '2016-03-22')
ORDER BY
RecordVersion DESC
All group by columns should be in select ,that's the rule of group by.How group by works is for every distinct combination of group by columns,arrange remaining columns into groups,so that any aggregation can be applied,in your case I am not sure what group by columns are unique with out test date.here is one version which use row number which gives you the output desired
Remember ,order by last updated date is the one which decides rows order and assign numbers
WITH CTE
AS
(
SELECT PriceCalendars.PriceProgramID,
PriceCalendars.EffectiveDateTime,
PriceSchedules.Price,
PriceSchedules.PLU,
items.Descr,
PriceSchedules.LastUpdate,
PriceSchedules.LastUpdatedBy,
PriceSchedules.RecordVersion,
PriceSchedules.PriceScheduleUniqueID,
ROW_NUMBER() OVER (PARTITION BY PriceSchedules.RecordVersion ORDER BY PriceSchedules.LastUpdatedBy) AS RN
FROM
PriceCalendars
INNER JOIN PriceSchedules ON PriceCalendars.PriceProgramID = PriceSchedules.PriceProgramID
INNER JOIN items ON PriceSchedules.PLU = items.PLU
WHERE
(PriceSchedules.PLU = 'SLS10100103')
AND (PriceCalendars.EffectiveDateTime = '2016-03-22')
)
SELECT * FROM CTE WHERE RN=1

TSQL - Return recent date

Having issues getting a dataset to return with one date per client in the query.
Requirements:
Must have the recent date of transaction per client list for user
Will need have the capability to run through EXEC
Current Query:
SELECT
c.client_uno
, c.client_code
, c.client_name
, c.open_date
into #AttyClnt
from hbm_client c
join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = #login
and c.status_code = 'C'
select
ba.payr_client_uno as client_uno
, max(ba.tran_date) as tran_date
from blt_bill_amt ba
left outer join #AttyClnt ac on ba.payr_client_uno = ac.client_uno
where ba.tran_type IN ('RA', 'CR')
group by ba.payr_client_uno
Currently, this query will produce at least 1 row per client with a date, the problem is that there are some clients that will have between 2 and 10 dates associated with them bloating the return table to about 30,000 row instead of an idealistic 246 rows or less.
When i try doing max(tran_uno) to get the most recent transaction number, i get the same result, some have 1 value and others have multiple values.
The bigger picture has 4 other queries being performed doing other parts, i have only included the parts that pertain to the question.
Edit (2011-10-14 # 1:45PM):
select
ba.payr_client_uno as client_uno
, max(ba.row_uno) as row_uno
into #Bills
from blt_bill_amt ba
inner join hbm_matter m on ba.matter_uno = m.matter_uno
inner join hbm_client c on m.client_uno = c.client_uno
inner join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = #login
and c.status_code = 'C'
and ba.tran_type in ('CR', 'RA')
group by ba.payr_client_uno
order by ba.payr_client_uno
--Obtain list of Transaction Date and Amount for the Transaction
select
b.client_uno
, ba.tran_date
, ba.tc_total_amt
from blt_bill_amt ba
inner join #Bills b on ba.row_uno = b.row_uno
Not quite sure what was going on but seems the Temp Tables were not acting right at all. Ideally i would have 246 rows of data, but with the previous query syntax it would produce from 400-5000 rows of data, obviously duplications on data.
I think you can use ranking to achieve what you want:
WITH ranked AS (
SELECT
client_uno = ba.payr_client_uno,
ba.tran_date,
be.tc_total_amt,
rnk = ROW_NUMBER() OVER (
PARTITION BY ba.payr_client_uno
ORDER BY ba.tran_uno DESC
)
FROM blt_bill_amt ba
INNER JOIN hbm_matter m ON ba.matter_uno = m.matter_uno
INNER JOIN hbm_client c ON m.client_uno = c.client_uno
INNER JOIN hbm_persnl p ON c.resp_empl_uno = p.empl_uno
WHERE p.login = #login
AND c.status_code = 'C'
AND ba.tran_type IN ('CR', 'RA')
)
SELECT
client_uno,
tran_date,
tc_total_amt
FROM ranked
WHERE rnk = 1
ORDER BY client_uno
Useful reading:
Ranking Functions (Transact-SQL)
ROW_NUMBER (Transact-SQL)
WITH common_table_expression (Transact-SQL)
Using Common Table Expressions

Resources