Conditional subquery that returns a running total - sql-server

I am trying to run a subquery with a condition that returns a running total. However, I am receiving the following error:
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
Is there any way this code can be salvaged? Please be aware this code is part of a larger script that executes perfectly. The reason I need to keep it in this format is because it is the "missing piece", for lack of a better word.
SELECT A.[WeekEnding],
(
SELECT SUM(A.[Weekly Sales Units]), A.[Description], A.[WeekEnding]
FROM [FACT_SALES_HISTORY] A
INNER JOIN [DIM_DATE] B
ON A.WeekEnding = B.[WeekEnding] WHERE B.[YA Latest 1 Week] = 1
GROUP BY A.[Description], A.[WeekEnding]
) AS 'YA Units'
FROM [FACT_SALES_HISTORY] A
LEFT JOIN [DIM_DATE] B
ON A.WeekEnding = B.[WeekEnding]
The output data, from the code, would look like the following:
[Weekly Sales Units] A.[Description] A.[WeekEnding]
24 Item One 03-10-2010
55 Item Two 03-10-2010
79 Item One 03-10-2010
98 Item Five 03-10-2010
11 Item Five 03-10-2010

A subquery used as a column in the SELECT list must return exactly one column (and at most one row), so you can't select three different items in it and then alias the result with AS. You could split that into two separate queries and then union them.
SELECT SUM(A.[Weekly Sales Units]), A.[Description], A.[WeekEnding]
FROM [FACT_SALES_HISTORY] A
INNER JOIN [DIM_DATE] B
ON A.WeekEnding = B.[WeekEnding] WHERE B.[YA Latest 1 Week] = 1
GROUP BY A.[Description], A.[WeekEnding]
UNION ALL -- keeps duplicate rows; both SELECTs must return the same number of columns
SELECT A.[WeekEnding]...<Your other columns>
FROM [FACT_SALES_HISTORY] A

It looks like your subquery on its own would provide the sample data, and the outer query is trying to sum that up by WeekEnding. If that is the case, then the whole thing could be replaced with this:
SELECT A.[WeekEnding], SUM(A.[Weekly Sales Units]) [YA Units]
FROM [FACT_SALES_HISTORY] A
INNER JOIN [DIM_DATE] B
ON A.WeekEnding = B.[WeekEnding] WHERE B.[YA Latest 1 Week] = 1
GROUP BY A.[WeekEnding]
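To try the single-query version without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in. The table and column names mirror the question, but the rows are made up for illustration, and SQLite accepts the bracketed identifiers only for compatibility with SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FACT_SALES_HISTORY (
    [Description] TEXT, [WeekEnding] TEXT, [Weekly Sales Units] INTEGER);
CREATE TABLE DIM_DATE ([WeekEnding] TEXT, [YA Latest 1 Week] INTEGER);

-- Illustrative rows only (not the asker's data).
INSERT INTO FACT_SALES_HISTORY VALUES
    ('Item One',  '2010-03-10', 24),
    ('Item Two',  '2010-03-10', 55),
    ('Item One',  '2010-03-10', 79),
    ('Item Five', '2010-03-17', 98);
INSERT INTO DIM_DATE VALUES ('2010-03-10', 1), ('2010-03-17', 0);
""")


def ya_units(conn):
    """Sum units per WeekEnding, keeping only weeks flagged in DIM_DATE."""
    sql = """
    SELECT A.[WeekEnding], SUM(A.[Weekly Sales Units]) AS [YA Units]
    FROM FACT_SALES_HISTORY A
    INNER JOIN DIM_DATE B ON A.[WeekEnding] = B.[WeekEnding]
    WHERE B.[YA Latest 1 Week] = 1
    GROUP BY A.[WeekEnding]
    """
    return conn.execute(sql).fetchall()


rows = ya_units(conn)
print(rows)  # [('2010-03-10', 158)]
```

The query returns one row per flagged week, with a single aggregated column, which is why it avoids the "only one expression" error.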

Related

Optimise query with count and order by function

I have a problem optimizing this query. I have 3 tables (Products = Catalogo.GTIN, Sales Header = TEDEF.Factura and Sales Detail = TEDEF.Farmacia).
The query tries to find the mode of the column VPRODEXENIGV_FAR. Without the ORDER BY, it executes in less than 3 seconds (the detail table has about 30 million rows).
But when I add the ORDER BY clause, the query takes more than 30 minutes to run.
I want to know how I can optimize this query, or which indexes I need to add.
SELECT *
FROM Catalogo.GTIN G
CROSS APPLY
(SELECT TOP 1
COUNT(FAR.VPRODEXENIGV_FAR) [ROW],
YEAR(FAC2.VFECEMI_FAC) [AÑO],
MONTH(FAC2.VFECEMI_FAC) [MES],
FAR.VCODPROD_FAR_003,
CASE WHEN FAR.VPRODEXENIGV_FAR = 'A' THEN 1 ELSE 0 END AfectoIGV
FROM
TEDEF.Factura FAC2
INNER JOIN
TEDEF.Farmacia FAR ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
WHERE
G.CODIGO = FAR.VCODPROD_FAR_003
GROUP BY
YEAR(FAC2.VFECEMI_FAC),
MONTH(FAC2.VFECEMI_FAC),
FAR.VCODPROD_FAR_003,
FAR.VPRODEXENIGV_FAR
ORDER BY
1 DESC --- <----- THE PROBLEM IS HERE
) GG
Ouch! You have a hugely expensive dependent subquery. It's expensive because SELECT TOP(n) ... ORDER BY col DESC does a whole lot of work to create a result set only to discard all but one row. And it's a dependent subquery, so it's run for every row of Catalogo.GTIN.
It looks like you want to count the result-set rows in the most recent month and year for each Catalogo.GTIN row. So, let's try to refactor your query to do that.
We'll start with a subquery to grab the month-start date of the latest Factura row for each catalog entry.
SELECT CODIGO,
DATEFROMPARTS(YEAR(maxd), MONTH(maxd),1) maxmes
FROM (
SELECT MAX(FAC2.VFECEMI_FAC) maxd,
G.CODIGO
FROM Catalogo.GTIN G
JOIN TEDEF.Farmacia FAR
ON G.CODIGO = FAR.VCODPROD_FAR_003
JOIN TEDEF.Factura FAC2
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
GROUP BY G.CODIGO
) maxd
It's wise to test this and make sure it works correctly and performs tolerably well. If you test it in SSMS, you can use "Show Actual Execution Plan" and see if it recommends an extra index. This subquery need only be run once, rather than once per G.CODIGO row.
Then we'll use it in your larger query.
SELECT G.CODIGO, -- G.* is not allowed here: every non-aggregated column must appear in GROUP BY
COUNT(FAR.VPRODEXENIGV_FAR) [ROW],
YEAR(maxmes.maxmes) [AÑO],
MONTH(maxmes.maxmes) [MES],
FAR.VCODPROD_FAR_003,
CASE WHEN FAR.VPRODEXENIGV_FAR = 'A' THEN 1 ELSE 0 END AfectoIGV
FROM Catalogo.GTIN G
JOIN (
SELECT CODIGO,
DATEFROMPARTS(YEAR(maxd), MONTH(maxd),1) maxmes
FROM (
SELECT MAX(FAC2.VFECEMI_FAC) maxd,
G.CODIGO
FROM Catalogo.GTIN G
JOIN TEDEF.Farmacia FAR
ON G.CODIGO = FAR.VCODPROD_FAR_003
JOIN TEDEF.Factura FAC2
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
GROUP BY G.CODIGO
) maxd
) maxmes ON G.CODIGO = maxmes.CODIGO
JOIN TEDEF.Farmacia FAR
ON G.CODIGO = FAR.VCODPROD_FAR_003
JOIN TEDEF.Factura FAC2
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
AND FAC2.VFECEMI_FAC >= maxmes.maxmes
GROUP BY maxmes.maxmes,
G.CODIGO,
FAR.VCODPROD_FAR_003,
FAR.VPRODEXENIGV_FAR
Here is the tricky bit:
DATEFROMPARTS(YEAR(maxd), MONTH(maxd),1) maxmes turns any date maxd into the first day of that month.
And, FAC2.VFECEMI_FAC >= maxmes.maxmes filters out rows before the first day of that month (for that CODIGO). It does so in a sargable way: a way that can exploit an index on FAC2.VFECEMI_FAC.
That is an alternative way to do TOP(1) ORDER BY d DESC. And faster.
It's all about sets of rows. Especially when using GROUP BY, it's performance-helpful to limit the number of rows in each set.
Obviously I cannot debug this.
It's me again. I finally solved the optimization problem: the query now takes about 20 seconds (with the sort and the count, on a table of over 30 million rows). I hope this helps others, or that the community can optimize it further.
I solved it by doing the sort with ROW_NUMBER() instead. That way the server uses my index for the sort and does the magic:
WITH x
AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY GG.COD, GG.[AÑO], GG.[MES] ORDER BY GG.[ROW] DESC) [ID]
FROM Catalogo.GTIN G
CROSS APPLY
(
SELECT COUNT(FAR.VPRODEXENIGV_FAR) [ROW]
, YEAR(FAC2.VFECEMI_FAC) [AÑO]
, MONTH(FAC2.VFECEMI_FAC) [MES]
, FAR.VCODPROD_FAR_003 [COD]
, CASE WHEN FAR.VPRODEXENIGV_FAR = 'A' THEN 1 ELSE 0 END AfectoIGV
FROM TEDEF.Factura FAC2
INNER JOIN TEDEF.Farmacia FAR
ON FAC2.VTDOCPAGO_FAC = FAR.VTDOCPAGO_FAC
AND FAC2.VNDOCPAGO_FAC = FAR.VNDOCPAGO_FAC
WHERE G.CODIGO = FAR.VCODPROD_FAR_003
GROUP BY YEAR(FAC2.VFECEMI_FAC)
, MONTH(FAC2.VFECEMI_FAC)
, FAR.VCODPROD_FAR_003
, FAR.VPRODEXENIGV_FAR
-- ORDER BY 1 DESC --- <---- this is the bad guy, please, don't do that xD
) GG
) SELECT *
FROM x WHERE ID = 1
That way I can sort by the count and calculate the mode of the column FAR.VPRODEXENIGV_FAR.
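The same "count per group, then keep the top-ranked row" pattern can be sketched on a toy table. This is only an illustration in Python with the standard-library sqlite3 module (window functions need SQLite 3.25 or newer). The `sales` table and its columns are hypothetical stand-ins for the joined Factura/Farmacia rows, and the tie-break on `exen` is an added assumption so that the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-in for the joined Factura/Farmacia rows.
CREATE TABLE sales (cod TEXT, anio INTEGER, mes INTEGER, exen TEXT);
INSERT INTO sales VALUES
    ('P1', 2019, 1, 'A'), ('P1', 2019, 1, 'A'), ('P1', 2019, 1, 'B'),
    ('P1', 2019, 2, 'B'),
    ('P2', 2019, 1, 'B'), ('P2', 2019, 1, 'B'), ('P2', 2019, 1, 'A');
""")

MODE_SQL = """
WITH counted AS (
    -- How often each value occurs per product and month.
    SELECT cod, anio, mes, exen, COUNT(*) AS n
    FROM sales
    GROUP BY cod, anio, mes, exen
),
ranked AS (
    -- Rank the values inside each product/month by descending count.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY cod, anio, mes
                              ORDER BY n DESC, exen) AS rn
    FROM counted
)
SELECT cod, anio, mes, exen, n
FROM ranked
WHERE rn = 1          -- the most frequent value, i.e. the mode
ORDER BY cod, anio, mes
"""

modes = conn.execute(MODE_SQL).fetchall()
for row in modes:
    print(row)
```

Each product/month pair comes back once, carrying its most frequent value and that value's count.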

How to use GROUPING function in SQL common table expression - CTE

I have the T-SQL CTE code below, where I'm trying to do some row grouping on four columns, i.e. Product, ItemClassification, Name and Number.
;WITH CTE_FieldData
AS (
SELECT
CASE(GROUPING(M.CodeName))
WHEN 0 THEN M.CodeName
WHEN 1 THEN 'Total'
END AS Product,
CASE(GROUPING(KK.ItemClassification))
WHEN 0 THEN KK.[ItemClassification]
WHEN 1 THEN 'N/A'
END AS [ItemClassification],
CASE(GROUPING(C.[Name]))
WHEN 0 THEN ''
WHEN 1 THEN 'Category - '+ '('+ItemClassification+')'
END AS [Name],
CASE(GROUPING(PYO.Number))
WHEN 0 THEN PYO.Number
WHEN 1 THEN '0'
END AS [Number],
ISNULL(C.[Name],'') AS ItemCode,
MAX(ISNULL(PYO.Unit, '')) AS Unit,
MAX(ISNULL(BT.TypeName, '')) AS [Water Type],
MAX(ISNULL(PYO.OrderTime, '')) AS OrderTime,
MAX(ISNULL(BUA.Event, '')) AS Event,
MAX(ISNULL(PYO.Remarks, '')) AS Remarks,
GROUPING(M.CodeName) AS ProductGrouping,
GROUPING(KK.ItemClassification) AS CategoryGrouping,
GROUPING(C.[Name]) AS ItemGrouping
FROM CTable C INNER JOIN CTableProducts CM ON C.Id = CM.Id
INNER JOIN MyData R ON R.PId = CM.PId
INNER JOIN MyDataDetails PYO ON PYO.CId = C.CId AND PYO.ReportId = R.ReportId
INNER JOIN ItemCategory KK ON C.KId = KK.KId
INNER JOIN Product M ON R.ProductId = M.ProductId
INNER JOIN WaterType BT ON PYO.WId = BT.WId
INNER JOIN WaterUnit BUA ON PYO.WUId = BUA.WUId
WHERE R.ReportId = 4360
GROUP BY M.CodeName, KK.ItemClassification, C.Name, PYO.Number
WITH ROLLUP
)
SELECT
Product,
[Name] AS Category,
Number,
Unit as ItemCode,
[Water Type],
OrderTime,
[Event],
[Comment]
FROM CTE_FieldData
Below are the problems with the data returned by the script above; they are the ones I'm trying to fix.
1) At the end of each ItemClassification grouping, an extra record is being added even though it does not exist in the table. (See lines 4 & 10 in the attached sample query results screenshot.)
2) I want the ItemClassification grouping in column 2 to be at the beginning of the group, not at the end of the group.
That way, ItemClassification "Category- (One)" would be at line 1, not the current line 5.
Also, ItemClassification "Category- (Two)" would be at line 5, not the current line 11.
3) Where the "ItemClassification" is displayed, I would like the columns (Number, ItemCode, [Water Type], [OrderTime], [Event], [Comment]) to display null.
In the attached sample query results screenshot, those would be rows 11 & 5.
4) The last row (13) is also unwanted.
I'm trying to understand SQL CTEs and the GROUPING function, but I'm not getting things right.
It looks like this is mostly caused by WITH ROLLUP and GROUPING. ROLLUP adds a subtotal row for each level of your grouping, plus a grand total row. In those extra rows, the grouping columns that have been rolled up come back as NULL. GROUPING() returns 1 for a rolled-up column, which is how your query relabels those NULLs as 'Total', 'N/A' or 'Category'.
1) Caused by GROUPING and ROLLUP: the extra records are the subtotal rows. Take away both and this should be resolved.
2) I'm not sure what determines your groups or what you would define as the beginning or end, but an ORDER BY should suffice.
3) Use ISNULL or CASE WHEN. If ItemClassification has a non-NULL, non-blank value, set each of those fields to NULL.
4) Take off WITH ROLLUP.

How to display only the MAX results from a query

I am new to writing MS SQL queries and I am trying to display only the record with the highest field named RecordVersion.
Below is the query that works but displays all records:
SELECT
PriceCalendars.PriceProgramID,
PriceCalendars.EffectiveDateTime,
PriceSchedules.Price,
PriceSchedules.PLU,
items.Descr,
PriceSchedules.LastUpdate,
PriceSchedules.LastUpdatedBy,
PriceSchedules.RecordVersion,
PriceSchedules.PriceScheduleUniqueID
FROM
PriceCalendars
INNER JOIN PriceSchedules ON PriceCalendars.PriceProgramID = PriceSchedules.PriceProgramID
INNER JOIN items ON PriceSchedules.PLU = items.PLU
WHERE
(PriceSchedules.PLU = 'SLS10100103')
AND (PriceCalendars.EffectiveDateTime = '2016-03-22')
Here are the query results:
PriceProgramID EffectiveDateTime Price PLU Descr LastUpdate LastUpdatedBy RecordVersion PriceScheduleUniqueID
1 2016-03-22 00:00:00.000 35.00 SLS10100103 Architecture Adult from NP POS 2015-01-22 07:53:15.000 GX70,83 9 569
1 2016-03-22 00:00:00.000 32.00 SLS10100103 Architecture Adult from NP POS 2014-02-25 16:22:46.000 GX70,83 5 86180
The first line of the results has RecordVersion 9 and the second line has 5. I only want the higher record displayed, the one with RecordVersion = 9.
Every time I try to use MAX I get errors about the GROUP BY. I have tried every example I could find on the web, but nothing seems to work.
Using MS SQL 2012.
Thanks,
Ken
Try the following query which attempts to solve your problem by ordering the returned rows by RecordVersion DESC and then SELECTs just the first row.
SELECT TOP 1
PriceCalendars.PriceProgramID,
PriceCalendars.EffectiveDateTime,
PriceSchedules.Price,
PriceSchedules.PLU,
items.Descr,
PriceSchedules.LastUpdate,
PriceSchedules.LastUpdatedBy,
PriceSchedules.RecordVersion,
PriceSchedules.PriceScheduleUniqueID
FROM
PriceCalendars
INNER JOIN PriceSchedules ON PriceCalendars.PriceProgramID = PriceSchedules.PriceProgramID
INNER JOIN items ON PriceSchedules.PLU = items.PLU
WHERE
(PriceSchedules.PLU = 'SLS10100103')
AND (PriceCalendars.EffectiveDateTime = '2016-03-22')
ORDER BY
RecordVersion DESC
Every non-aggregated column in the SELECT list must also appear in the GROUP BY clause; that's the rule of GROUP BY. It works by arranging the rows into groups, one per distinct combination of the GROUP BY columns, so that an aggregate can be applied to the remaining columns. In your case I am not sure which GROUP BY columns are unique without test data. Here is a version that uses ROW_NUMBER() and gives you the desired output.
Remember, the ORDER BY inside OVER() is what decides the row order and assigns the numbers.
WITH CTE
AS
(
SELECT PriceCalendars.PriceProgramID,
PriceCalendars.EffectiveDateTime,
PriceSchedules.Price,
PriceSchedules.PLU,
items.Descr,
PriceSchedules.LastUpdate,
PriceSchedules.LastUpdatedBy,
PriceSchedules.RecordVersion,
PriceSchedules.PriceScheduleUniqueID,
ROW_NUMBER() OVER (PARTITION BY PriceSchedules.PLU ORDER BY PriceSchedules.RecordVersion DESC) AS RN
FROM
PriceCalendars
INNER JOIN PriceSchedules ON PriceCalendars.PriceProgramID = PriceSchedules.PriceProgramID
INNER JOIN items ON PriceSchedules.PLU = items.PLU
WHERE
(PriceSchedules.PLU = 'SLS10100103')
AND (PriceCalendars.EffectiveDateTime = '2016-03-22')
)
SELECT * FROM CTE WHERE RN=1
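As a small runnable check of the ROW_NUMBER() idea, the sketch below partitions by PLU and orders by RecordVersion descending, so the highest version within each PLU gets RN = 1. It uses Python's standard-library sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer). The table is a cut-down, assumed version of PriceSchedules, and the 'OTHER-PLU' row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PriceSchedules (
    PriceScheduleUniqueID INTEGER, PLU TEXT, Price REAL, RecordVersion INTEGER);
INSERT INTO PriceSchedules VALUES
    (569,   'SLS10100103', 35.00, 9),
    (86180, 'SLS10100103', 32.00, 5),
    (1,     'OTHER-PLU',   10.00, 2);   -- made-up extra row
""")

LATEST_SQL = """
WITH CTE AS (
    SELECT PriceScheduleUniqueID, PLU, Price, RecordVersion,
           -- Number the rows of each PLU, highest RecordVersion first.
           ROW_NUMBER() OVER (PARTITION BY PLU
                              ORDER BY RecordVersion DESC) AS RN
    FROM PriceSchedules
)
SELECT PLU, Price, RecordVersion, PriceScheduleUniqueID
FROM CTE
WHERE RN = 1
ORDER BY PLU
"""

latest = conn.execute(LATEST_SQL).fetchall()
for row in latest:
    print(row)
```

Unlike TOP 1, this form still works when the WHERE clause matches several PLUs, because it keeps one row per partition rather than one row overall.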

Counting duplicate items in different order

Goal:
To know whether we have purchased the same StockCode or Stock Description more than once on different purchase orders.
So, if we purchase Part ABC on Purchase Order 1 and Purchase Order 2, it should return the result of
PurchaseOrders, Part#, Qty
Purchase Order1, Purchase Order2, ABC, 2
I just don't know how to pull the whole query together. More to the point, how do I know whether it has occurred on more than one Purchase Order without scrolling through all the results? It may also need multiple HAVING COUNT conditions, as I only seem to be checking by StockCode.
SELECT t1.PurchaseOrder,
t1.MStockCode,
Count(t1.MStockCode) AS SCCount,
t1.MStockDes,
Count(t1.MStockDes) AS DescCount
FROM PorMasterDetail t1
INNER JOIN PorMasterHdr t2
ON t1.PurchaseOrder = t2.PurchaseOrder
WHERE Year(t2.OrderEntryDate) = Year(Getdate())
AND Month(t2.OrderEntryDate) = Month(Getdate())
GROUP BY t1.PurchaseOrder,
t1.MStockCode,
t1.MStockDes
HAVING Count(t1.MStockCode) > 1
Using the responses, I came up with the following:
select * from
(
SELECT COUNT(dbo.InvMaster.StockCode) AS Count, dbo.InvMaster.StockCode AS StockCodes,
dbo.PorMasterDetail.PurchaseOrder, dbo.PorMasterHdr.OrderEntryDate
FROM dbo.InvMaster INNER JOIN dbo.PorMasterDetail ON
dbo.InvMaster.StockCode = dbo.PorMasterDetail.MStockCode
INNER JOIN dbo.PorMasterHdr ON dbo.PorMasterDetail.PurchaseOrder = dbo.PorMasterHdr.PurchaseOrder
WHERE YEAR(dbo.PorMasterHdr.OrderEntryDate) = YEAR(GETDATE())
GROUP BY dbo.InvMaster.StockCode, dbo.InvMaster.StockCode,
dbo.PorMasterDetail.PurchaseOrder, dbo.PorMasterHdr.OrderEntryDate
) Count
Where Count.Count > 1
This returns the results below, which are starting to be a bit more helpful.
In result lines 2, 3 and 4 we can see the same stock code (*30044) ordered 3 times on different
purchase orders.
I guess the question is: is it possible to check whether something was ordered more than once within, say, a 30-day period?
Is this possible?
Count StockCodes PurchaseOrder OrderEntryDate
2 *12.0301.0021 322959 2014-09-08
2 *30044 320559 2014-01-21
8 *30044 321216 2014-03-26
4 *30044 321648 2014-05-08
5 *32317 321216 2014-03-26
4 *4F-130049/TEST 323353 2014-10-22
5 *650-1157/E 322112 2014-06-24
2 *650-1757 321226 2014-03-27
SELECT *
FROM
(
SELECT h.OrderEntryDate, d.*,
COUNT(*) OVER (PARTITION BY d.MStockCode) DupeCount
FROM
PorMasterHdr h
INNER JOIN PorMasterDetail d ON
d.PurchaseOrder = h.PurchaseOrder
WHERE
-- first day of current month
-- http://blog.sqlauthority.com/2007/05/13/sql-server-query-to-find-first-and-last-day-of-current-month/
h.OrderEntryDate >= CONVERT(VARCHAR(25), DATEADD(dd,-(DAY(GETDATE())-1),GETDATE()),101)
) dupes
WHERE
dupes.DupeCount > 1;
This should work if you're only deduping on stock code. I was a little unclear if you wanted to dedupe on both stock code and stock desc, or either stock code or stock desc.
Also I was unclear on your return columns because it almost looks like you're wanting to pivot the columns so that both purchase order numbers appear on the same line.
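To see how the windowed count flags duplicates, here is a minimal sketch in Python with the standard-library sqlite3 module (window functions need SQLite 3.25 or newer). The table is a two-column, assumed reduction of PorMasterDetail, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PorMasterDetail (PurchaseOrder TEXT, MStockCode TEXT);
-- Invented rows: ABC appears on two purchase orders, the others once.
INSERT INTO PorMasterDetail VALUES
    ('PO1', 'ABC'), ('PO2', 'ABC'), ('PO2', 'XYZ'), ('PO3', 'DEF');
""")

DUPES_SQL = """
SELECT PurchaseOrder, MStockCode, DupeCount
FROM (
    SELECT d.PurchaseOrder, d.MStockCode,
           -- Every row carries the number of rows sharing its stock code.
           COUNT(*) OVER (PARTITION BY d.MStockCode) AS DupeCount
    FROM PorMasterDetail d
) dupes
WHERE DupeCount > 1
ORDER BY MStockCode, PurchaseOrder
"""

dupes = conn.execute(DUPES_SQL).fetchall()
print(dupes)  # [('PO1', 'ABC', 2), ('PO2', 'ABC', 2)]
```

Note that COUNT(*) counts detail rows, so a stock code listed twice on the same purchase order would also be flagged. If that matters, the rows would need to be reduced to distinct (PurchaseOrder, MStockCode) pairs first.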

Using MIN() to get the lowest value, but i got two rows?

I'm using a SQL query where I want to find the lowest value from the field prod_week.
This is the query:
SELECT
MIN(oe.prod_week), oe.prodplan_id
FROM
pd_mounting_details as md
LEFT OUTER JOIN
pd_order_eco AS oe ON md.order_data = oe.id
LEFT OUTER JOIN
pd_article AS a ON md.article = a.id
WHERE
oe.status = 4
AND (md.starttime = '' OR md.starttime IS NULL)
AND (a.production_group = 4)
AND (NOT (oe.amount = 0))
GROUP BY
oe.prodplan_id
The result of this is
prod_week | prodplan_id
1126 | 27
1127 | 28
What I don't understand is why this results in two rows when I used MIN(prod_week) to get the row with the lowest week number.
If I remove prodplan_id from the selection it all works and I get one row where prod_week is "1126". And from that, all I want is to get the prodplan_id too.
I hope this question isn't too blurry?
You are using GROUP BY, which means you will get one row per GROUP.
In this case your GROUP is prodplan_id and there are two matching values.
To get both values you can try:
SELECT oe.prod_week, oe.prodplan_id
FROM pd_mounting_details as md
LEFT OUTER JOIN pd_order_eco AS oe
ON md.order_data = oe.id
WHERE oe.prod_week = (SELECT MIN(oe.prod_week)
FROM pd_mounting_details as md
LEFT OUTER JOIN pd_order_eco AS oe
ON md.order_data = oe.id
LEFT OUTER JOIN pd_article AS a
ON md.article = a.id where oe.status=4
AND (md.starttime ='' or md.starttime is null)
AND (a.production_group = 4)
AND (NOT (oe.amount = 0)))
When you do
select min(x),y
from table
group by y;
what you're doing is getting y and the smallest value of x for each distinct value of y. So, since prodplan_id has values of 27 and 28 in your morass of joins, we have that the smallest value of prod_week that appears when prodplan_id=27 is 1126, and the smallest value of prod_week that appears when prodplan_id=28 is 1127.
ETA: If you want one row, you could add ORDER BY 1 LIMIT 1 at the end (on SQL Server, which has no LIMIT, use SELECT TOP 1 ... ORDER BY 1 instead).
ETA^2: You can also wrap things up in a subquery and use a where clause at the end:
select min_prod_week,prodplan_id
from(
select min(oe.prod_week) as min_prod_week,oe.prodplan_id
from....
group by oe.prodplan_id
)min
where min_prod_week=(select min(prod_week) from pd_order_eco)
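Since your select statement ends with a GROUP BY clause, you are selecting the minimum prod_week for each prodplan_id instead of the overall minimum. Remove the GROUP BY clause (and prodplan_id from the select list, since standard SQL does not allow a non-aggregated column next to MIN() without grouping) and you get the single overall minimum.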
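The difference between "minimum per group" and "row holding the overall minimum" can be reproduced on a three-row table. This is a sketch in Python with the standard-library sqlite3 module. The table name follows the question, but the columns are reduced and the rows are made up to match the reported result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pd_order_eco (id INTEGER, prod_week INTEGER, prodplan_id INTEGER);
INSERT INTO pd_order_eco VALUES (1, 1126, 27), (2, 1130, 27), (3, 1127, 28);
""")

# GROUP BY yields one row per prodplan_id, each with that group's minimum.
per_group = conn.execute("""
    SELECT MIN(prod_week), prodplan_id
    FROM pd_order_eco
    GROUP BY prodplan_id
    ORDER BY prodplan_id
""").fetchall()

# Filtering on the overall minimum keeps only the row(s) that hold it.
overall = conn.execute("""
    SELECT prod_week, prodplan_id
    FROM pd_order_eco
    WHERE prod_week = (SELECT MIN(prod_week) FROM pd_order_eco)
""").fetchall()

print(per_group)  # [(1126, 27), (1127, 28)]
print(overall)    # [(1126, 27)]
```

If several rows share the lowest prod_week, the second query returns all of them, so a tie-break would still be needed to guarantee a single row.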
