SQL Server: optimize a query with many joins - sql-server

I have SQL Server code that takes a long time to produce its result. It used to take 15 minutes, but recently, perhaps because of accumulated sales data, it took 2 hours!
I would therefore like some advice on how to optimize the code.
The code structure is simple: it just gets the sales sum for different regions, for different time periods, for each SKU. (I have deleted some code here that finds the different SKUs for each material without size.)
Many thanks in advance for your help.
The main code structure is as below; since the blocks are almost identical, I give only the first two as an example:
SELECT SKU from [MATINFO]
-- Global Sales History Qty - All the years
LEFT JOIN
(
SELECT SKU,SUM([SALES Qty]) as [Global Sales History Qty - All the years]
from dbo.[SALES]
where [PO] IS NOT NULL
group by SKU
)histORy
on MATINFO.[SKU]=histORy.[SKU]
-- Global Sales History Qty - Past 2 years
LEFT JOIN
(
SELECT SKU,SUM([SALES Qty]) as [Global Sales History Qty - Past 2 years]
from dbo.[SALES]
where [PO] IS NOT NULL
/* date range */
and ([ORDER DATE] = '2015.11' OR [ORDER DATE] = '2015.12' or [ORDER DATE] like '%2015%' OR [ORDER DATE] like '%2016%' )
group by SKU
)histORy2
on MATINFO.[SKU]=histORy2.[SKU]
--Global Sales History Qty - Past 1 years
......SIMILAR TO THE CODE STRUCTURE AS ABOVE

The most likely cause of the poor performance is using strings for dates, and possibly a lack of adequate indexes.
like '%2015%'
Using double-ended wildcards with LIKE results in full table scans, so the subqueries scan the whole table each time you search for a different date range. Using temp tables will not solve the underlying issues.
[added later]
Another change to your original query structure might reduce the number of scans you need of the data: using "conditional aggregates".
e.g. here is a condensed version of your original query
SELECT
SKU
FROM [MATINFO]
-- Global Sales History Qty - All the years
LEFT JOIN (SELECT
SKU
, SUM([SALES Qty]) AS [Global Sales History Qty - All the years]
FROM dbo.[SALES]
WHERE [PO] IS NOT NULL
GROUP BY
SKU) histORy ON MATINFO.[SKU] = histORy.[SKU]
-- Global Sales History Qty - Past 2 years
LEFT JOIN (SELECT
SKU
, SUM([SALES Qty]) AS [Global Sales History Qty - Past 2 years]
FROM dbo.[SALES]
WHERE [PO] IS NOT NULL
/* date range */
AND [ORDER DATE] >= '20151101' AND [ORDER DATE] < '20161101'
GROUP BY
SKU) histORy2 ON MATINFO.[SKU] = histORy2.[SKU]
That requires two complete passes of the data in dbo.[SALES], but if you use a CASE expression inside the SUM() function you need only one pass of the data (in this example):
SELECT
SKU
, SUM([SALES Qty]) AS [Qty_all_years]
, SUM(CASE
WHEN [ORDER DATE] >= '20151101' AND [ORDER DATE] < '20161101'
THEN [SALES Qty]
END) AS [Qty_past_2_years]
FROM dbo.[SALES]
WHERE [PO] IS NOT NULL
GROUP BY
SKU
I suspect you could apply this logic to most of the columns and substantially improve the efficiency of the query, when coupled with proper date columns and appropriate indexing.
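As a sketch of the indexing side, a filtered covering index could support the WHERE clause, the grouping, and the SUM in one structure. The index name and column types here are assumptions to adapt to the real schema, and it presumes [ORDER DATE] is (or has been converted to) a proper date/datetime column:

```sql
-- Sketch only: a filtered covering index matching the query's
-- WHERE [PO] IS NOT NULL / GROUP BY SKU / SUM([SALES Qty]) pattern.
CREATE NONCLUSTERED INDEX IX_SALES_SKU_OrderDate
    ON dbo.[SALES] (SKU, [ORDER DATE])
    INCLUDE ([SALES Qty])
    WHERE [PO] IS NOT NULL;   -- filtered index mirroring the query's predicate
```

With this in place the conditional-aggregate query can be answered from the index alone, without touching the base table.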

Expansion on my comment. Note it is just a suggestion; no guarantee it will run faster.
Take the following derived table histORy:
SELECT SKU,SUM([SALES Qty]) AS [Global Sales History Qty - All the years]
FROM dbo.[SALES]
WHERE [PO] IS NOT NULL
GROUP BY SKU
Before you run your query, materialize the derived table in a temporary table:
SELECT SKU,SUM([SALES Qty]) AS [Global Sales History Qty - All the years]
INTO #histORy
FROM dbo.[SALES]
WHERE [PO] IS NOT NULL
GROUP BY SKU
Then use the temporary table in the query:
LEFT JOIN #histORy AS h ON MATINFO.[SKU]=h.[SKU]
In this case you may want an index on the SKU field, so you could create the temporary table yourself, slap an index on it, and populate it with INSERT INTO #histORy... SELECT ... etc.
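A minimal sketch of that approach; the column types are assumptions and should match dbo.[SALES]:

```sql
-- Sketch: pre-create the temp table with a primary key on SKU (which gives
-- the join an index to use), then populate it.
CREATE TABLE #histORy
(
    SKU varchar(50) NOT NULL PRIMARY KEY,   -- assumed type; adjust to the real schema
    [Global Sales History Qty - All the years] decimal(18, 2) NULL
);

INSERT INTO #histORy (SKU, [Global Sales History Qty - All the years])
SELECT SKU, SUM([SALES Qty])
FROM dbo.[SALES]
WHERE [PO] IS NOT NULL
GROUP BY SKU;
```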

What is the most efficient way to correlate a subquery on a big table in T-SQL?

I have a table that contains Make, Model, Serial Number, and Invoice Date of machine sales, and I want to pair that up with a table that contains Make, Serial Number, Recorded Usage, Usage Units, and Record Date - except that the Usage/Record Table is HUUUUUUGE and may not have a record for every machine.
I've tried writing an OUTER JOIN, but there's too much data in the Usage/Records table to make this operate efficiently. And I tried to write a CROSS APPLY, but I must have screwed something up, because that didn't seem to work very effectively, either.
Example of files:
My Base Query:
Inv. Date Mk Model Serial
2019-03-29 AA 420D 0FDP09999
2019-03-21 AA A19B-SSL 0DX240481
Usage/Records Table:
Mk Serial Usage Units Record Date
AA 0FDP09999 2345.0 H 2019-03-27
AA 0FDP09999 2349.2 H 2019-03-28
AA 0FDP09999 2351.8 H 2019-03-29
AA 0DX240481 0.0 H 2019-03-21
AA 0DX240481 24.0 H 2019-03-22
The output should be:
Inv. Date Mk Model Serial Usage Units Record Date
2019-03-29 AA 420D 0FDP09999 2351.8 H 2019-03-29
2019-03-21 AA A19B-SSL 0DX240481 0.0 H 2019-03-21
... returning the Usage, Units, and Record Date of ONLY the most recent entry prior to the Invoice Date.
Any suggestions?
You can try a left join and row_number().
SELECT t1.[Inv. Date],
t1.[Mk],
t1.[Model],
t1.[Serial],
t2.[Usage],
t2.[Units],
t2.[Record Date]
FROM (SELECT t1.[Inv. Date],
t1.[Mk],
t1.[Model],
t1.[Serial],
t2.[Usage],
t2.[Units],
t2.[Record Date],
row_number() OVER (PARTITION BY t1.[Inv. Date]
ORDER BY t2.[Record Date] DESC) rn
FROM table1 t1
LEFT JOIN table2 t2
ON t2.[Mk] = t1.[Mk]
AND t2.[Serial] = t1.[Serial]
AND t2.[Record Date] <= t1.[Inv. Date]) x
WHERE x.rn = 1;
For performance try an index on ([Mk], [Serial], [Inv. Date]) for the first and ([Mk], [Serial], [Record Date]) for the second table. Or maybe try to switch the position of [Mk] and [Serial] if serials are more or less "unique" also over different makes.
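For example (a sketch; table1/table2 follow the aliases in the query above, and the INCLUDE list is an assumption to make the second index covering):

```sql
-- Supports the outer query's lookup by machine and invoice date.
CREATE INDEX IX_table1_Mk_Serial_InvDate
    ON table1 ([Mk], [Serial], [Inv. Date]);

-- Supports the join plus the ORDER BY [Record Date] DESC seek;
-- INCLUDE avoids key lookups for the selected columns.
CREATE INDEX IX_table2_Mk_Serial_RecordDate
    ON table2 ([Mk], [Serial], [Record Date])
    INCLUDE ([Usage], [Units]);
```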
To solve this, I ended up creating additional queries outside what was originally my base query.
On the first outer query, I did this ("Invoice Number" is an additional field I invoked to ensure a unique row-numbering, in case a machine was sold once, bought back, and then sold again within the time period):
CASE
WHEN Q1.[Usage] IS NULL
THEN 1
ELSE ROW_NUMBER() OVER (PARTITION BY Q1.[Serial Number], Q1.[Mk], Q1.[Invoice Number] ORDER BY Q1.[Record Date] DESC)
END AS [RowNum]
This ensures that every entry in the table has a sorting mechanism, even if there's no Usage measurement in the joined table.
Then, the next outer query only grabbed the rows with RowNum = 1.
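Put together, the described nesting looks roughly like this (a sketch with placeholder names, not the exact production query):

```sql
SELECT [Inv. Date], [Mk], [Model], [Serial Number], [Usage], [Units], [Record Date]
FROM (
    SELECT Q1.*,
           -- rows with no usage record get RowNum = 1 so they survive the filter
           CASE
               WHEN Q1.[Usage] IS NULL THEN 1
               ELSE ROW_NUMBER() OVER (PARTITION BY Q1.[Serial Number], Q1.[Mk], Q1.[Invoice Number]
                                       ORDER BY Q1.[Record Date] DESC)
           END AS [RowNum]
    FROM ( /* original base query LEFT JOINed to the usage table */ ) AS Q1
) AS Q2
WHERE Q2.[RowNum] = 1;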

Overstock qty query SQL server

This question is a little modified based on the recommendation from Eric:
What I want to achieve: In order to determine an appropriate obsolescence provision level I would need to calculate an overstock qty per material item and per warehouse (we have several warehouses). The overstock qty will then be the basis for obsolescence calculations.
In order to achieve this I need to compare the current stock level in our warehouses with the consumption of stock over the past 5 years. First of all, however, I would like to aggregate the stock qty across all of our warehouses (it should be a separate column "StockAll" in the output table). Importantly, I don't want one unique entry per item code. E.g., item code ABC is on stock in warehouse1 (5pc) and warehouse2 (5pc); the new column "StockAll" should then contain 10pc for item code ABC, which should appear twice in the output table, once for warehouse1 and once for warehouse2.
The overstock (a new column "OverstockAll" in the output table) is the difference between "StockAll" and SH2.BAAS_qty_sold (the qty coming from the union found in the code below). Last but not least, I need to allocate the overstock qty shown in the output column "OverstockAll" using each warehouse's share of the corresponding material item. I.e., based on the example above for item ABC: StockAll shows 10pc, and assume the result of OverstockAll is 6pc for ABC. Then I would like a separate column "OverstockLocal" in the output table showing 3pc as overstock for warehouse1 and the same for warehouse2 (each warehouse has 5pc on stock for material ABC, hence each warehouse should be allocated 50% of OverstockAll, i.e. 6*0.5).
Is there anybody with an idea how to achieve this?
select
bds.[Warehouse code]
,bds.[Item code]
,bds.[Free text 2]
,bds.[Current Stock]
,bds.[unit cost]
,bds.[unit cost currency]
,SH2.BAAS_qty_sold
,case
when bds.[Current Stock]-SH2.BAAS_qty_sold >0
then bds.[Current Stock]-SH2.BAAS_qty_sold
else 0
end as overstock
from [BAAS_PowerBI].[dbo].[BAAS_Daily_Stock] as bds
Left join
(select
sum(SH.Bill_qty) as BAAS_qty_sold
, SH.MM_Material
from
-- Union starts here
(SELECT
Bill_BillingDate
, MM_Material
, Bill_qty
FROM dbo.BAAS_Bill_done
UNION All
SELECT
Bill_BillingDate
, MM_Material
, Bill_qty
FROM [BAAS_PowerBI].dbo.[GreatPlains Sales History 2012-102017]) SH
where
sh.Bill_BillingDate > dateadd(year, -5, getdate())
group by
SH.MM_Material) SH2
on bds.[Item code]= SH2.[mm_material]
Thank you in advance
Br
c.
Is this what you're looking for? The CASE expression calculates a difference if the current stock is greater than the 5_yr_consumption. Otherwise it returns a 0.
Also, I added a table alias for readability, and changed the JOIN from a LEFT to an INNER, since you won't have any overstock on items you don't have sales records for.
SELECT
bds.[Warehouse code]
,bds.[Item code]
,bds.[Free text 2]
,bds.[Current Stock]
,bds.[unit cost]
,bds.[unit cost currency]
,SH2.BAAS_qty_sold AS [5_yr_consumption]
,CASE
WHEN bds.[Current Stock] > SH2.BAAS_qty_sold
THEN bds.[Current Stock] - SH2.BAAS_qty_sold
ELSE 0
END AS Overstock
FROM
BAAS_PowerBI.dbo.BAAS_Daily_Stock AS bds
JOIN
(
SELECT
BAAS_qty_sold = SUM(SH.Bill_qty)
,SH.MM_Material
FROM
-- Union starts here
(
SELECT
Bill_BillingDate
,MM_Material
,Bill_qty
FROM
dbo.BAAS_Bill_done
UNION ALL
SELECT
Bill_BillingDate
,MM_Material
,Bill_qty
FROM
BAAS_PowerBI.dbo.[GreatPlains Sales History 2012-102017]
) AS SH
WHERE
SH.Bill_BillingDate > DATEADD(YEAR, -5, GETDATE())
GROUP BY
SH.MM_Material
) AS SH2
ON
bds.[Item code] = SH2.mm_material
If all the boss cares about is inventory that actually HAS an overstock, just add this:
WHERE
CASE
WHEN bds.[Current Stock] > SH2.BAAS_qty_sold
THEN bds.[Current Stock] - SH2.BAAS_qty_sold
ELSE 0
END > 0
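For the "StockAll" / "OverstockAll" / local-allocation part of the question, here is a hedged sketch using windowed aggregates. Column and alias names follow the question; the UNION ALL with the archive table is omitted for brevity, and NULLIF guards the division when an item's total stock is zero:

```sql
SELECT
     t.[Warehouse code]
    ,t.[Item code]
    ,t.[Current Stock]
    ,t.StockAll
    ,t.OverstockAll
    -- each warehouse gets a share of the overstock proportional to its stock
    ,t.OverstockAll * t.[Current Stock] / NULLIF(t.StockAll, 0) AS OverstockLocal
FROM (
    SELECT
         bds.[Warehouse code]
        ,bds.[Item code]
        ,bds.[Current Stock]
        -- total stock for the item across all warehouses, repeated on every row
        ,SUM(bds.[Current Stock]) OVER (PARTITION BY bds.[Item code]) AS StockAll
        ,CASE
             WHEN SUM(bds.[Current Stock]) OVER (PARTITION BY bds.[Item code]) > SH2.BAAS_qty_sold
             THEN SUM(bds.[Current Stock]) OVER (PARTITION BY bds.[Item code]) - SH2.BAAS_qty_sold
             ELSE 0
         END AS OverstockAll
    FROM BAAS_PowerBI.dbo.BAAS_Daily_Stock AS bds
    JOIN (
        SELECT SUM(Bill_qty) AS BAAS_qty_sold, MM_Material
        FROM dbo.BAAS_Bill_done          -- UNION ALL with the archive table omitted here
        WHERE Bill_BillingDate > DATEADD(YEAR, -5, GETDATE())
        GROUP BY MM_Material
    ) AS SH2
      ON bds.[Item code] = SH2.MM_Material
) AS t;
```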

T-SQL - Get last as-at date SUM(Quantity) was not negative

I am trying to find a way to get the last date, by location and product, on which a sum was positive. The only way I can think to do it is with a cursor, and if that's the case I may as well just do it in code. Before I go down that route, I was hoping someone may have a better idea?
Table:
Product, Date, Location, Quantity
The scenario is; I find the quantity by location and product at a particular date, if it is negative i need to get the sum and date when the group was last positive.
select
Product,
Location,
SUM(Quantity) Qty,
SUM(Value) Value
from
ProductTransactions PT
where
Date <= #AsAtDate
group by
Product,
Location
I am looking for the last date where the sum of the transactions up to and including it is positive.
Based on your revised question and your comment, here is another solution that I hope answers your question.
select Product, Location, max(Date) as Date
from (
select a.Product, a.Location, a.Date from ProductTransactions as a
join ProductTransactions as b
on a.Product = b.Product and a.Location = b.Location
where b.Date <= a.Date
group by a.Product, a.Location, a.Date
having sum(b.Value) >= 0
) as T
group by Product, Location
The subquery (table T) produces a list of {product, location, date} rows for which the sum of the values prior (and inclusive) is positive. From that set, we select the last date for each {product, location} pair.
This can be done in a set based way using windowed aggregates in order to construct the running total. Depending on the number of rows in the table this could be a bit slow but you can't really limit the time range going backwards as the last positive date is an unknown quantity.
I've used a CTE for convenience to construct the aggregated data set but converting that to a temp table should be faster. (CTEs get executed each time they are called whereas a temp table will only execute once.)
The basic theory is to construct the running totals for all of the previous days using the OVER clause to partition and order the SUM aggregates. This data set is then used and filtered to the expected date. When a row in that table has a quantity less than zero it is joined back to the aggregate data set for all previous days for that product and location where the quantity was greater than zero.
Since this may return multiple positive date rows the ROW_NUMBER() function is used to order the rows based on the date of the positive quantity day. This is done in descending order so that row number 1 is the most recent positive day. It isn't possible to use a simple MIN() here because the MIN([Date]) may not correspond to the MIN(Quantity).
WITH x AS (
SELECT [Date],
Product,
[Location],
SUM(Quantity) OVER (PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS Quantity,
SUM([Value]) OVER(PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS [Value]
FROM ProductTransactions
WHERE [Date] <= #AsAtDate
)
SELECT [Date], Product, [Location], Quantity, [Value], Positive_date, Positive_date_quantity
FROM (
SELECT x1.[Date], x1.Product, x1.[Location], x1.Quantity, x1.[Value],
x2.[Date] AS Positive_date, x2.[Quantity] AS Positive_date_quantity,
ROW_NUMBER() OVER (PARTITION BY x1.Product, x1.[Location] ORDER BY x2.[Date] DESC) AS Positive_date_row
FROM x AS x1
LEFT JOIN x AS x2 ON x1.Product=x2.Product AND x1.[Location]=x2.[Location]
AND x2.[Date]<x1.[Date] AND x1.Quantity<0 AND x2.Quantity>0
WHERE x1.[Date] = #AsAtDate
) AS y
WHERE Positive_date_row=1
Do you mean that you want to get the last date on which the running quantity in the group turned positive?
For example, if you are using SQL Server 2012+:
In the following scenario, the running sum of quantity turns positive, reaching 1 (-10+5+6), on 01/03/2017.
Could the quantity on a following date become negative again?
;WITH tb(Product, Location,[Date],Quantity) AS(
SELECT 'A','B',CONVERT(DATETIME,'01/01/2017'),-10 UNION ALL
SELECT 'A','B','01/02/2017',5 UNION ALL
SELECT 'A','B','01/03/2017',6 UNION ALL
SELECT 'A','B','01/04/2017',2
)
SELECT t.Product,t.Location,SUM(t.Quantity) AS Qty,MIN(CASE WHEN t.CurrentSum>0 THEN t.Date ELSE NULL END ) AS LastPositiveDate
FROM (
SELECT *,SUM(tb.Quantity)OVER(ORDER BY [Date]) AS CurrentSum FROM tb
) AS t GROUP BY t.Product,t.Location
Product Location Qty LastPositiveDate
------- -------- ----------- -----------------------
A B 3 2017-01-03 00:00:00.000

how to get multiple min values from two SQL tables?

I have two tables, a Members table and a Plan table. They are structured as follows.
member start_date Mplan Pplan version start_dt end_dt
John 20120701 johnplan johnplan 1 20120601 20130531
John 20130201 johnplan johnplan 2 20130601 20140531
John 20130901 johnplan
John 20131201 johnplan
I need to update the start_date on the Members table to be the minimum value present for that member but within the same Plan version.
Example:
20130201 would be changed to 20120701 and 20131201 would change to 20130901.
Code:
UPDATE Members
SET start_date =(
SELECT MIN(start_date) FROM Members a
LEFT JOIN Plan ON Mplan = Pplan AND
start_date BETWEEN start_dt AND end_dt
WHERE member=a.member
AND start_date BETWEEN start_dt AND end_dt
)
Unfortunately this sets every single start_date to 19900101 aka the lowest value in the entire table for that column.
First you need to get the minimum start date of each member for a specific plan. The following will provide you that.
select MIN(start_date) as min_date,a.member as member_name,a.Mplan as plan_name FROM Members a inner JOIN [plan] p ON a.Mplan = p.Pplan AND
start_date BETWEEN p.start_dt AND p.end_dt
group by a.member, a.Mplan
The result will be something like this.
min_date member_name plan_name
2012-07-01 00:00:00.000 John johnplan1
2013-09-01 00:00:00.000 John johnplan2
Use this to update each member's start date for a plan with the lowest start date of the respective plan.
update members
set start_date= tbl.min_date from
(SELECT MIN(start_date) as min_date,a.member as member_name,a.Mplan as plan_name FROM Members a
inner JOIN [plan] p ON a.Mplan = p.Pplan AND
start_date BETWEEN p.start_dt AND p.end_dt
group by a.member, a.Mplan) as tbl
where member=tbl.member_name and Mplan=tbl.plan_name
I created your 2 tables, members and plan, and tested this solution with sample data and it works. I hope it helps.
You really need to convert the dates to datetime (or date). You get greater precision, the ability to store hours, minutes and seconds, and access to date-specific functions, international conversion and localization.
And if your column is a varchar(8), it uses no less space than a datetime column.
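Since the sample values are yyyymmdd strings, they convert cleanly with style 112 (a sketch; verify the data before altering anything):

```sql
-- Style 112 (yyyymmdd) strings convert to date without ambiguity.
SELECT CONVERT(date, '20120701', 112);   -- 2012-07-01

-- If every value in the column is a valid yyyymmdd string, the column itself
-- can be altered in place (back up / verify the data first):
-- ALTER TABLE Members ALTER COLUMN start_date date;
```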
That said, what you are looking for is row_number().
Something like:
SELECT Member, MPlan, Start_Date, Row_Number() OVER (PARTITION BY Member, MPLan ORDER BY Start_Date) as Version
FROM Members
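A hedged sketch of how a per-version grouping could drive the update directly (untested; it assumes Members joins Plan on Mplan = Pplan and the date range, as in the question, and updates through the CTE):

```sql
-- For each member/plan/version group, take the earliest start_date
-- and write it back to every row in that group.
;WITH v AS (
    SELECT m.start_date,
           MIN(m.start_date) OVER (PARTITION BY m.member, m.Mplan, p.version) AS min_start
    FROM Members AS m
    JOIN [Plan] AS p
      ON m.Mplan = p.Pplan
     AND m.start_date BETWEEN p.start_dt AND p.end_dt
)
UPDATE v
SET start_date = min_start;
```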
Could you try this? I didn't test it.
With Member_start_dt as
(
select *, (select start_dt from [Plan] where M.start_date >= start_dt AND M.start_date <= end_dt) as Pplan_date
from Members M
),
Member_by_plan as
(
select *, ROW_NUMBER () over (partition by Pplan_date order by start_date) num
from Member_start_dt
)
update M
Set M.start_date = MBP1.start_date
from Members M
inner join Member_by_plan MBP1 ON MBP1.member = M.Member AND num = 1
inner join Member_by_plan MBP2 ON MBP2.member = M.Member AND MBP2.Pplan_date = MBP1.Pplan_date AND MBP2.start_date = M.start_date

How to query a Master database using Inner Join links to 2 sub-databases that are identical to each other

I have an Inventory table containing Master file info and 2 Movement History tables (Current Year and Last Year).
I want to use a Query to extract Movements from (say) June LAST Year to March THIS Year in Code, Date sequence.
I am relatively new to SQL and have tried to use the following INNER JOIN structure to do this:
SELECT Code, Descrip, Category, MLast.Date, MLast.DocNo, MCurr.Date, MCurr.DocNo
FROM Stock AS S
INNER JOIN MoveTrnArc MLast ON MLast.Stockcode = S.Code
AND MLast.Date >='2011/06/01' AND MLast.Date <='2012/03/31'
INNER JOIN MoveTrn MCurr ON MCurr.Stockcode = S.Code
AND MCurr.Date >='2011/06/01' AND MCurr.Date <='2012/03/31'
ORDER BY S.Code
This creates a Query Table with the following column structure:
Code | Descrip | Category | Date | DocNo | Date | DocNo |
...where the data from the LAST Year table appears in the first Date/DocNo columns and the CURRENT Year data appears in the second Date/DocNo columns.
What must I do to the Query to have each Movement in its own row or is there a better, more efficient Query to achieve this?
Also, I need the Movements listed in Code followed by Date sequence.
Use UNION ALL instead of joins:
select s.Code , s.Descrip , s.Category , t.Date , t.DocNo
from
(
select Stockcode, Date, DocNo from MoveTrnArc
union all
select Stockcode, Date, DocNo from MoveTrn
) t join Stock s on s.Code = t.Stockcode
where t.Date >='2011/06/01' AND t.Date <='2012/03/31'
Besides, be careful when comparing dates: if the Date column is of type datetime and includes a time component, you have to change t.Date <= '2012/03/31' into t.Date < '2012/04/01' to include all the rows from 31 March,
as '2012/03/31' is cast as '2012/03/31 00:00:00.000'.
