I have one source table in the database. I need to group and sum it to get one bridging table, extract supplier info into a second bridging table, and then join the two on part_number.
If I run the subqueries separately, T1 gives me 54699 records and T2 gives approximately 10 times as many rows as T1.
Next, I do a left join. I expect it to return 54699 records, but the query never finishes, and it has already returned 50 million records by the time I scroll down to the end, so I have to stop it manually. I realize there must be something wrong with my query, but I cannot figure it out. I would appreciate any ideas. Thank you!
SELECT
T1.*, T2.SUPPLIER
FROM
(SELECT
T.PART_NUMBER,T.YEAR, T.WEEK,
SUM(T.QTY_FILLED) TOTAL_FILLED,
SUM(T.QTY_ORDERED) TOTAL_ORDERED,
COUNT(T.LINE_NUMBER) ORDER_TIMES
FROM
DBO.TABLE1 T
WHERE
T.YEAR IS NOT NULL
GROUP BY
PART_NUMBER, T.YEAR, T.WEEK) T1
LEFT JOIN
(SELECT
T.PART_NUMBER, T.SUPPLIER
FROM
DBO.TABLE1 T) T2 ON T1.PART_NUMBER = T2.PART_NUMBER
ORDER BY
T1.PART_NUMBER, T1.YEAR, T1.WEEK
I also tried rewriting it with a WITH clause (CTE), but still no luck.
WITH T1 AS
(
SELECT
T.PART_NUMBER,T.YEAR, T.WEEK,
SUM(T.QTY_FILLED) TOTAL_FILLED,
SUM(T.QTY_ORDERED) TOTAL_ORDERED,
COUNT(T.LINE_NUMBER) ORDER_TIMES
FROM
DBO.TABLE1 T
WHERE
T.YEAR IS NOT NULL
GROUP BY
PART_NUMBER, T.YEAR, T.WEEK
), T2 AS
(
SELECT T.PART_NUMBER, T.SUPPLIER
FROM DBO.TABLE1 T
)
SELECT
T1.*, T2.SUPPLIER
FROM
T1
LEFT JOIN
T2 ON T1.PART_NUMBER = T2.PART_NUMBER
ORDER BY
T1.PART_NUMBER, T1.YEAR, T1.WEEK
First of all, the join will not return just 54699 rows. T2 is not distinct on PART_NUMBER, so every T1 row is repeated once for each source row with the same part number. The result could be on the order of 50.000 x 5.000.000 rows, depending on the values in your table.
If you use SQL Server 2017 or newer, try something like this:
SELECT
T.PART_NUMBER,T.YEAR, T.WEEK,
SUM(T.QTY_FILLED) TOTAL_FILLED,
SUM(T.QTY_ORDERED) TOTAL_ORDERED,
COUNT(T.LINE_NUMBER) ORDER_TIMES,
STRING_AGG (SUPPLIER, ', ') AS SUPPLIER
FROM
DBO.TABLE1 T
WHERE
T.YEAR IS NOT NULL
GROUP BY
PART_NUMBER, T.YEAR, T.WEEK
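The row multiplication described above can be reproduced on a tiny data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server; the four sample rows are made up for illustration, and only the table and column names come from the question. It compares the original join against one whose supplier subquery is deduplicated with DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table1 (
        part_number TEXT, year INTEGER, week INTEGER,
        qty_filled INTEGER, qty_ordered INTEGER,
        line_number INTEGER, supplier TEXT
    )
""")
# Hypothetical sample rows: part A has three order lines, part B has one.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("A", 2020, 1, 5, 5, 1, "S1"),
        ("A", 2020, 1, 3, 4, 2, "S1"),
        ("A", 2020, 2, 2, 2, 3, "S1"),
        ("B", 2020, 1, 1, 1, 4, "S2"),
    ],
)

# The grouped subquery from the question (T1): one row per part/year/week.
grouped = """
    SELECT part_number, year, week,
           SUM(qty_filled)    AS total_filled,
           SUM(qty_ordered)   AS total_ordered,
           COUNT(line_number) AS order_times
    FROM table1
    WHERE year IS NOT NULL
    GROUP BY part_number, year, week
"""

# Original shape: T2 has one row per source row, so each T1 row is
# repeated once per source row that shares its part number.
fan_out_rows = conn.execute(f"""
    SELECT t1.*, t2.supplier
    FROM ({grouped}) AS t1
    LEFT JOIN (SELECT part_number, supplier FROM table1) AS t2
           ON t1.part_number = t2.part_number
""").fetchall()

# Deduplicated shape: T2 has one row per distinct (part, supplier) pair.
deduped_rows = conn.execute(f"""
    SELECT t1.*, t2.supplier
    FROM ({grouped}) AS t1
    LEFT JOIN (SELECT DISTINCT part_number, supplier FROM table1) AS t2
           ON t1.part_number = t2.part_number
    ORDER BY t1.part_number, t1.year, t1.week
""").fetchall()

print(len(fan_out_rows), len(deduped_rows))
```

With this sample, the original join yields 7 rows for 3 groups, while the deduplicated join yields 3. Note that DISTINCT only keeps one row per group if each part has a single supplier; with several suppliers per part, an aggregate such as STRING_AGG is the safer choice.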
I have a problem with my query. I have a simple example here that illustrates the code I have.
SELECT distinct ID
FROM Table
WHERE IteamNumber in (132,434,675) AND Year(DateCreated) = 2019
AND ID NOT IN (
SELECT Distinct ID FROM Table
WHERE IteamNumber in (132,434,675) AND DateCreated < '2019-01-01')
As you can see, I'm retrieving the unique IDs that were created in 2019 and not earlier.
The SELECT statements work fine on their own, but once I add the NOT IN clause, the query can easily take over a minute.
My other question: could this be related to the performance of the computer/server running SQL Server for Microsoft Business Central? The same query, including the NOT IN clause, ran fine on the SQL Server for Microsoft Dynamics C5.
So my question is: is there something wrong with my query, or is it mainly a server issue?
UPDATE: here is a real example; it takes 25 seconds to retrieve 500 rows:
Select count(distinct b.No_),'2014'
from [Line] c
inner join [Header] a
on a.CollectionNo = c.CollectionNo
Inner join [Customer] b
on b.No_ = a.CustomerNo
where c.No_ in('2101','2102','2103','2104','2105')
and year(Enrollmentdate)= 2014
and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')
and NOT EXISTS(Select distinct x.No_
from [Line] c
inner join [Header] a
on a.CollectionNo = c.CollectionNo
Inner join [Customer] x
on x.No_ = a.CustomerNo
where x.No_ = b.No_ and
c.No_ in('2101','2102','2103','2104','2105')
and Enrollmentdate < '2014-01-01'
and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31'))
If I understand correctly you can write the query as a GROUP BY query with a HAVING clause:
SELECT ID
FROM t
WHERE IteamNumber in (132, 434, 675)
GROUP BY ID
HAVING MIN(DateCreated) >= '20190101' -- no row earlier than 2019
AND MIN(DateCreated) < '20200101' -- earliest row is before 2020
This excludes IDs for which an earlier record exists. You can further improve the performance by creating a covering index:
CREATE INDEX IX_t_0001 ON t (ID) INCLUDE (IteamNumber, DateCreated)
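To check that the GROUP BY / HAVING form really selects the same IDs as the NOT IN query from the question, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server. The sample rows are invented for illustration, dates are stored as ISO strings (so string comparison orders them correctly), and `Year(DateCreated) = 2019` is written as an explicit date range because SQLite has no `Year()` function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, IteamNumber INTEGER, DateCreated TEXT)")
# Hypothetical rows covering the interesting cases.
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, 132, "2018-05-01"),  # ID 1 already existed before 2019 ...
    (1, 132, "2019-03-01"),  # ... so it must be excluded
    (2, 434, "2019-02-01"),  # first seen in 2019: keep
    (3, 675, "2020-01-15"),  # first seen in 2020: exclude
    (4, 999, "2019-06-01"),  # item number not in the list: exclude
    (5, 999, "2018-01-01"),  # earlier row, but for an unlisted item
    (5, 132, "2019-07-01"),  # first listed item in 2019: keep
])

# GROUP BY / HAVING form from the answer.
having_ids = [row[0] for row in conn.execute("""
    SELECT ID
    FROM t
    WHERE IteamNumber IN (132, 434, 675)
    GROUP BY ID
    HAVING MIN(DateCreated) >= '2019-01-01'
       AND MIN(DateCreated) <  '2020-01-01'
    ORDER BY ID
""")]

# NOT IN form from the question, for comparison.
not_in_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT ID
    FROM t
    WHERE IteamNumber IN (132, 434, 675)
      AND DateCreated >= '2019-01-01' AND DateCreated < '2020-01-01'
      AND ID NOT IN (
          SELECT ID FROM t
          WHERE IteamNumber IN (132, 434, 675)
            AND DateCreated < '2019-01-01')
    ORDER BY ID
""")]

print(having_ids, not_in_ids)
```

Both queries return IDs 2 and 5 for this sample. The HAVING form reads the table once, which is why it tends to be cheaper than the form with a subquery.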
I usually prefer JOINs to INs: you get the same result, but the engine tends to be able to optimize a join better.
You join your main query (T1) with what was the IN subquery (T2), and you filter on T2.ID being null, which ensures that no record matching those conditions was found.
SELECT distinct T1.ID
FROM Table T1
LEFT JOIN Table T2 on T2.ID = T1.ID AND
T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01'
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019 AND
T2.ID is null
UPDATE: Here is the proposal updated with your real query. Since your subquery has inner joins, I have created a CTE so you can left join that subquery. It works the same way: you left join your main query with the subquery and return only the rows for which no matching record was found in the subquery.
with previous as (
Select x.No_
from [Line] c
inner join [Header] a on a.CollectionNo = c.CollectionNo
inner join [Customer] x on x.No_ = a.CustomerNo
where c.No_ in ('2101','2102','2103','2104','2105')
and Enrollmentdate < '2014-01-01'
and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31')
)
Select count(distinct b.No_),'2014'
from [Line] c
inner join [Header] a on a.CollectionNo = c.CollectionNo
inner join [Customer] b on b.No_ = a.CustomerNo
left join previous p on p.No_ = b.No_
where c.No_ in ('2101','2102','2103','2104','2105')
and year(Enrollmentdate)= 2014
and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')
and p.No_ is null
The issue is caused by your IN clause. In my opinion it is better to avoid IN here: join to the subquery instead and filter the data in the WHERE clause.
With IN, each record of your table may be compared against all the records of the subquery, which slows the query down.
If you must use IN, back it with an index. A proper index on the columns involved will improve performance.
Instead of IN you may use EXISTS to improve the performance of your query.
An example using EXISTS:
SELECT DISTINCT T1.ID
FROM Table AS T1
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019
AND NOT EXISTS (
SELECT 1 FROM Table AS T2
WHERE T1.ID=T2.ID
AND T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01' )
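The NOT EXISTS form and the LEFT JOIN ... IS NULL form suggested in this thread are both anti-joins and should select the same IDs. A minimal sketch, assuming Python's sqlite3 module as a stand-in for SQL Server: the table is called `items` here because `Table` is a reserved word, the rows are made up, and the year filter is written as a date range since SQLite lacks `Year()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (ID INTEGER, IteamNumber INTEGER, DateCreated TEXT)")
# Hypothetical rows: ID 10 has a pre-2019 record, IDs 11 and 12 do not.
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [
    (10, 132, "2017-11-30"),
    (10, 434, "2019-04-02"),
    (11, 675, "2019-08-19"),
    (12, 132, "2019-01-01"),
    (12, 132, "2019-12-31"),
])

# Anti-join written with a correlated NOT EXISTS.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT T1.ID
    FROM items AS T1
    WHERE T1.IteamNumber IN (132, 434, 675)
      AND T1.DateCreated >= '2019-01-01' AND T1.DateCreated < '2020-01-01'
      AND NOT EXISTS (
          SELECT 1 FROM items AS T2
          WHERE T2.ID = T1.ID
            AND T2.IteamNumber IN (132, 434, 675)
            AND T2.DateCreated < '2019-01-01')
    ORDER BY T1.ID
""")]

# The same anti-join written as LEFT JOIN plus an IS NULL filter.
left_join_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT T1.ID
    FROM items AS T1
    LEFT JOIN items AS T2
           ON T2.ID = T1.ID
          AND T2.IteamNumber IN (132, 434, 675)
          AND T2.DateCreated < '2019-01-01'
    WHERE T1.IteamNumber IN (132, 434, 675)
      AND T1.DateCreated >= '2019-01-01' AND T1.DateCreated < '2020-01-01'
      AND T2.ID IS NULL
    ORDER BY T1.ID
""")]

print(not_exists_ids, left_join_ids)
```

Both lists contain IDs 11 and 12. Which form is faster depends on the engine and the indexes, so it is worth comparing the actual execution plans on the real tables.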
I am using a complex query. I need it to always return a row, even if it doesn't find anything.
SELECT
a.InventoryItemID,
a.Name,
a.RetailPrice,
b.MainGroupItemCode,
b.MainGroupItemID,
c.VatValue,
a.Code,
a.Weight,
b.MainGroupItemName,
a.RetailPrice2,
a.FreePrice,
case when isnull(e.IsActive,0)=1 and isnull(d.price,0)!=0 then d.Price else RetailPrice End as CustomPrice
from InventoryMaster a
join InventoryMainGroupItems b on a.MainGroupItemID=b.MainGroupItemID
join VatCodes c on b.VatCodeID=c.VatCodeID
join InventoryPrices d on d.InventoryItemID=a.InventoryItemID
join InventoryCatalog e on e.CatalogID=d.CatalogID
where a.InventoryItemID=2 and ISNULL(e.catalogID,1)=3
The problem is in the last line, ISNULL(e.catalogID,1)=3. My table has no CatalogID with the value 3.
So the query doesn't return anything, although there is a CatalogID with the value 1. I wrote the ISNULL so that it would fall back to 1 when the value is null, but unfortunately I don't get any row back from my query. How can I fix this?
My original question has been solved. I just want to add one more joined table with a WHERE condition on it:
SELECT *
from
(
SELECT t1.ID,
t1.Name,
COALESCE(t2.price,t1.Price) AS price ,
Row_number() OVER(partition BY t1.ID ORDER BY t1.ID) rn
FROM InventoryMaster t1
LEFT JOIN inventoryprices t2
ON t1.ID=t2.ID
LEFT join InventoryCatalog t3
ON t3.ID=t2.ID and t3.ID=2
where t1.ID=2
) t
WHERE t.rn=1
It always returns the retail price from the first table (InventoryMaster).
Add three columns near the beginning of the SELECT list:
d.InventoryItemID,
d.CatalogID,
e.CatalogID
Then remove the AND ISNULL(...) condition from the WHERE clause and run the query to see what you get.
It may be that
join InventoryCatalog e
needs
ON d.CatalogID=ISNULL(e.catalogID,1)
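One common way to always get a row back is to turn the optional tables into LEFT JOINs and move the catalog filter from the WHERE clause into the ON clause. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server with heavily simplified tables and invented sample values. It assumes the intended fallback is RetailPrice, as the CASE expression in the question suggests; `custom_price` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InventoryMaster  (InventoryItemID INTEGER, Name TEXT, RetailPrice REAL);
    CREATE TABLE InventoryPrices  (InventoryItemID INTEGER, CatalogID INTEGER, Price REAL);
    CREATE TABLE InventoryCatalog (CatalogID INTEGER, IsActive INTEGER);
    -- Hypothetical data: item 2 has a price in catalog 1 only.
    INSERT INTO InventoryMaster  VALUES (2, 'Widget', 10.0);
    INSERT INTO InventoryPrices  VALUES (2, 1, 8.0);
    INSERT INTO InventoryCatalog VALUES (1, 1);
""")

def custom_price(item_id, catalog_id):
    """Return (id, name, price) for the item, falling back to RetailPrice."""
    return conn.execute("""
        SELECT a.InventoryItemID,
               a.Name,
               CASE WHEN COALESCE(e.IsActive, 0) = 1 AND COALESCE(d.Price, 0) <> 0
                    THEN d.Price
                    ELSE a.RetailPrice
               END AS CustomPrice
        FROM InventoryMaster a
        -- The catalog filter lives in ON, so a missing catalog
        -- leaves d and e as NULL instead of removing the row.
        LEFT JOIN InventoryPrices d
               ON d.InventoryItemID = a.InventoryItemID
              AND d.CatalogID = ?
        LEFT JOIN InventoryCatalog e
               ON e.CatalogID = d.CatalogID
        WHERE a.InventoryItemID = ?
    """, (catalog_id, item_id)).fetchone()

missing_catalog = custom_price(2, 3)   # catalog 3 does not exist
existing_catalog = custom_price(2, 1)  # catalog 1 has a price
print(missing_catalog, existing_catalog)
```

For catalog 3 the row is still returned, carrying the retail price; for catalog 1 the catalog price is used. In T-SQL the same structure applies with ISNULL in place of COALESCE if preferred.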
I have the following code:
IF (OBJECT_ID('tempdb..#Data') IS NOT NULL)
BEGIN
DROP TABLE #Data
END
SELECT
t.Name, x.Time, x.Date, x.Total,
xo.DrvCommTotal, x.Name2, x.Street, x.Zip,
r.Route1
INTO
#Data
FROM
table1 xo WITH(NOLOCK)
LEFT JOIN
Table2 t WITH(NOLOCK) ON t.ID = x.ID
LEFT JOIN
Route1 r ON r.RouteID = x.RouteID
WHERE
x.Client = 1
AND x.Date = '9/13/2018'
GROUP BY
t.Name, x.Time, x.Date, x.Total, xo.DrvCommTotal, x.Name2,
x.Street, x.Zip, r.Route1
ORDER BY
Route1
SELECT DISTINCT
F.*, F2.NumOrders
FROM
#Data F
LEFT JOIN
(SELECT
Route1, COUNT(*) NumOrders
FROM
#Data
GROUP BY
Route1) F2 ON F2.Route1 = F.Route1
LEFT OUTER JOIN
(SELECT
Street + ',' + Zip Stops, Time, RouteN1
FROM
#Data
GROUP BY
RouteNo1, street, Zip) F3 ON F3.Route1 = F.Route1
WHERE
F.Route1 IS NOT NULL
ORDER BY
F.Route1
and it provides me with a list of routes and stops. The column NumOrders lets me know how many orders are on each route. I need the stops to become individual columns I will label Stop1, Stop2, etc. so that each route is only one row and all the information is contained on the row for one route.
I'm currently using the temp table because the data is so large. I can play with my SELECT statement without having to re-run the entire code.
How do I move the stops for each route into columns?
Hmm, I'm not quite sure I understand the question, but it sounds like you want to pivot the data so that the stops for each route become columns. If so, I would use the SQL Server PIVOT operator. Here is an example from the documentation:
USE AdventureWorks2014;
GO
SELECT VendorID, [250] AS Emp1, [251] AS Emp2, [256] AS Emp3, [257] AS Emp4, [260] AS Emp5
FROM
(SELECT PurchaseOrderID, EmployeeID, VendorID
FROM Purchasing.PurchaseOrderHeader) p
PIVOT
(
COUNT (PurchaseOrderID)
FOR EmployeeID IN
( [250], [251], [256], [257], [260] )
) AS pvt
ORDER BY pvt.VendorID;
Also, here is the link to how to use pivot: https://learn.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-2017
Since you already have all the data in your temp table, you could pivot that on the way out.
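Because the number of stops varies per route, the stops first need a position number before they can be spread into Stop1, Stop2, and so on. The sketch below shows that idea with ROW_NUMBER() plus conditional aggregation, run through Python's sqlite3 module (SQLite 3.25 or newer) as a stand-in for SQL Server. This is a swapped-in technique rather than the PIVOT operator itself, which SQLite lacks; the table `data` stands in for #Data, the rows are invented, and the column count is fixed at three for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (Route1 TEXT, Street TEXT, Zip TEXT, Time TEXT)")
# Hypothetical stops, deliberately inserted out of time order.
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", [
    ("R1", "2 Oak Ave", "22222", "09:00"),
    ("R1", "1 Main St", "11111", "08:00"),
    ("R2", "5 Elm Rd", "33333", "08:30"),
])

routes = conn.execute("""
    WITH numbered AS (
        SELECT Route1,
               Street || ',' || Zip AS Stop,
               -- Position of each stop within its route, ordered by time.
               ROW_NUMBER() OVER (PARTITION BY Route1 ORDER BY Time) AS rn
        FROM data
    )
    SELECT Route1,
           COUNT(*)                            AS NumOrders,
           MAX(CASE WHEN rn = 1 THEN Stop END) AS Stop1,
           MAX(CASE WHEN rn = 2 THEN Stop END) AS Stop2,
           MAX(CASE WHEN rn = 3 THEN Stop END) AS Stop3
    FROM numbered
    GROUP BY Route1
    ORDER BY Route1
""").fetchall()

for row in routes:
    print(row)
```

Each route comes back as a single row, with unused stop columns left as NULL. In T-SQL the string concatenation would be `Street + ',' + Zip`, and the same `rn` column could feed a PIVOT instead of the CASE expressions. If the maximum number of stops is not known in advance, the column list has to be generated with dynamic SQL.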
SELECT DISTINCT(t1.Ticker),t2.SecurityID,t2.ClosePrice,t2.QuoteDateTime FROM [Hub].[SecurityMaster].[SecurityMasterDetails] as t1
INNER JOIN [Hub].[SecurityMaster].[SecurityPrices] as t2
ON t2.SecurityID =t1.SecurityID
WHERE t2.QuoteDateTime IN (SELECT max(QuoteDateTime) FROM [Hub].[SecurityMaster].[SecurityPrices]) AND t1.SecurityTypeName = 'REIT'
I get an output with no data. The subquery doesn't run along with the other filter in the WHERE clause. I am not sure what I am doing wrong. Can somebody please help!
If you are trying to get the latest row from SecurityPrices for each Ticker, one option is to use cross apply():
select -- distinct is not needed if Ticker is unique on smd
smd.Ticker
, sp.SecurityID
, sp.ClosePrice
, sp.QuoteDateTime
from [Hub].[SecurityMaster].[SecurityMasterDetails] as smd
cross apply (
select top 1
i.SecurityID
, i.ClosePrice
, i.QuoteDateTime
from [Hub].[SecurityMaster].[SecurityPrices] i
where i.SecurityID = smd.SecurityID
order by i.QuoteDateTime desc
) as sp
where SecurityTypeName = 'REIT' /* which table does this column belong to? */
I think your query would be
SELECT DISTINCT TOP 1 WITH TIES
t1.Ticker,
t2.SecurityID,
t2.ClosePrice,
t2.QuoteDateTime
FROM [Hub].[SecurityMaster].[SecurityMasterDetails] as t1
INNER JOIN [Hub].[SecurityMaster].[SecurityPrices] as t2 ON t2.SecurityID =t1.SecurityID
WHERE SecurityTypeName = 'REIT'
ORDER BY t2.QuoteDateTime DESC
You aren't getting results because the max(QuoteDateTime) record doesn't have SecurityTypeName = 'REIT'. I think you want the max(QuoteDateTime) for this SecurityTypeName, so this can be done with an INNER JOIN.
SELECT DISTINCT
(t1.Ticker),
t2.SecurityID,
t2.ClosePrice,
t2.QuoteDateTime
FROM [Hub].[SecurityMaster].[SecurityMasterDetails] as t1
INNER JOIN [Hub].[SecurityMaster].[SecurityPrices] as t2
ON t2.SecurityID =t1.SecurityID
INNER JOIN
(SELECT max(QuoteDateTime) DT FROM [Hub].[SecurityMaster].[SecurityPrices]) P on P.DT = t2.QuoteDateTime
WHERE SecurityTypeName = 'REIT'
EDIT
Your data doesn't have what you think it does, I suspect. Here is how you can check...
--List the latest quote date per SecurityID; the ID with the greatest DT holds the overall max date
SELECT
SecurityID ,
max(QuoteDateTime) DT
FROM [Hub].[SecurityMaster].[SecurityPrices]
GROUP BY SecurityID
--I'm betting this ID isn't in your SecurityMasterDetails where the Type is REIT
SELECT DISTINCT
SecurityID
FROM SecurityMasterDetails
WHERE SecurityTypeName = 'REIT'
If the SecurityID that holds the overall max date in the first query isn't in the second query's result set, the join finds no match and you get an empty result.
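The situation described above is easy to reproduce. This sketch uses Python's sqlite3 module as a stand-in for SQL Server, with shortened table names (`details`, `prices`) and invented rows in which the most recent quote overall belongs to a non-REIT security. It contrasts the global-max filter from the question with a per-security max, written as a correlated subquery because SQLite has no CROSS APPLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE details (SecurityID INTEGER, Ticker TEXT, SecurityTypeName TEXT);
    CREATE TABLE prices  (SecurityID INTEGER, ClosePrice REAL, QuoteDateTime TEXT);
    -- Hypothetical data: the newest quote overall is for the non-REIT security.
    INSERT INTO details VALUES (1, 'AAA', 'REIT'), (2, 'BBB', 'Stock');
    INSERT INTO prices  VALUES (1, 100.0, '2020-01-01'),
                               (1, 101.0, '2020-01-02'),
                               (2,  50.0, '2020-01-03');
""")

# Filter from the question: only rows dated at the global max survive,
# and none of those belong to a REIT.
global_max_rows = conn.execute("""
    SELECT d.Ticker, p.SecurityID, p.ClosePrice, p.QuoteDateTime
    FROM details d
    JOIN prices p ON p.SecurityID = d.SecurityID
    WHERE p.QuoteDateTime IN (SELECT MAX(QuoteDateTime) FROM prices)
      AND d.SecurityTypeName = 'REIT'
""").fetchall()

# Per-security max: each security is compared against its own latest quote.
per_security_rows = conn.execute("""
    SELECT d.Ticker, p.SecurityID, p.ClosePrice, p.QuoteDateTime
    FROM details d
    JOIN prices p ON p.SecurityID = d.SecurityID
    WHERE d.SecurityTypeName = 'REIT'
      AND p.QuoteDateTime = (SELECT MAX(p2.QuoteDateTime)
                             FROM prices p2
                             WHERE p2.SecurityID = p.SecurityID)
""").fetchall()

print(global_max_rows, per_security_rows)
```

The first query returns nothing, while the second returns the latest quote for the REIT.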
I'm wondering if there's any way to optimize the following SELECT query. (Note: I typed this when writing my question for nonexistent tables and I might not have the correct syntax.)
The goal is, if Table2 contains any related rows I want to set the value of the third column to the number of related rows in Table2. Otherwise, if Table3 contains any related rows I want to set the column to the number of related rows in Table3. Otherwise, I want to set the column value to 0.
SELECT Id, Title,
CASE
WHEN EXISTS (SELECT * FROM Table2 t2 WHERE t2.RelatedId = Table1.Id) THEN
(SELECT COUNT(1) FROM Table2 t2 WHERE t2.RelatedId = Table1.Id)
WHEN EXISTS (SELECT * FROM Table3 t3 WHERE t3.RelatedId = Table1.Id) THEN
(SELECT COUNT(1) FROM Table3 t3 WHERE t3.RelatedId = Table1.Id)
ELSE 0
END AS RelatedCount
FROM Table1
I don't like the fact that I'm basically performing the same query twice (in two cases). Is there any way to do what I want while only performing the query once?
Note that this is part of a much larger query with multiple JOINs and UNIONs so it's not easy to take a completely different approach.
This query should perform much better. You are not just performing the same query twice; since they are correlated subqueries, they will run once per row.
SELECT t1.Id, t1.Title,
coalesce(t2.Count, t3.Count, 0) AS RelatedCount
FROM Table1 t1
left outer join (
SELECT RelatedId, count(*) as Count
FROM Table2
group by RelatedId
) t2 on t1.Id = t2.RelatedId
left outer join (
SELECT RelatedId, count(*) as Count
FROM Table3
group by RelatedId
) t3 on t1.Id = t3.RelatedId
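As a sanity check that the pre-aggregated joins give the same answer as the CASE / EXISTS version, here is a minimal sketch run through Python's sqlite3 module as a stand-in for SQL Server. The rows are made up, and the count alias is renamed to `Cnt` (an arbitrary choice) to avoid confusion with the COUNT function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER, Title TEXT);
    CREATE TABLE Table2 (RelatedId INTEGER);
    CREATE TABLE Table3 (RelatedId INTEGER);
    -- Hypothetical data: Id 1 is in both tables, Id 2 only in Table3, Id 3 in neither.
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO Table2 VALUES (1), (1);
    INSERT INTO Table3 VALUES (1), (2), (2), (2);
""")

# Each related table is counted once, then joined; COALESCE picks the
# Table2 count first, then the Table3 count, then 0.
joined_counts = conn.execute("""
    SELECT t1.Id, t1.Title,
           COALESCE(t2.Cnt, t3.Cnt, 0) AS RelatedCount
    FROM Table1 t1
    LEFT JOIN (SELECT RelatedId, COUNT(*) AS Cnt
               FROM Table2 GROUP BY RelatedId) t2 ON t1.Id = t2.RelatedId
    LEFT JOIN (SELECT RelatedId, COUNT(*) AS Cnt
               FROM Table3 GROUP BY RelatedId) t3 ON t1.Id = t3.RelatedId
    ORDER BY t1.Id
""").fetchall()

# The CASE / EXISTS query from the question, for comparison.
case_counts = conn.execute("""
    SELECT Id, Title,
           CASE
               WHEN EXISTS (SELECT * FROM Table2 t2 WHERE t2.RelatedId = Table1.Id)
                   THEN (SELECT COUNT(1) FROM Table2 t2 WHERE t2.RelatedId = Table1.Id)
               WHEN EXISTS (SELECT * FROM Table3 t3 WHERE t3.RelatedId = Table1.Id)
                   THEN (SELECT COUNT(1) FROM Table3 t3 WHERE t3.RelatedId = Table1.Id)
               ELSE 0
           END AS RelatedCount
    FROM Table1
    ORDER BY Id
""").fetchall()

print(joined_counts)
```

Both queries produce the same three rows. The COALESCE works because a LEFT JOIN to a grouped subquery yields NULL, not 0, when there are no related rows, so the next argument is tried.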