Left Outer Join and simple join within 3 tables - sql-server

I am new to queries, so excuse my ignorance.
The temp table #all has three columns: cod, cust_name, end_date. It has 2500 rows. When I run the query below, I get only about 400 rows instead of all the codes.
For the HCS_Dtl and HCIS_Hd tables, there will always be one matching row in both:
SELECT p.cod, count(d.FormNo), SUM(d.NetAmt)
FROM #all p left outer join HCS_Dtl d on p.cod=d.Code
join HCIS_Hd h on d.FormNo=h.FormNo
WHERE
h.TimeStmp between '2015-03-01 00:00:00' and '2015-03-28 23:59:59'
GROUP BY p.cod
I need all 2500 rows from #all, even those that have no form during the time period given in the WHERE clause. How can I do that?
I am using SQL Server 2008 R2.

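Using LEFT JOIN on HCIS_Hd alone is not enough: the WHERE clause tests h.TimeStmp, which is NULL for codes without a form, so those rows are still filtered out. Join HCS_Dtl and HCIS_Hd first, apply the date range inside that join, and then left join the result to #all:
SELECT p.cod, count(d.FormNo), SUM(d.NetAmt)
FROM #all p left outer join (HCS_Dtl d
join HCIS_Hd h on d.FormNo=h.FormNo
and h.TimeStmp between '2015-03-01 00:00:00' and '2015-03-28 23:59:59')
on p.cod=d.Code
GROUP BY p.cod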
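A small, self-contained sketch of why the row count drops. The table variables @all, @dtl and @hd and their sample rows are made up for illustration and are not the asker's data. A WHERE test on a column of the outer-joined table is never true for the NULL-extended rows, so it cancels the outer join; placing the test inside the join keeps every row of the left table. The date range is written as a half-open interval here, which is an assumption about the intended cutoff.

```sql
-- Hypothetical sample data standing in for #all, HCS_Dtl and HCIS_Hd.
DECLARE @all TABLE (cod INT);
DECLARE @dtl TABLE (Code INT, FormNo INT, NetAmt DECIMAL(10,2));
DECLARE @hd  TABLE (FormNo INT, TimeStmp DATETIME);

INSERT INTO @all VALUES (1), (2), (3);
INSERT INTO @dtl VALUES (1, 100, 50.00), (2, 200, 75.00);
INSERT INTO @hd  VALUES (100, '2015-03-10T09:00:00'), (200, '2015-04-02T09:00:00');

-- Filter in WHERE: cod 2 (form outside the range) and cod 3 (no form) disappear.
SELECT p.cod, COUNT(d.FormNo) AS forms, SUM(d.NetAmt) AS total
FROM @all p
LEFT JOIN @dtl d ON p.cod = d.Code
LEFT JOIN @hd h ON d.FormNo = h.FormNo
WHERE h.TimeStmp >= '2015-03-01T00:00:00' AND h.TimeStmp < '2015-03-29T00:00:00'
GROUP BY p.cod;

-- Filter inside the nested join: all three codes come back,
-- with a count of 0 and a NULL total where nothing matched.
SELECT p.cod, COUNT(d.FormNo) AS forms, SUM(d.NetAmt) AS total
FROM @all p
LEFT JOIN (@dtl d
           INNER JOIN @hd h
             ON d.FormNo = h.FormNo
            AND h.TimeStmp >= '2015-03-01T00:00:00'
            AND h.TimeStmp <  '2015-03-29T00:00:00')
  ON p.cod = d.Code
GROUP BY p.cod;
```

Wrap SUM in ISNULL(..., 0) if a zero total is preferred over NULL for codes without forms.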

Related

SQL Server Never Ending when Join Two Tables

I have one source table in the database. I need to group and sum it to get one bridging table, extract supplier info into another bridging table, and then join the two on part_number.
If I run the subqueries separately, T1 gives me 54699 records and T2 gives approximately 10 times as many rows as T1.
Next, I do a left join. I expected it to return 54699 records, but the query never finishes; it had returned 50 million records by the time I scrolled to the end, and I had to stop it manually. I realize there must be something wrong with my query, but I cannot figure it out. I would appreciate any ideas. Thank you!
SELECT
T1.*, T2.SUPPLIER
FROM
(SELECT
T.PART_NUMBER,T.YEAR, T.WEEK,
SUM(T.QTY_FILLED) TOTAL_FILLED,
SUM(T.QTY_ORDERED) TOTAL_ORDERED,
COUNT(T.LINE_NUMBER) ORDER_TIMES
FROM
DBO.TABLE1 T
WHERE
T.YEAR IS NOT NULL
GROUP BY
PART_NUMBER, T.YEAR, T.WEEK) T1
LEFT JOIN
(SELECT
T.PART_NUMBER, T.SUPPLIER
FROM
DBO.TABLE1 T) T2 ON T1.PART_NUMBER = T2.PART_NUMBER
ORDER BY
T1.PART_NUMBER, T1.YEAR, T1.WEEK
I also tried rewriting it with CTEs (common table expressions), but still no luck.
WITH T1 AS
(
SELECT
T.PART_NUMBER,T.YEAR, T.WEEK,
SUM(T.QTY_FILLED) TOTAL_FILLED,
SUM(T.QTY_ORDERED) TOTAL_ORDERED,
COUNT(T.LINE_NUMBER) ORDER_TIMES
FROM
DBO.TABLE1 T
WHERE
T.YEAR IS NOT NULL
GROUP BY
PART_NUMBER, T.YEAR, T.WEEK
), T2 AS
(
SELECT T.PART_NUMBER, T.SUPPLIER
FROM DBO.TABLE1 T
)
SELECT
T1.*, T2.SUPPLIER
FROM
T1
LEFT JOIN
T2 ON T1.PART_NUMBER = T2.PART_NUMBER
ORDER BY
T1.PART_NUMBER, T1.YEAR, T1.WEEK
First of all, the join does not return only 54699 rows. T2 is not distinct, so every row of T1 is repeated once for each row of T2 with the same PART_NUMBER, and the size of the result depends on how often each part number occurs in your table.
If you use SQL Server 2017 or newer, try something like this:
SELECT
T.PART_NUMBER,T.YEAR, T.WEEK,
SUM(T.QTY_FILLED) TOTAL_FILLED,
SUM(T.QTY_ORDERED) TOTAL_ORDERED,
COUNT(T.LINE_NUMBER) ORDER_TIMES,
STRING_AGG (SUPPLIER, ', ') AS SUPPLIER
FROM
DBO.TABLE1 T
WHERE
T.YEAR IS NOT NULL
GROUP BY
PART_NUMBER, T.YEAR, T.WEEK
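On versions without STRING_AGG, one option is to reduce the supplier side to distinct (PART_NUMBER, SUPPLIER) pairs before joining. The sketch below uses a made-up @TABLE1 with a handful of rows, and it assumes each part has a single supplier; if a part has several, the join still returns one row per supplier.

```sql
-- Hypothetical miniature of DBO.TABLE1.
DECLARE @TABLE1 TABLE (PART_NUMBER VARCHAR(10), [YEAR] INT, [WEEK] INT,
                       QTY_FILLED INT, QTY_ORDERED INT, LINE_NUMBER INT,
                       SUPPLIER VARCHAR(20));

INSERT INTO @TABLE1 VALUES
 ('P1', 2020, 1, 5, 6, 1, 'ACME'),
 ('P1', 2020, 1, 3, 3, 2, 'ACME'),
 ('P1', 2020, 2, 4, 4, 3, 'ACME'),
 ('P2', 2020, 1, 1, 2, 4, 'GLOBEX');

WITH T1 AS (
    -- One row per part, year and week.
    SELECT T.PART_NUMBER, T.[YEAR], T.[WEEK],
           SUM(T.QTY_FILLED)    AS TOTAL_FILLED,
           SUM(T.QTY_ORDERED)   AS TOTAL_ORDERED,
           COUNT(T.LINE_NUMBER) AS ORDER_TIMES
    FROM @TABLE1 T
    WHERE T.[YEAR] IS NOT NULL
    GROUP BY T.PART_NUMBER, T.[YEAR], T.[WEEK]
), T2 AS (
    -- DISTINCT collapses the repeated part/supplier pairs,
    -- so the join no longer multiplies the rows of T1.
    SELECT DISTINCT T.PART_NUMBER, T.SUPPLIER
    FROM @TABLE1 T
)
SELECT T1.*, T2.SUPPLIER
FROM T1
LEFT JOIN T2 ON T1.PART_NUMBER = T2.PART_NUMBER
ORDER BY T1.PART_NUMBER, T1.[YEAR], T1.[WEEK];
```

With this sample data the result has three rows, one per grouped row of T1.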

Filter in Join instead of Where Clause

I'm adapting queries that hard-code a filter of inclusion values in the join. When I inner join a table of values instead of hard-coding them, I get more records back. Does anyone have any insight into why?
The original query hard-codes the inclusion values in the inner join and returns 1600 records:
SELECT DISTINCT dat.HLV_ID, dat.CUR_VALUE_DATETIME, sde.CONCEPT_ID
FROM SMRTDTA_ELEM_DATA dat
INNER JOIN CLARITY_CONCEPT sde ON dat.ELEMENT_ID = sde.CONCEPT_ID
AND sde.CONCEPT_ID IN ('EPIC#....','EPIC#.....','EPIC#.....')
Inner joining a temp table that holds the inclusion values brings back more results:
IF OBJECT_ID('tempdb..#LOOKUPCODES') IS NOT NULL DROP TABLE #LOOKUPCODES;
SELECT sde.CONCEPT_ID code_id, 'clarity_eap.proc_id' as code_id_field
INTO #LOOKUPCODES
FROM CLARITY_CONCEPT sde
WHERE sde.CONCEPT_ID IN ('EPIC#....','EPIC#.....','EPIC#.....')
SELECT DISTINCT dat.HLV_ID, dat.CUR_VALUE_DATETIME , sde.CONCEPT_ID
FROM SMRTDTA_ELEM_DATA dat
INNER JOIN CLARITY_CONCEPT sde ON dat.ELEMENT_ID = sde.CONCEPT_ID
INNER JOIN #LOOKUPCODES l on l.code_id = sde.CONCEPT_ID

NOT IN statement is slowing down my query

I have a problem with my query. Here is a simple example that illustrates the code I have:
SELECT distinct ID
FROM Table
WHERE IteamNumber in (132,434,675) AND Year(DateCreated) = 2019
AND ID NOT IN (
SELECT Distinct ID FROM Table
WHERE IteamNumber in (132,434,675) AND DateCreated < '2019-01-01')
As you can see, I'm retrieving the unique IDs that were created in 2019 and not earlier.
The SELECT statement works fine on its own, but once I add the NOT IN clause, the query can easily take over a minute.
Could this be related to the performance of the computer/server that runs SQL Server for Microsoft Business Central? The same query, including the NOT IN clause, ran without problems on the SQL Server behind Microsoft Dynamics C5.
So my question is: is there something wrong with my query, or is it mainly a server issue?
UPDATE: here is a real example: this takes 25 seconds to retrieve 500 rows
Select count(distinct b.No_),'2014'
from [Line] c
inner join [Header] a
on a.CollectionNo = c.CollectionNo
Inner join [Customer] b
on b.No_ = a.CustomerNo
where c.No_ in('2101','2102','2103','2104','2105')
and year(Enrollmentdate)= 2014
and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')
and NOT EXISTS(Select distinct x.No_
from [Line] c
inner join [Header] a
on a.CollectionNo = c.CollectionNo
Inner join [Customer] x
on x.No_ = a.CustomerNo
where x.No_ = b.No_ and
c.No_ in('2101','2102','2103','2104','2105')
and Enrollmentdate < '2014-01-01'
and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31'))
If I understand correctly, you can write the query as a GROUP BY query with a HAVING clause:
SELECT ID
FROM t
WHERE IteamNumber in (132, 434, 675)
GROUP BY ID
HAVING MIN(DateCreated) >= '20190101' -- no row earlier than 2019
AND MIN(DateCreated) < '20200101' -- at least one row less than 2020
This keeps only the IDs whose earliest matching row falls in 2019, so any ID with an earlier record is excluded. You can further improve performance by creating a covering index:
CREATE INDEX IX_t_0001 ON t (ID) INCLUDE (IteamNumber, DateCreated)
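To check the HAVING version against the intent of the original query, here is a minimal sketch on made-up rows. The table variable @t and its data are assumptions; the column names follow the question.

```sql
-- Hypothetical sample rows; column names follow the question.
DECLARE @t TABLE (ID INT, IteamNumber INT, DateCreated DATETIME);

INSERT INTO @t VALUES
 (1, 132, '20190215'),   -- first seen in 2019: wanted
 (2, 434, '20181120'),   -- has an earlier row: not wanted
 (2, 434, '20190301'),
 (3, 675, '20200105'),   -- first seen in 2020: not wanted
 (4, 999, '20190401');   -- item number not in the list: not wanted

SELECT ID
FROM @t
WHERE IteamNumber IN (132, 434, 675)
GROUP BY ID
HAVING MIN(DateCreated) >= '20190101'   -- earliest row is not before 2019
   AND MIN(DateCreated) <  '20200101';  -- earliest row is before 2020
-- Only ID 1 qualifies.
```

Because the dates are compared directly rather than wrapped in Year(), the same pattern can also use an index on DateCreated where one exists.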
I usually prefer JOINs to INs: you get the same result, and the engine often optimizes them better.
Left join your main query (T1) to what was the IN subquery (T2) and filter on T2.ID being null, which ensures that no record matching those conditions was found.
SELECT distinct T1.ID
FROM Table T1
LEFT JOIN Table T2 on T2.ID = T1.ID AND
T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01'
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019 AND
T2.ID is null
UPDATE: Here is the proposal adapted to your real query. Since your subquery has inner joins, I have put it in a CTE so you can left join to it. It works the same way: left join your main query to the subquery and return only the rows for which no matching record was found in the subquery.
with previous as (
Select x.No_
from [Line] c
inner join [Header] a on a.CollectionNo = c.CollectionNo
inner join [Customer] x on x.No_ = a.CustomerNo
where c.No_ in ('2101','2102','2103','2104','2105')
and Enrollmentdate < '2014-01-01'
and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31')
)
Select count(distinct b.No_),'2014'
from [Line] c
inner join [Header] a on a.CollectionNo = c.CollectionNo
inner join [Customer] b on b.No_ = a.CustomerNo
left join previous p on p.No_ = b.No_
where c.No_ in ('2101','2102','2103','2104','2105')
and year(Enrollmentdate)= 2014
and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')
and p.No_ is null
The issue is likely your IN clause. In my opinion it is better to avoid IN here: join to the subquery instead and filter the rows in the WHERE clause.
With IN, each row of your table may be checked against all the rows of the subquery, which can slow the query down.
If you must use IN, back it with an index on the columns involved, which should improve performance.
Instead of IN you can also use EXISTS, which may perform better.
An example with NOT EXISTS:
SELECT distinct T1.ID
FROM Table AS T1
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019
AND NOT EXISTS (
SELECT 1 FROM Table AS T2
WHERE T2.ID = T1.ID
AND T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01' )

Why Inner Join worked as Cross Join in SQL Server?

I am trying to join several tables using INNER JOIN.
Here is the code:
IF OBJECT_ID('tempdb..#tmpRecData') IS NOT NULL
DROP TABLE #tmpRecData
--STEP 1
SELECT DISTINCT
pr.ChainID, pr.StoreID, pr.SupplierID, pr.ProductID,
MAX(CAST(pr.ActiveLastDate AS date)) AS 'Active Date'
--ChainID, SupplierID, StoreID, InvoiceDate, InvoiceNumber, SupplierInvoiceDate, SupplierInvoiceNumber
INTO
#tmpRecData
FROM
dbo.[ProductPrices_Retailer] AS pr
LEFT JOIN
ProductIdentifiers iden ON pr.ProductID = iden.ProductID
AND iden.ProductIdentifierTypeID = 2
WHERE
pr.ChainID = '119121'
AND pr.ActiveLastDate > '12/01/2016'
GROUP BY
pr.ProductID, pr.ProductName, iden.IdentifierValue,
pr.ChainID, pr.StoreID, pr.SupplierID
--STEP 2
SELECT
rec.ChainID, rec.StoreID, rec.SupplierInvoiceNumber,
rec.TransactionTypeID, rec.SupplierID, rec.SaleDateTime,
rec.ProductID, rec.UPC, rec.ProductDescriptionReported,
rec.RawProductIdentifier
FROM
#tmpRecData t
INNER JOIN
dbo.StoreTransactions AS rec WITH (NOLOCK) ON rec.ChainID = T.ChainID
WHERE
rec.ChainID = '119121'
DROP TABLE #tmpRecData
I am getting 4096 (Step1) * 145979 (Step2) = 725077693 rows (725 million)
That is a huge number of records. I used INNER JOIN, so why did it behave like a CROSS JOIN?
CROSS JOIN is very different from INNER JOIN.
INNER JOIN returns only the rows that have a match in both joined tables.
CROSS JOIN produces the Cartesian product of the tables in the join. The number of rows in the result is the number of rows in the first table multiplied by the number of rows in the second table.
You need to join on StoreID as well in step 2. Every row in both inputs has the same ChainID ('119121'), so a join on ChainID alone pairs each row of #tmpRecData with every transaction row, which is why the result looks like a cross join. If products also need to match, add ProductID to the join condition in step 2 too.
IF OBJECT_ID('tempdb..#tmpRecData') IS NOT NULL DROP TABLE #tmpRecData
--STEP 1
SELECT DISTINCT pr.ChainID,pr.StoreID,pr.SupplierID,pr.ProductID, MAX(CAST(pr.ActiveLastDate AS date)) AS 'Active Date'
--ChainID, SupplierID, StoreID, InvoiceDate, InvoiceNumber, SupplierInvoiceDate, SupplierInvoiceNumber
INTO #tmpRecData
FROM dbo.[ProductPrices_Retailer] AS pr
LEFT JOIN ProductIdentifiers iden
ON pr.ProductID=iden.ProductID
AND iden.ProductIdentifierTypeID=2
WHERE pr.ChainID='119121'
AND pr.ActiveLastDate>'12/01/2016'
GROUP BY pr.ProductID,pr.ProductName,iden.IdentifierValue,pr.ChainID,pr.StoreID,pr.SupplierID
--STEP 2
SELECT rec.ChainID,rec.StoreID,rec.SupplierInvoiceNumber,rec.TransactionTypeID,rec.SupplierID,rec.SaleDateTime,
rec.ProductID,rec.UPC,rec.ProductDescriptionReported,rec.RawProductIdentifier
FROM #tmpRecData t
INNER JOIN dbo.StoreTransactions AS rec WITH (NOLOCK)
ON rec.ChainID=T.ChainID and rec.StoreID = T.storeID
WHERE rec.ChainID='119121'
DROP TABLE #tmpRecData
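The effect is easy to reproduce on a few rows. In this sketch, @prices and @trans are hypothetical stand-ins for #tmpRecData and dbo.StoreTransactions, and ChainID is declared as INT only to keep the example short.

```sql
-- Hypothetical stand-ins for #tmpRecData and dbo.StoreTransactions.
DECLARE @prices TABLE (ChainID INT, StoreID INT);
DECLARE @trans  TABLE (ChainID INT, StoreID INT, SaleID INT);

INSERT INTO @prices VALUES (119121, 1), (119121, 2), (119121, 3);
INSERT INTO @trans  VALUES (119121, 1, 10), (119121, 2, 20),
                           (119121, 2, 21), (119121, 3, 30);

-- ChainID is identical in every row, so each of the 3 price rows
-- matches each of the 4 transaction rows: 12 rows, same as a CROSS JOIN.
SELECT COUNT(*) AS rows_chain_only
FROM @prices t
INNER JOIN @trans rec ON rec.ChainID = t.ChainID;

-- Adding StoreID makes the key selective: 4 rows, one per transaction.
SELECT COUNT(*) AS rows_chain_and_store
FROM @prices t
INNER JOIN @trans rec ON rec.ChainID = t.ChainID
                     AND rec.StoreID = t.StoreID;
```

An inner join only filters as much as its ON condition does; when the condition is true for every pair of rows, the result is the full Cartesian product.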

SQL Server 2012 query date

How can I return values in a where clause for something like this:
Get me all the records that exist in table1.de1,table2.de2,table3.de3,table4.de3
select *
from table1
inner join table2
on table2.carID = table1.carID
inner join table3
on table3.carID = table1.carID
inner join table4
on table4.driverID = table1.driverID
where a receipt record exists in table2 and its paydate passed more than 20 days ago, compared with today's date, and show those days in a field called Days Passed From The Day Driver Was Suppose To Pay
First, please use table aliases. This solution is for SQL Server.
As mentioned in the comments, you can use the DATEDIFF function to compare paydate with GETDATE():
select *,
DATEDIFF(day,t2.paydate,GETDATE()) as [Days Passed From The Day Driver Was Suppose To Pay]
from table1 t1
inner join table2 t2
on t2.carID = t1.carID
inner join table3 t3
on t3.carID = t1.carID
inner join table4 t4
on t4.driverID = t1.driverID
WHERE DATEDIFF(day,t2.paydate,GETDATE()) > 20
Or, for finer granularity, compare in minutes:
DATEDIFF(minute,t2.paydate,GETDATE()) > 28800 --60 minutes * 24 hours * 20 days
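The reason the unit matters is that DATEDIFF counts the boundaries crossed between the two values, not the elapsed time. The sketch below shows that, and then an alternative that compares the bare column with a cutoff computed by DATEADD. The table variable @table2 and its rows are made up for illustration.

```sql
-- DATEDIFF counts boundaries crossed, not elapsed time:
-- the values are two minutes apart, but one midnight lies between them,
-- so this returns 1.
SELECT DATEDIFF(day, '2015-03-01T23:59:00', '2015-03-02T00:01:00') AS day_boundaries;

-- Hypothetical sample rows for table2.
DECLARE @table2 TABLE (carID INT, paydate DATETIME);
INSERT INTO @table2 VALUES
 (1, DATEADD(day, -30, GETDATE())),   -- due 30 days ago
 (2, DATEADD(day, -5,  GETDATE()));   -- due 5 days ago

-- Comparing the column itself with a computed cutoff means
-- "paydate is more than 20 full days in the past" and leaves the
-- column unwrapped, so an index on paydate can be used if one exists.
SELECT t2.carID,
       DATEDIFF(day, t2.paydate, GETDATE()) AS [Days Passed From The Day Driver Was Suppose To Pay]
FROM @table2 t2
WHERE t2.paydate < DATEADD(day, -20, GETDATE());
-- Returns carID 1 only.
```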

Resources