SQL Server join query optimization with group by - sql-server

Just for knowledge, I want to know that, can the below given query be achieve by any other feasible way like using group by.
SELECT
GROUPMAS.GRPCODE, GROUPMAS.GRPNAME,
GRPDTLS.ACCODE, GRPDTLS.ACNAME, GRPDTLS.DOA "ADMISSION DATE",
LOANMAST.LOANCODE, LOANMAST.VCHDATE "LOAN SANCTION DATE",
LOANMAST.LANAMT,
(SELECT SUM(RECPDTLS.INSTAMT)
FROM RECPDTLS
WHERE LOANCODE = LOANMAST.LOANCODE
AND RECPDTLS.VCHDATE <= '2009-03-31') AS REPAYMENT,
(SELECT SUM(RECPDTLS.INTAMT)
FROM RECPDTLS
WHERE LOANCODE = LOANMAST.LOANCODE
AND RECPDTLS.VCHDATE <= '2009-03-31') AS INTREST,
(SELECT MAX(RECPDTLS.VCHDATE)
FROM RECPDTLS
WHERE LOANCODE = LOANMAST.LOANCODE
AND RECPDTLS.VCHDATE <= '2009-03-31') AS "LAST PAYMENT ON"
FROM
GROUPMAS
JOIN
GRPDTLS ON (GROUPMAS.GRPCODE = GRPDTLS.GRPCODE AND GRPDTLS.DOA <= '2009-03-31')
JOIN
LOANMAST ON (GRPDTLS.GRPCODE = LOANMAST.GRPCODE AND GRPDTLS.ACCODE = LOANMAST.ACCODE AND LOANMAST.VCHDATE <= '2009-03-31')
Table GROUPMAS structure
GRPCODE | GRPNAME
--------| -------
1 | A
2 | B
Table GRPDTLS structure
GRPCODE | ACCODE | ACNAME | DOA
--------|--------|--------|-----
1 | 1 | name1A | 2007-07-05
1 | 2 | name2A | 2008-07-05
2 | 1 | name1B | 2007-07-06
2 | 2 | name2B | 2007-07-05
Table LOANMAST structure
LOANCODE | GRPCODE | ACCODE | VCHDATE | LANAMT
---------|---------|--------|--------- |--------
1 | 1 | 2 |2009-01-06|2000
2 | 2 | 1 |2008-09-06|5000
Table RECPDTLS structure
TXNNO | LOANCODE | INSTAMT | INTAMT | VCHDATE
------|----------|---------|--------|---------
1 | 1 | 200 | 0 | 2009-02-06
2 | 1 | 200 | 10 | 2009-03-06
3 | 2 | 500 | 0 | 2008-10-06
4 | 2 | 1500 | 50 | 2009-03-28
5 | 2 | 500 | 0 | 2010-03-28
It will output something like this
GRPCODE | GRPNAME | ACCODE | ACNAME | ADMISSION DATE | LOANCODE | LOAN SANCTION DATE | LANAMT | REPAYMENT | INTREST | LAST PAYMENT ON
--------| --------| -------| ------ | ---------------| -------- | ------------------ | -------| ----------| ------- | --------------
1 | A | 2 | name2A | 2008-07-05 | 1 |2009-01-06 | 2000 | 400 | 10 | 2009-03-06
2 | B | 1 | name1B | 2007-07-06 | 2 |2008-09-06 | 5000 | 2000 | 50 | 2009-03-28
Thanks for the help.

You can replace the sub queries in your select statement with LEFT OUTER OR INNER JOIN depending on requirements. If all LOANCODE records will have matching RECPDTLS records then use INNER JOIN else use LEFT OUTER JOIN. Keep your aggregate functions the same.
...
Repayment=SUM(RECPDTLS.INSTAMT),
Interest=SUM(RECPDTLS.INTAMT),
LastPaymentOn=MAX(RECPDTLS.VCHDATE)
...
LEFT OUTER/INNER JOIN RECPDTLS ON RECPDTLS.LOANCODE = LOANMAST.LOANCODE AND Repayment.VCHDATE <= #HighDate
...
GROUP BY
GROUPMAS.GRPCODE, GROUPMAS.GRPNAME,
GRPDTLS.ACCODE, GRPDTLS.ACNAME, GRPDTLS.DOA,
LOANMAST.LOANCODE, LOANMAST.VCHDATE,
LOANMAST.LANAMT
You will need to run the query analyzer to see the efficiency gain between the old and the new queries.
NOTE : As I said above, be sure to use LET OUTER JOIN if the LOANCODE is not required to have a RECPDTLS as an INNER JOIN will only return matches in both tables.

You can use CTE to simplify the request :
;WITH LOANMASTAGG AS
(
SELECT SUM(r.INSTAMT) REPAYMENT, SUM(r.INTAMT) INTREST, MAX(r.VCHDATE) [LAST PAYMENT ON], l.LOANCODE, l.VCHDATE, l.LANAMT, l.ACCODE, l.GRPCODE
FROM #RECPDTLS r
INNER JOIN #LOANMAST l ON r.LOANCODE = l.LOANCODE
WHERE l.VCHDATE <= '2009-03-31'
GROUP BY l.LOANCODE, l.VCHDATE, l.LANAMT, l.ACCODE, l.GRPCODE
)
SELECT
g.GRPCODE,
g.GRPNAME,
gl.ACCODE,
gl.ACNAME,
gl.DOA "ADMISSION DATE",
la.LOANCODE,
la.VCHDATE "LOAN SANCTION DATE",
la.LANAMT,
la.REPAYMENT AS REPAYMENT,
la.INTREST AS INTREST,
la.[LAST PAYMENT ON] "LAST PAYMENT ON"
FROM LOANMASTAGG la
INNER JOIN #GRPDTLS gl ON gl.GRPCODE = la.GRPCODE AND gl.ACCODE = la.ACCODE
INNER JOIN #GROUPMAS g ON (g.GRPCODE = gl.GRPCODE)
WHERE gl.DOA <= '2009-03-31'

Related

Left join with 2 table giving results of 1 table only

I have 4 tables below:
Reasons_Trans:
RTId | RId | Label | Lang |
_________________________________________________________
1 | 1 | No car | English |
_________________________________________________________
2 | 2 | No fuel | English |
Reasons:
ReId | RId | Active| OfficeId
_________________________________________________________
1 | 1 | True | 1
_________________________________________________________
2 | 2 | True | 1
Employee_Reason:
ERId | RId | EmpId |
_________________________________________________________
1 | 1 | 1 |
_________________________________________________________
2 | 2 | 1 |
_________________________________________________________
3 | 0 | 1 |
Employee_Reason_Trad:
Id | ERId | Label | Lang
_________________________________________________________
1 | 3 | No in | English
_________________________________________________________
2 | 3 | Out in | English
I need to get the following:
For officeID = 1 (or any office), I need to get the Reasons of a particular employee + all reasosn available for that office.
So my query is as follows:
select rt.Label, ert.Label
from Reasons r
join Reasons_Trans rt
on r.RId = rt.RId and rt.Lang = 'English'
left join Employee_Reason er
on r.RId = er.RId
left join Employee_Reason_Trad ert
on er.ERId = ert.ERId and ert.Lang = 'English'
where er = 1 and r.OfficeId = 1
However this is not returning me the results of the Employee_Reason_Trad.
The output of the query should be:
No car, No fuel, No in, Out in
Any idea of what is wrong with my query?
Thanks for any help.
Not knowing the logic behind the data, but the expected result will be returned if you change your last join. Instead of on er.ERId = ert.ERId it must join on er.ERId = ert.Id:
select rt.Label, ert.Label
from Reasons r
join Reasons_Trans rt
on r.RId = rt.RId and rt.Lang = 'English'
left join Employee_Reason er
on r.RId = er.RId
left join Employee_Reason_Trad ert
on er.ERId = ert.Id and ert.Lang = 'English'
where er = 1 and r.OfficeId = 1
Returns:
Label Label
--------- -------
No car No in
No fuel Out in

How to find difference between two tables in MSSQL

I have got two tables 'Customer'.
The first one:
ID | UserID | Date
1. | 1 | 2018-05-01
2. | 1 | 2018-05-02
The second one:
ID | UserID | Date
1. | 1 | 2018-05-01
2. | 1 | 2018-05-02
3. | 1 | 2018-05-03
So, as you can see in the second table, there is one row more.
I have written so far this code:
;with cte_table1 as (
select UserID, count(id) cnt from db1.Customer group by UserID
),
cte_table2 as (
select UserID, count(id) cnt from db2.Customer group by UserID
)
select * from cte_table1 t1
join cte_table2 t2 on t2.UserID = t1.UserID
where t1.cnt <> t2.cnt
and this gives me expected result:
UserID | cnt | UserID | cnt
1 | 2 | 1 | 3
And so far, everything is fine. The thing is, these two tables have many rows and I'd like to have result with dates, where cnt does not match.
In other words, I'd like to have something like this:
UserID | cnt | Date | UserID | cnt | Date
1 | 2 | 2018-05-01 | 1 | 3 | 2018-05-01
1 | 2 | 2018-05-02 | 1 | 3 | 2018-05-01
1 | 2 | NULL | 1 | 3 | 2018-05-03
The best soulution would be resultset where both cte's are joined to give this:
UserID | cnt | Date | UserID | cnt | Date
1 | 2 | 2018-05-01 | 1 | 3 | 2018-05-01
1 | 2 | 2018-05-02 | 1 | 3 | 2018-05-01
1 | 2 | NULL | 1 | 3 | 2018-05-03
1 | 2 | 2018-05-30 | 1 | 3 | NULL
You should do a FULL OUTER JOIN query like below
Select
C1.UserID,
C1.cnt,
C1.Date,
C2.UserID,
C2.cnt,
C2.Date
from
db1.Customer C1
FULL OUTER JOIN
db2.Customer C2
on C1.UserId=C2.UserId and C1.date=C2.Date

Getting duplicates with additional information

I've inherited a database and I'm having trouble constructing a working SQL query.
Suppose this is the data:
[Products]
| Id | DisplayId | Version | Company | Description |
|---- |----------- |---------- |-----------| ----------- |
| 1 | 12345 | 0 | 16 | Random |
| 2 | 12345 | 0 | 2 | Random 2 |
| 3 | AB123 | 0 | 1 | Random 3 |
| 4 | 12345 | 1 | 16 | Random 4 |
| 5 | 12345 | 1 | 2 | Random 5 |
| 6 | AB123 | 0 | 5 | Random 6 |
| 7 | 12345 | 2 | 16 | Random 7 |
| 8 | XX45 | 0 | 5 | Random 8 |
| 9 | XX45 | 0 | 7 | Random 9 |
| 10 | XX45 | 1 | 5 | Random 10 |
| 11 | XX45 | 1 | 7 | Random 11 |
[Companies]
| Id | Code |
|---- |-----------|
| 1 | 'ABC' |
| 2 | '456' |
| 5 | 'XYZ' |
| 7 | 'XYZ' |
| 16 | '456' |
The Versioncolumn is a version number. Higher numbers indicate more recent versions.
The Company column is a foreign key referencing the Companies table on the Id column.
There's another table called ProductData with a ProductId column referencing Products.Id.
Now I need to find duplicates based on the DisplayId and the corresponding Companies.Code. The ProductData table should be joined to show a title (ProductData.Title), and only the most recent ones should be included in the results. So the expected results are:
| Id | DisplayId | Version | Company | Description | ProductData.Title |
|---- |----------- |---------- |-----------|------------- |------------------ |
| 5 | 12345 | 1 | 2 | Random 2 | Title 2 |
| 7 | 12345 | 2 | 16 | Random 7 | Title 7 |
| 10 | XX45 | 1 | 5 | Random 10 | Title 10 |
| 11 | XX45 | 1 | 7 | Random 11 | Title 11 |
because XX45 has 2 "entries": one with Company 5 and one with Company 7, but both companies share the same code.
because 12345 has 2 "entries": one with Company 2 and one with Company 16, but both companies share the same code. Note that the most recent version of both differs (version 2 for company 16's entry and version 1 for company 2's entry)
ABC123 should not be included as its 2 entries have different company codes.
I'm eager to learn your insights...
Based on your sample data, you just need to JOIN the tables:
SELECT
p.Id, p.DisplayId, p.Version, p.Company, d.Title
FROM Products AS p
INNER JOIN Companies AS c ON p.Company = c.Id
INNER JOIN ProductData AS d ON d.ProductId = p.Id;
But if you want the latest one, you can use the ROW_NUMBER():
WITH CTE
AS
(
SELECT
p.Id, p.DisplayId, p.Version, p.Company, d.Title,
ROW_NUMBER() OVER(PARTITION BY p.DisplayId,p.Company ORDER BY p.Id DESC) AS RN
FROM Products AS p
INNER JOIN Companies AS c ON p.Company = c.Id
INNER JOIN ProductData AS d ON d.ProductId = p.Id
)
SELECT *
FROM CTE
WHERE RN = 1;
sample fiddle
| Id | DisplayId | Version | Company | Title |
|----|-----------|---------|---------|----------|
| 5 | 12345 | 1 | 2 | Title 5 |
| 7 | 12345 | 2 | 16 | Title 7 |
| 10 | XX45 | 1 | 5 | Title 10 |
| 11 | XX45 | 1 | 7 | Title 11 |
If i understood you correctly, you can use CTE to find all the duplicated rows from your table, then you can just use SELECT from CTE and even add more manipulations.
WITH CTE AS(
SELECT Id,DisplayId,Version,Company,Description,ProductData.Title
RN = ROW_NUMBER()OVER(PARTITION BY DisplayId, Company ORDER BY p.Id DESC)
FROM dbo.YourTable1
)
SELECT *
FROM CTE
Try this:
SELECT b.ID,displayid,version,company,productdata.title
FROM
(select A.ID,a.displayid,version,a.company,rn,a.code, COUNT(displayid) over (partition by displayid,code) cnt from
(select Prod.ID,displayid,version,company,Companies.code, Row_number() over (partition by displayid,company order by version desc) rn
from Prod inner join Companies on Prod.Company = Companies.id) a
where a.rn=1) b inner join productdata on b.id = productdata.id where cnt =2
You have to first get the current version and then you see how many times the DisplayID + Code show-up. Then based on that you can select only the ones that have a count greater than one. You can then INNER JOIN ProductData on the final query to get the Title.
WITH
MaxVersion AS --Get the current versions
(
SELECT
MAX(Version) AS Version,
DisplayID,
Company
FROM
#TmpProducts
GROUP BY
DisplayID,
Company
)
,CTE AS
(
SELECT
p.DisplayID,
c.Code,
COUNT(*) AS RowCounter
FROM
#TmpProducts p
INNER JOIN
#TmpCompanies c
ON
c.ID = p.Company
INNER JOIN
MaxVersion mv
ON
mv.DisplayID = p.DisplayID
AND mv.Version = p.Version
AND mv.Company = p.Company
GROUP BY
p.DisplayID,
c.Code
)
SELECT
p.*
FROM
#TmpProducts p
INNER JOIN
CTE c
ON
c.DisplayID = p.DisplayID
INNER JOIN
MaxVersion mv
ON
mv.DisplayID = p.DisplayID
AND mv.Company = p.Company
AND mv.Version = p.Version
WHERE
c.RowCounter > 1

Where to use Outer Apply

MASTER TABLE
x------x--------------------x
| Id | Name |
x------x--------------------x
| 1 | A |
| 2 | B |
| 3 | C |
x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x
| Id | PERIOD | QTY |
x------x--------------------x-------x
| 1 | 2014-01-13 | 10 |
| 1 | 2014-01-11 | 15 |
| 1 | 2014-01-12 | 20 |
| 2 | 2014-01-06 | 30 |
| 2 | 2014-01-08 | 40 |
x------x--------------------x-------x
I am getting the same results when LEFT JOIN and OUTER APPLY is used.
LEFT JOIN
SELECT T1.ID,T1.NAME,T2.PERIOD,T2.QTY
FROM MASTER T1
LEFT JOIN DETAILS T2 ON T1.ID=T2.ID
OUTER APPLY
SELECT T1.ID,T1.NAME,TAB.PERIOD,TAB.QTY
FROM MASTER T1
OUTER APPLY
(
SELECT ID,PERIOD,QTY
FROM DETAILS T2
WHERE T1.ID=T2.ID
)TAB
Where should I use LEFT JOIN AND where should I use OUTER APPLY
A LEFT JOIN should be replaced with OUTER APPLY in the following situations.
1. If we want to join two tables based on TOP n results
Consider if we need to select Id and Name from Master and last two dates for each Id from Details table.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
LEFT JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
which forms the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | NULL | NULL |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
This will bring wrong results ie, it will bring only latest two dates data from Details table irrespective of Id even though we join with Id. So the proper solution is using OUTER APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
OUTER APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
Here is the working : In LEFT JOIN , TOP 2 dates will be joined to the MASTER only after executing the query inside derived table D. In OUTER APPLY, it uses joining WHERE M.ID=D.ID inside the OUTER APPLY, so that each ID in Master will be joined with TOP 2 dates which will bring the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
2. When we need LEFT JOIN functionality using functions.
OUTER APPLY can be used as a replacement with LEFT JOIN when we need to get result from Master table and a function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
OUTER APPLY dbo.FnGetQty(M.ID) C
And the function goes here.
CREATE FUNCTION FnGetQty
(
#Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=#Id
)
which generated the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
3. Retain NULL values when unpivoting
Consider you have the below table
x------x-------------x--------------x
| Id | FROMDATE | TODATE |
x------x-------------x--------------x
| 1 | 2014-01-11 | 2014-01-13 |
| 1 | 2014-02-23 | 2014-02-27 |
| 2 | 2014-05-06 | 2014-05-30 |
| 3 | NULL | NULL |
x------x-------------x--------------x
When you use UNPIVOT to bring FROMDATE AND TODATE to one column, it will eliminate NULL values by default.
SELECT ID,DATES
FROM MYTABLE
UNPIVOT (DATES FOR COLS IN (FROMDATE,TODATE)) P
which generates the below result. Note that we have missed the record of Id number 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
x------x-------------x
In such cases an APPLY can be used(either CROSS APPLY or OUTER APPLY, which is interchangeable).
SELECT DISTINCT ID,DATES
FROM MYTABLE
OUTER APPLY(VALUES (FROMDATE),(TODATE))
COLUMNNAMES(DATES)
which forms the following result and retains Id where its value is 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
| 3 | NULL |
x------x-------------x
In your example queries the results are indeed the same.
But OUTER APPLY can do more: For each outer row you can produce an arbitrary inner result set. For example you can join the TOP 1 ORDER BY ... row. A LEFT JOIN can't do that.
The computation of the inner result set can reference outer columns (like your example did).
OUTER APPLY is strictly more powerful than LEFT JOIN. This is easy to see because each LEFT JOIN can be rewritten to an OUTER APPLY just like you did. It's syntax is more verbose, though.

how to check date records with in fdate and tdate range in sql server

Hi friends i have small doubt in sql server
here i want data based on condition
same id and status is equal to s then that date value be
how to write query in sql server
Table :emp
id |status |date(mm-dd-yy) |fdate(mm-dd-yy) |tdate(mm-dd-yy)
1 | S |03-16-11 | |
1 | b | | 03-15-11 |03-18-11
1 | s |03-17-11 | |
1 | b | | 04-20-12 |04-30-12
1 | S |04-20-12 | |
1 | s |04-10-12 | |
1 | s |10-01-14 | |
1 | b | |10-02-14 |10-25-14
2 | s |01-18-12 | |
2 | b | |01-18-12 |01-28-12
2 | b | |03-10-13 |03-24-13
2 | s |03-16-13 | |
2 | s |03-10-13 | |
2 | s |03-23-13 | |
2 | b | |04-20-13 |04-27-13
2 | s |07-01-14 | |
the table (status = s, id, date) compare it with status = b, same id number and date ( Date value from status s) with the date range of fdate and tdate .
if that data with in range then Billing yes other wise billing no
output like
id |status |date(mm-dd-yy) |fdate(mm-dd-yy) |tdate(mm-dd-yy) |Billing
1 | S |03-16-11 | | |yes
1 | s |03-17-11 | | |yes
1 | S |04-20-12 | | |yes
1 | s |04-10-12 | | |no
1 | s |10-01-14 | | |no
2 | s |01-18-12 | | |yes
2 | s |03-16-13 | | |yes
2 | s |03-10-13 | | |yes
2 | s |03-23-13 | | |yes
2 | s |07-01-14 | | |no
i tried query like below
select *
from ( select * from emp a where status ='s') a
inner join (select * from emp b where status='b') b
on a.pn=b.pn
where a.date<=b.date1 and a.date>=b.date2
its not give exactely result.
please tell me how to write query in sql server .
Try
select a.Id,
a.status,
a.date,
a.fdate,
a.tdate,
max(IsNull(case when a.date between b.fDate and b.tDate
then 'yes'
else 'no'
end, 'no')) Billing
from emp a
left join emp b
on a.Id=b.Id
where a.status ='s'
and b.status = 'b'
group by a.Id,
a.status,
a.date,
a.fdate,
a.tdate
Some questions/comments:
What are the fields: pn, date1 and date2?
date1 in your query is, I guess, bigger than date2

Resources