How to retrieve the third most recent date grouped by another column? - sql-server

EDIT: I am using SQL Server 2005
So here's a tricky one. For audit purposes, we need to make 3 attempts to contact a customer. We can make more than 3 attempts to go above and beyond, but for audit purposes I need to retrieve the date of the third most recent attempt for each customer.
In most cases, you just need the most recent date, so you can do something like..
SELECT CustID,MAX(AttemptDate) FROM Attempts GROUP BY CustID
.. but that obviously won't work in this scenario.
Say I have a table of attempts that occur which are tied to a customer.
CustID AttemptDate
123 2014-01-02
123 2014-01-05
123 2014-01-06 * retrieve this one
123 2014-01-07
123 2014-01-10
555 2014-02-01
555 2014-02-03
555 2014-02-07 * retrieve this one
555 2014-02-12
555 2014-02-20
Output:
CustID AttemptDate
123 2014-01-06
555 2014-02-07
Any tips for pulling this off?

;WITH t AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY CustId ORDER BY AttemptDate DESC) AS nth_most_recent
FROM MyTable
)
SELECT *
FROM t
WHERE nth_most_recent = 3

The ROW_NUMBER ranking function is your friend here:
WITH cte (CustId, AttemptDate, AttemptNumber) AS (
SELECT
CustId,
AttemptDate,
ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY AttemptDate DESC) AS AttemptNumber
FROM Attempts
)
SELECT
CustId,
AttemptDate
FROM cte
WHERE AttemptNumber = 3
Alternatively, if the common table expression syntax is causing problems, you could use a subquery:
SELECT
CustId,
AttemptDate
FROM (
SELECT
CustId,
AttemptDate,
ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY AttemptDate DESC) AS AttemptNumber
FROM Attempts
) sq
WHERE AttemptNumber = 3
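To check the ROW_NUMBER approach against the sample data from the question, here is a minimal sketch. The table variable @Attempts and the extra customer 999 are assumptions added for illustration. It sticks to syntax that should be available in SQL Server 2005 (datetime instead of date, UNION ALL instead of a multi-row VALUES list).

```sql
-- Sample data from the question, loaded into a hypothetical table variable.
DECLARE @Attempts TABLE (CustID int, AttemptDate datetime);

INSERT INTO @Attempts (CustID, AttemptDate)
SELECT 123, '20140102' UNION ALL
SELECT 123, '20140105' UNION ALL
SELECT 123, '20140106' UNION ALL
SELECT 123, '20140107' UNION ALL
SELECT 123, '20140110' UNION ALL
SELECT 555, '20140201' UNION ALL
SELECT 555, '20140203' UNION ALL
SELECT 555, '20140207' UNION ALL
SELECT 555, '20140212' UNION ALL
SELECT 555, '20140220' UNION ALL
SELECT 999, '20140301';  -- hypothetical customer with a single attempt

-- Number each customer's attempts from newest (1) to oldest, keep number 3.
SELECT CustID, AttemptDate
FROM (
    SELECT CustID,
           AttemptDate,
           ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY AttemptDate DESC) AS AttemptNumber
    FROM @Attempts
) AS sq
WHERE AttemptNumber = 3
ORDER BY CustID;
-- Returns 2014-01-06 for customer 123 and 2014-02-07 for customer 555.
-- Customer 999 has fewer than three attempts and is not returned.
```

Two things worth keeping in mind: customers with fewer than three attempts simply drop out of the result, and if two attempts share the same AttemptDate, ROW_NUMBER picks between them arbitrarily unless a tie-breaking column is added to the ORDER BY.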

Related

Update gaps in sequential table

I have a table that contains employee bank data
Employee |Bank |Date |Delta
---------------------------------------------------
Smith |Vacation |2023-01-01 |15.0
Smith |Vacation |2023-01-02 |Null
Smith |Vacation |2023-01-03 |Null
Smith |Vacation |2023-01-04 |7.5
I would like to write a statement so that I can update 2023-01-02 and 2023-01-03 with the Delta value from January 1. Essentially, each row with a null Delta should take the Delta from the most recent earlier row that has one.
Once complete, I want the table to look like this:
Employee |Bank |Date |Delta
---------------------------------------------------
Smith |Vacation |2023-01-01 |15.0
Smith |Vacation |2023-01-02 |15.0
Smith |Vacation |2023-01-03 |15.0
Smith |Vacation |2023-01-04 |7.5
The source table has a unique index consisting of Employee, Bank and Date descending. There could be up to 2 billion rows in the table.
I currently update the table with the following, but I am wondering if there is a more efficient way to do so?
WITH cte_date
AS (SELECT dd.date_key,
db.balance_key,
feb.employee_key
FROM shared.dim_date dd
CROSS JOIN
(
SELECT DISTINCT
employee_key
FROM wfms.fact_employee_balance
) feb
CROSS JOIN wfms.dim_balance db
WHERE dd.date BETWEEN DATEFROMPARTS(DATEPART(YY, GETDATE()) - 2, 12, 31) AND GETDATE())
SELECT dd.*,
t.delta
INTO wfms.test2
FROM cte_date dd
LEFT JOIN wfms.test1 t ON dd.balance_key = t.balance_key
AND dd.employee_key = t.employee_key
AND t.date_key = (SELECT TOP 1 tt1.date_key
FROM wfms.test1 tt1
WHERE tt1.balance_key = t.balance_key
AND tt1.employee_key = t.employee_key
AND tt1.date_key < dd.date_key);
Just for fun, I wanted to test an idea.
For the moment, let's assume the gaps are not too wide ... in this example, at most 7 days.
In terms of query cost relative to the batch, the lag() over() approach was 22% while the Cross Apply was 78%. Note that lag() requires SQL Server 2012 or later.
Again, just for fun
Select Employee
,Bank
,Date
,Delta = coalesce(A.Delta
,lag(Delta,1) over (partition by Employee,Bank order by date)
,lag(Delta,2) over (partition by Employee,Bank order by date)
,lag(Delta,3) over (partition by Employee,Bank order by date)
,lag(Delta,4) over (partition by Employee,Bank order by date)
,lag(Delta,5) over (partition by Employee,Bank order by date)
,lag(Delta,6) over (partition by Employee,Bank order by date)
,lag(Delta,7) over (partition by Employee,Bank order by date)
)
From YourTable A
Versus
Select Employee
,Bank
,Date
,Delta = coalesce(A.Delta,B.Delta)
From YourTable A
Cross Apply ( Select top 1 Delta
From YourTable
Where Employee=A.Employee
and A.Bank = Bank
and Delta is not null
and A.Date>=Date
Order By Date desc
) B
Update
Same results with 20 days
Here is another way. Use sum() as a window function to assign each row to a group "Grp", where a group is one row with a non-null Delta followed by the subsequent null rows. Then take max(Delta) within each Grp to return the non-null value.
select Employee, Bank, [Date], max (max(Delta))
over (partition by Employee, Bank, Grp)
from
(
select *, Grp = sum (case when Delta is not null then 1 else 0 end)
over (partition by Employee,Bank
order by [Date])
from YourTable
) t
group by Employee, Bank, [Date], Grp
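To make the grouping step in the query above easier to see, here is a small sketch that exposes Grp next to the filled value. The table variable @Balances and the column alias FilledDelta are assumptions for illustration, and it assumes SQL Server 2012 or later, since it uses ORDER BY inside an aggregate window. The explicit ROWS frame is my choice; with one row per date it should give the same groups as the default frame.

```sql
-- Sample rows from the question, loaded into a hypothetical table variable.
DECLARE @Balances TABLE (
    Employee varchar(20),
    Bank     varchar(20),
    [Date]   date,
    Delta    decimal(9, 1) NULL
);

INSERT INTO @Balances (Employee, Bank, [Date], Delta)
VALUES ('Smith', 'Vacation', '20230101', 15.0),
       ('Smith', 'Vacation', '20230102', NULL),
       ('Smith', 'Vacation', '20230103', NULL),
       ('Smith', 'Vacation', '20230104', 7.5);

-- Inner query: running count of non-null Delta values, so every non-null row
-- starts a new group and the null rows after it inherit its group number.
-- Outer query: each group has exactly one non-null Delta, so MAX returns it.
SELECT Employee, Bank, [Date], Delta, Grp,
       MAX(Delta) OVER (PARTITION BY Employee, Bank, Grp) AS FilledDelta
FROM (
    SELECT Employee, Bank, [Date], Delta,
           SUM(CASE WHEN Delta IS NOT NULL THEN 1 ELSE 0 END)
               OVER (PARTITION BY Employee, Bank
                     ORDER BY [Date]
                     ROWS UNBOUNDED PRECEDING) AS Grp
    FROM @Balances
) AS t
ORDER BY Employee, Bank, [Date];
-- Grp is 1, 1, 1, 2 and FilledDelta is 15.0, 15.0, 15.0, 7.5.
```

Unlike the chained lag() version, this does not depend on a maximum gap width, so it keeps working if a run of nulls is longer than expected.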

Query to display employee names, dept no and highest salary per dept no

I need a query to display the employee names, dept no and highest salary for each dept no.
Example
ENAME SAL dept no
KING 5000 10
FORD 3000 20
SCOTT 3000 20
BLAKE 2850 30
As the commenter said, it's helpful when you provide more detail about what you've tried.
Nonetheless, I think you're looking for something like this:
;with cte as (
select ename, sal, [dept no]
, row_number() over (partition by [dept no] order by sal desc, ename) rn
from your_table
)
select *
from cte
where rn = 1
Note that your example data shows the same SAL of 3000 twice for [dept no] 20. To break ties, I added ename to the ORDER BY clause, so only one row per department is returned.
This query returns the expected output, including both tied rows for dept no 20:
select ENAME, SAL, [deptno]
from (
select *
, DENSE_RANK() over(partition by [deptno] order by sal desc) as Highest_sal
from Employee
) a
where a.Highest_sal = 1
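The two answers above behave differently when salaries tie, which a side-by-side sketch can make concrete. The table variable @Employee, the DeptNo column name, and the rows for CLARK and ADAMS are assumptions added so that each department has someone below the top salary.

```sql
-- Rows from the question plus two hypothetical lower-paid employees.
DECLARE @Employee TABLE (ENAME varchar(10), SAL int, DeptNo int);

INSERT INTO @Employee (ENAME, SAL, DeptNo)
VALUES ('KING',  5000, 10),
       ('CLARK', 2450, 10),   -- hypothetical
       ('FORD',  3000, 20),
       ('SCOTT', 3000, 20),
       ('ADAMS', 1100, 20),   -- hypothetical
       ('BLAKE', 2850, 30);

-- rn numbers rows uniquely within a department (ties broken by name);
-- dr gives tied salaries the same rank.
SELECT ENAME, SAL, DeptNo,
       ROW_NUMBER() OVER (PARTITION BY DeptNo ORDER BY SAL DESC, ENAME) AS rn,
       DENSE_RANK() OVER (PARTITION BY DeptNo ORDER BY SAL DESC)        AS dr
FROM @Employee
ORDER BY DeptNo, rn;
-- In dept 20: FORD has rn 1, dr 1; SCOTT has rn 2, dr 1; ADAMS has rn 3, dr 2.
```

Filtering on rn = 1 would keep only FORD for dept 20, while filtering on dr = 1 keeps both FORD and SCOTT, which matches the example output in the question.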

SQL Server - Group last paid price by customer and product code

I am trying to write a query that will do the following.
I have a table with separate sales order lines on it. Each line details the customer, product sold, the price sold at, and the date of the sale.
I am trying to establish for each product code, what the last price we sold it for was for each separate customer.
For example, using my input below I would expect Product code ABC to return '10' for Brian, '20' for Gary, and '50' for Sam.
Below is the complete set of results I would expect for all product codes.
Input
Order No Customer Product Code Price Date
-----------------------------------------------------------
1 Brian ABC 10 12/04/2018
2 Brian ABC 14 01/04/2018
3 Gary ABC 20 12/04/2018
4 Gary ABC 35 12/04/2017
5 Sam ABD 40 06/08/2017
6 Sam ABC 50 20/08/2017
7 Adam ABE 20 15/06/2016
8 Adam ABE 30 17/03/2017
Output
Order No Customer Product Code Price Date
-----------------------------------------------------------
1 Brian ABC 10 12/04/2018
3 Gary ABC 20 12/04/2018
6 Sam ABC 50 20/08/2017
5 Sam ABD 40 06/08/2017
8 Adam ABE 30 17/03/2017
You can use Row_number() with partition BY [product code], [customer] for this.
The following query should work for your scenario:
SELECT *
FROM (SELECT [order no],
[customer],
[product code],
[price],
[date],
Row_number()
OVER(
partition BY [product code], [customer]
ORDER BY [date] DESC) AS RN
FROM [table]) T
WHERE T.rn = 1
This is all about row_number(), which lets you partition the rows (similar to grouping) and number them in a given order within each partition:
select top 1 with ties *
from [table]
order by row_number() over (partition by [Product Code],Customer order by date desc)
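The TOP 1 WITH TIES form above can look surprising, so here is a sketch of why it works, run against the question's sample data. The table variable @Sales and its column names (OrderNo, ProductCode, SaleDate) are assumptions, and the dd/mm/yyyy dates from the question are rewritten as yyyymmdd literals.

```sql
-- Sample order lines from the question, loaded into a hypothetical table variable.
DECLARE @Sales TABLE (
    OrderNo     int,
    Customer    varchar(20),
    ProductCode varchar(10),
    Price       int,
    SaleDate    date
);

INSERT INTO @Sales (OrderNo, Customer, ProductCode, Price, SaleDate)
VALUES (1, 'Brian', 'ABC', 10, '20180412'),
       (2, 'Brian', 'ABC', 14, '20180401'),
       (3, 'Gary',  'ABC', 20, '20180412'),
       (4, 'Gary',  'ABC', 35, '20170412'),
       (5, 'Sam',   'ABD', 40, '20170806'),
       (6, 'Sam',   'ABC', 50, '20170820'),
       (7, 'Adam',  'ABE', 20, '20160615'),
       (8, 'Adam',  'ABE', 30, '20170317');

-- The latest sale in every (product, customer) partition gets row number 1.
-- Sorting by that number puts all the 1s first, and WITH TIES returns every
-- row that ties with the first row, i.e. one latest sale per partition.
SELECT TOP 1 WITH TIES OrderNo, Customer, ProductCode, Price, SaleDate
FROM @Sales
ORDER BY ROW_NUMBER() OVER (PARTITION BY ProductCode, Customer
                            ORDER BY SaleDate DESC);
-- Returns order numbers 1, 3, 5, 6 and 8 (in no guaranteed order).
```

This avoids the derived table, at the cost of giving up control over the final sort order, since the ORDER BY is already used for the row number.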
I think this is what you are looking for:
select a.* from #temp a join (
select customer, [product code], max(Date) maxdate
from #temp
group by customer, [product code])b
on a.customer=b.customer and a.date=b.maxdate and a.[product code]=b.[product code]

SQL Server Query for account statement

How can I generate last three transactions from the below table?
Date Tran dr cr total
-------------------------------------
2017-04-13
2017-07-15
2017-07-15
2017-10-17
2017-10-17 abc 10 10
2017-11-12 def 10 20
2017-11-12 ghi 5 15
I'm using SQL Server 2012
Like this you should get your expected result:
SELECT * FROM
(
SELECT TOP 3 *
FROM TransactionTable
ORDER BY [Date] DESC
) AS t
ORDER by t.[Date]
If your requirement is to get the 3 transactions with the latest dates, you can use either of the following. Tran is a reserved keyword in T-SQL, so the column name is bracketed.
Simple ORDER BY:
select
top 3
* from YourTable
where isnull([Tran],'')<>''
order by [Date] desc
Using ROW_NUMBER:
;with cte
as
(
select
seqno = row_number() over(order by [date] desc),
*
from YourTable
where isnull([Tran],'')<>''
)
select
* from cte
where SeqNo <=3
order by SeqNo desc

How to use RANK() in SQL Server

I have a problem using RANK() in SQL Server.
Here’s my code:
SELECT contendernum,
totals,
RANK() OVER (PARTITION BY ContenderNum ORDER BY totals ASC) AS xRank
FROM (
SELECT ContenderNum,
SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a
The results for that query are:
contendernum totals xRank
1 196 1
2 181 1
3 192 1
4 181 1
5 179 1
What my desired result is:
contendernum totals xRank
1 196 1
2 181 3
3 192 2
4 181 3
5 179 4
I want to rank the results based on totals. If two rows have the same value, like 181, they should get the same xRank.
Change:
RANK() OVER (PARTITION BY ContenderNum ORDER BY totals ASC) AS xRank
to:
RANK() OVER (ORDER BY totals DESC) AS xRank
Have a look at this example:
SQL Fiddle DEMO
You might also want to have a look at the difference between RANK (Transact-SQL) and DENSE_RANK (Transact-SQL):
RANK (Transact-SQL)
If two or more rows tie for a rank, each tied rows receives the same
rank. For example, if the two top salespeople have the same SalesYTD
value, they are both ranked one. The salesperson with the next highest
SalesYTD is ranked number three, because there are two rows that are
ranked higher. Therefore, the RANK function does not always return
consecutive integers.
DENSE_RANK (Transact-SQL)
Returns the rank of rows within the partition of a result set, without
any gaps in the ranking. The rank of a row is one plus the number of
distinct ranks that come before the row in question.
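To see the difference described in the documentation excerpts above on the question's own numbers, here is a small sketch. The table variable @Totals is an assumption that stands in for the aggregated subquery, and ROW_NUMBER is included only for comparison.

```sql
-- Totals per contender as shown in the question's result set.
DECLARE @Totals TABLE (ContenderNum int, totals int);

INSERT INTO @Totals (ContenderNum, totals)
VALUES (1, 196), (2, 181), (3, 192), (4, 181), (5, 179);

SELECT ContenderNum,
       totals,
       RANK()       OVER (ORDER BY totals DESC)               AS rnk,
       DENSE_RANK() OVER (ORDER BY totals DESC)               AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY totals DESC, ContenderNum) AS row_num
FROM @Totals
ORDER BY ContenderNum;
-- ContenderNum  totals  rnk  dense_rnk  row_num
-- 1             196     1    1          1
-- 2             181     3    3          3
-- 3             192     2    2          2
-- 4             181     3    3          4
-- 5             179     5    4          5
```

With RANK, contender 5 lands on 5 because two rows share rank 3; only DENSE_RANK reproduces the xRank of 4 shown in the desired result.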
To answer your question title, "How to use Rank() in SQL Server," this is how it works:
I will use this set of data as an example:
create table #tmp
(
column1 varchar(3),
column2 varchar(5),
column3 datetime,
column4 int
)
insert into #tmp values ('AAA', 'SKA', '2013-02-01 00:00:00', 10)
insert into #tmp values ('AAA', 'SKA', '2013-01-31 00:00:00', 15)
insert into #tmp values ('AAA', 'SKB', '2013-01-31 00:00:00', 20)
insert into #tmp values ('AAA', 'SKB', '2013-01-15 00:00:00', 5)
insert into #tmp values ('AAA', 'SKC', '2013-02-01 00:00:00', 25)
You have a partition which basically specifies grouping.
In this example, if you partition by column2, the rank function will create ranks for groups of column2 values. There will be different ranks for rows where column2 = 'SKA' than rows where column2 = 'SKB' and so on.
The ranks are decided like this:
The rank of each row is one plus the number of rows that come before it in its partition, according to the ORDER BY columns. The rank only changes when the ORDER BY values differ from those of the preceding row. Rows with equal ORDER BY values tie and receive the same rank; for example, two rows tied at the top of a partition are both assigned rank one.
Knowing this, if we only wanted to select one value from each group in column2, we could use this query:
with cte as
(
select *,
rank() over (partition by column2
order by column3) rnk
from #tmp
) select * from cte where rnk = 1 order by column3;
Result:
COLUMN1 | COLUMN2 | COLUMN3 |COLUMN4 | RNK
------------------------------------------------------------------------------
AAA | SKB | January, 15 2013 00:00:00+0000 |5 | 1
AAA | SKA | January, 31 2013 00:00:00+0000 |15 | 1
AAA | SKC | February, 01 2013 00:00:00+0000 |25 | 1
SQL DEMO
You have to use DENSE_RANK rather than RANK. The only difference is that it doesn't leave gaps. You also shouldn't partition by ContenderNum, otherwise you're ranking each contender in a separate group, so each is ranked 1st in its own group!
SELECT contendernum,totals, DENSE_RANK() OVER (ORDER BY totals desc) AS xRank FROM
(
SELECT ContenderNum ,SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM dbo.Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a
order by contendernum
A hint for using StackOverflow, please post DDL and sample data so people can help you using less of their own time!
create table Cat1GroupImpersonation (
contendernum int,
criteria1 int,
criteria2 int,
criteria3 int,
criteria4 int);
insert Cat1GroupImpersonation select
1,196,0,0,0 union all select
2,181,0,0,0 union all select
3,192,0,0,0 union all select
4,181,0,0,0 union all select
5,179,0,0,0;
DENSE_RANK() is a rank with no gaps, i.e. it is “dense”.
select Name,EmailId,salary,DENSE_RANK() over(order by salary asc) from [dbo].[Employees]
RANK() leaves gaps in the ranking after tied rows.
select Name,EmailId,salary,RANK() over(order by salary asc) from [dbo].[Employees]
You have already grouped by ContenderNum, no need to partition again by it.
Use DENSE_RANK() and order by totals DESC.
In short,
SELECT contendernum,totals, DENSE_RANK()
OVER (ORDER BY totals DESC)
AS xRank
FROM
(
SELECT ContenderNum ,SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM dbo.Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a
SELECT contendernum,totals, RANK() OVER (ORDER BY totals ASC) AS xRank FROM
(
SELECT ContenderNum ,SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM dbo.Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a
RANK() is good, but it assigns the same rank to equal values. If you need a unique rank for every row, ROW_NUMBER() solves this problem:
ROW_NUMBER() OVER (ORDER BY totals DESC) AS xRank
Select T.Tamil, T.English, T.Maths, T.Total,
Dense_Rank() Over (Order by T.Total Desc) as Std_Rank
From (
select Tamil, English, Maths, (Tamil+English+Maths) as Total
From Student
) as T
