T-SQL : how to select row with multiple conditions - sql-server

I have below data set in SQL Server and I need to select the data with conditions in order below:
First, check to see if date_end is 1/1/2099, then select the row that has smallest days gap and skill_group is not SWAT for rows have same employee_id, in this case that is row 2.
Second, for rows that do not have 1/1/2099 date_end, select row that has most recent day date_end, in this case it's row 4.
ID employee_id last_name first_name date_start date_end skill_group
---------------------------------------------------------------------------
1 N05E0F Mike Pamela 12/19/2013 1/1/2099 SWAT
2 N05E0F Mike Pamela 9/16/2015 1/1/2099 Welcome Team
3 NSH8A David Smith 12/19/2013 9/16/2016 Unlicensed
4 NSH8A David Smith 8/16/2015 10/16/2016 CMT

There are many ways to do this. Here are some of them:
top with ties version:
select top 1 with ties
*
from tbl
where skill_group != 'SWAT'
order by
row_number() over (
partition by employee_id
order by date_end desc, datediff(day,date_start,date_end) asc
)
with common_table_expression as () using row_number() version:
with cte as (
select *
, rn = row_number() over (
partition by employee_id
order by date_end desc, datediff(day,date_start,date_end) asc
)
from tbl
where skill_group != 'SWAT'
)
select *
from cte
where rn = 1

Related

Update gaps in sequential table

I have a table that contains employee bank data
Employee |Bank |Date |Delta
---------------------------------------------------
Smith |Vacation |2023-01-01 |15.0
Smith |Vacation |2023-01-02 |Null
Smith |Vacation |2023-01-03 |Null
Smith |Vacation |2023-01-04 |7.5
I would like to write a statement so that I can update 2023-01-02 and 2023-01-03 with the Delta value from January 1. Essentially, I want to use the value from the most recent row that isn't > than the date on the row.
Once complete, I want the table to look like this:
Employee |Bank |Date |Delta
---------------------------------------------------
Smith |Vacation |2023-01-01 |15.0
Smith |Vacation |2023-01-02 |15.0
Smith |Vacation |2023-01-03 |15.0
Smith |Vacation |2023-01-04 |7.5
The source table has a unique index consisting of Employee, Bank and Date descending. There could be up to 2 billion rows in the table.
I currently update the table with the following, but I am wondering if there is a more efficient way to do so?
WITH cte_date
AS (SELECT dd.date_key,
db.balance_key,
feb.employee_key
FROM shared.dim_date dd
CROSS JOIN
(
SELECT DISTINCT
employee_key
FROM wfms.fact_employee_balance
) feb
CROSS JOIN wfms.dim_balance db
WHERE dd.date BETWEEN DATEFROMPARTS(DATEPART(YY, GETDATE()) - 2, 12, 31) AND GETDATE())
SELECT dd.*,
t.delta
INTO wfms.test2
FROM cte_date dd
LEFT JOIN wfms.test1 t ON dd.balance_key = t.balance_key
AND dd.employee_key = t.employee_key
AND t.date_key = (SELECT TOP 1 tt1.date_key
FROM wfms.test1 tt1
WHERE tt1.balance_key = t.balance_key
AND tt1.employee_key = t.employee_key
AND tt1.date_key < dd.date_key);
Just for fun, I wanted to test an idea.
For the moment, lets assume the gaps are not too wide ... In this example 7 days.
On a relative to batch, the lag() over() approach was 22% while the Cross Apply was 78%.
Again, Just for fun
Select Employee
,Bank
,Date
,Delta = coalesce(A.Delta
,lag(Delta,1) over (partition by Employee,Bank order by date)
,lag(Delta,2) over (partition by Employee,Bank order by date)
,lag(Delta,3) over (partition by Employee,Bank order by date)
,lag(Delta,4) over (partition by Employee,Bank order by date)
,lag(Delta,5) over (partition by Employee,Bank order by date)
,lag(Delta,6) over (partition by Employee,Bank order by date)
,lag(Delta,7) over (partition by Employee,Bank order by date)
)
From YourTable A
Versus
Select Employee
,Bank
,Date
,Delta = coalesce(A.Delta,B.Delta)
From YourTable A
Cross Apply ( Select top 1 Delta
From YourTable
Where Employee=A.Employee
and A.Bank = Bank
and Delta is not null
and A.Date>=Date
Order By Date desc
) B
Update
Same results with 20 days
Here is another way. Using sum() with window function to find the group "Grp" of rows (1 row with not null with subsequent rows of null). Finally max(Delta) of the Grp to return the not null value.
select Employee, Bank, [Date], max (max(Delta))
over (partition by Employee, Bank, Grp)
from
(
select *, Grp = sum (case when Delta is not null then 1 else 0 end)
over (partition by Employee,Bank
order by [Date])
from YourTable
) t
group by Employee, Bank, [Date], Grp

SQL Server Window Paging Based on # of Groups

Given the following table structure
Column
Id
Name
DateCreated
with the following data
id
Name
DateCreated
1
Joe
1/13/2021
2
Fred
1/13/2021
3
Bob
1/12/2021
4
Sue
1/12/2021
5
Sally
1/10/2021
6
Alex
1/9/2021
I need SQL that will page over the data based on datecreated. The query should return the top 3 records, and any record which also shares the datecreated of the top 3.
So give the data above, we should get back Joe, Fred and Bob (as the top 3 records) plus Sue since sue has the same date as Bob.
Is there something like ROW_NUMBER that increments for each row where it encounters a different value.
For some context this query is being used to generate an agenda type view, and once we select any date we want to keep all data for that date together.
EDIT
I do have a solution but it smells:
;WITH CTE AS ( SELECT ROW_NUMBER() OVER(ORDER BY DateCreated DESC) RowNum,CAST(DateCreated AS DATE) DateCreated,Name
FROM MyTable),
PAGE AS (SELECT *
FROM CTE
WHERE RowNum<=5)
SELECT *
FROM Page
UNION
SELECT *
FROM CTE
WHERE DateCreated=(SELECT MIN(DateCreated) FROM Page)
I've used a TOP 3 WITH TIES example and a ROW_NUMBER example and a CTE to return four records:
DROP TABLE IF EXISTS #tmp
GO
CREATE TABLE #tmp (
Id INT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
dateCreated DATE
)
GO
INSERT INTO #tmp VALUES
( 1, 'Joe', '13 Jan 2021' ),
( 2, 'Fred', '13 Jan 2021' ),
( 3, 'Bob', '12 Jan 2021' ),
( 4, 'Sue', '12 Jan 2021' ),
( 5, 'Sally', '10 Jan 2021' ),
( 6, 'Alex', '9 Jan 2021' )
GO
-- Gets same result
SELECT TOP 3 WITH TIES *
FROM #tmp t
ORDER BY dateCreated DESC
;WITH cte AS (
SELECT ROW_NUMBER() OVER( ORDER BY dateCreated DESC ) rn, *
FROM #tmp
)
SELECT *
FROM #tmp t
WHERE EXISTS
(
SELECT *
FROM cte c
WHERE rn <=3
AND t.dateCreated = c.dateCreated
)
My results:
As #Charlieface, we only need to replace ROW_NUMBER with DENSE_RANK. So that the ROW_NUMBER will be tied according to the same value.
When we run the query:
SELECT DENSE_RANK () OVER(ORDER BY DateCreated DESC) RowNum,CAST(DateCreated AS DATE) DateCreated,Name
FROM MyTable
The result will show as follows:
So as a result, we can set RowNum<=3 in the query to get the top 3:
;WITH CTE AS ( SELECT DENSE_RANK() OVER(ORDER BY DateCreated DESC) RowNum,CAST(DateCreated AS DATE) DateCreated,Name
FROM MyTable),
PAGE AS (SELECT *
FROM CTE
WHERE RowNum<=3)
SELECT *
FROM Page
UNION
SELECT *
FROM CTE
WHERE DateCreated=(SELECT MIN(DateCreated) FROM Page)
The First one is as yours the second one is as above. The results of the two queries are the same.
Kindly let us know if you need more infomation.

SQL Server Query for account statement

How can I generate last three transactions from the below table?
Date Tran dr cr total
-------------------------------------
2017-04-13
2017-07-15
2017-07-15
2017-10-17
2017-10-17 abc 10 10
2017-11-12 def 10 20
2017-11-12 ghi 5 15
I'm using SQL Server 2012
Like this you should your expected result:
SELECT * FROM
(
SELECT TOP 3 *
FROM TransactionTable
ORDER BY [Date] DESC
) AS t
ORDER by t.[Date]
if your requirement is to get the 3 transactions with the latest date. you can use either of the following.
Simple Order by :
select
top 3
* from YourTable
where isnull(Tran,'')<>''
order by [Date] desc
using Row Number
;with cte
as
(
select
seqno = row_number() over(order by [date] desc),
*
from YourTable
where isnull(Tran,'')<>''
)
select
* from cte
where SeqNo <=3
order by SeqNo desc

SQL Server query should return max value records

I have table like this:
id_Seq_No emp_name Current_Property_value
-----------------------------------------------
1 John 100
2 Peter 200
3 Pollard 50
4 John 500
I want the max record value of particular employee.
For example, John has 2 records seq_no 1, 4. I want 4th seq_no Current_Property_Value in single query.
Select
max(id_Seq_No)
from
t1
where
emp_name = 'John'
To get the Current_Property_value, just order the results by id_Seq_No and get the first one:
SELECT
TOP 1 Current_Property_value
FROM
table
WHERE
emp_name = 'John'
ORDER BY
id_Seq_No DESC
this will give highest for all tied employees
select top 1 with ties
id_Seq_No,emp_name,Current_Property_value
from
table
order by
row_number() over (partition by emp_name order by Current_Property_value desc)
You can use ROW_NUMBER with CTE.
Query
;WITH CTE AS(
SELECT rn = ROW_NUMBER() OVER(
PARTITION BY emp_name
ORDER BY id_Seq_No DESC
), *
FROM your_table_name
WHERE emp_name = 'John'
)
SELECT * FROM CTE
WHERE rn = 1;

how to to get first top 6 records indifferent columns T-sql?

I got a situation to display first top 6 records. first 3 records in FirstCol and next 3 in SecondCol. My query is like this:
select top 6 [EmpName]
from [Emp ]
order by [Salary] Desc
Result:
[EmpName]
----------------------
Sam
Pam
Oliver
Jam
Kim
Nixon
But I want the result to look like this:
FirstCol SecondCol
Sam Jam
Pam Kim
Oliver Nixon
; WITH TOP_3 AS
(
select TOP 3 [EmpName]
,ROW_NUMBER() OVER (ORDER BY [Salary] Desc) rn
from [Emp ]
order by [Salary] Desc
),
Other3 AS
(
SELECT [EmpName]
,ROW_NUMBER() OVER (ORDER BY [Salary] Desc) rn
FROM Employees
ORDER BY [Salary] DESC OFFSET 3 ROWS FETCH NEXT 3 ROWS ONLY
)
SELECT T3.[EmpName] , O3.[EmpName]
FROM TOP_3 T3 INNER JOIN Other3 O3
ON T3.RN = O3.RN
ORDER BY T3.RN ASC
You can do this using several windowing functions, this is kind of ugly but it will get you the result that you want:
;with data as
(
-- get your Top 6
select top 6 empname, salary
from emp
order by salary desc
),
buckets as
(
-- use NTILE to split the six rows into 2 buckets
select empname,
nt = ntile(2) over(order by salary desc),
salary
from data
)
select
FirstCol = max(case when nt = 1 then empname end),
SecondCol = max(case when nt = 2 then empname end)
from
(
-- create a row number for each item in the buckets to return multiple rows
select empname,
nt,
rn = row_number() over(partition by nt order by salary desc)
from buckets
) d
group by rn;
See SQL Fiddle with Demo. This uses the function NTILE, this takes your dataset of six rows and splits it into two buckets - 3 rows in bucket 1 and 3 rows in bucket 2. The (2) inside the NTILE is used to determine the number of buckets.
Next I used row_number() to create a unique value for each row within each bucket, this allows you to return multiple rows for each column.

Resources