Only show value of Max rows with partition by? - sql-server

The title might be a bit off, but I'm trying to blank out the values in some rows without removing the rows themselves.
This is my table:
SELECT ID,CustomerID,Weight FROM Orders
What I am trying to accomplish is this:
For each CustomerID, keep Weight only on the row with the MAX() ID, and return NULL in Weight for every other row.
Is it possible to do this in one line, with a PARTITION BY?
Something like:
SELECT MAX(ID) OVER (PARTITION BY CustomerID, Weight)... I know this is wrong, but if it is possible to do this without a join or CTE, in a single expression in the SELECT statement, that would be great.

One possible approach is using ROW_NUMBER:
SELECT
ID,
CustomerID,
CASE
WHEN ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY ID DESC) = 1 THEN [Weight]
ELSE Null
END AS [Weight]
FROM #Orders
ORDER BY ID
Input:
CREATE TABLE #Orders (
ID int,
CustomerID int,
[Weight] int
)
INSERT INTO #Orders
(ID, CustomerID, [Weight])
VALUES
(1, 11, 100),
(2, 11, 17),
(3, 11, 35),
(4, 22, 26),
(5, 22, 78),
(6, 22, 10030)
Output:
ID CustomerID Weight
1 11 NULL
2 11 NULL
3 11 35
4 22 NULL
5 22 NULL
6 22 10030

Try this
;WITH CTE
AS
(
SELECT
MAX_ID = MAX(ID) OVER(PARTITION BY CustomerId),
ID,
CustomerId,
Weight
FROM Orders
)
SELECT
ID,
CustomerId,
Weight = CASE WHEN ID = MAX_ID THEN Weight ELSE NULL END
FROM CTE

You can try this.
SELECT ID,CustomerId,CASE WHEN ID= MAX(ID) OVER(PARTITION BY CustomerId) THEN Weight ELSE NULL END AS Weight FROM Orders
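The single-expression variants above can be checked against the sample data from the first answer. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained; it needs SQLite 3.25 or later for window functions). It runs the MAX() OVER form and the ROW_NUMBER form side by side.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (ID INTEGER, CustomerID INTEGER, Weight INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, 11, 100), (2, 11, 17), (3, 11, 35), (4, 22, 26), (5, 22, 78), (6, 22, 10030)],
)

# Keep Weight only on the row whose ID is the maximum within its CustomerID partition.
max_over_rows = conn.execute(
    """
    SELECT ID, CustomerID,
           CASE WHEN ID = MAX(ID) OVER (PARTITION BY CustomerID) THEN Weight END AS Weight
    FROM Orders
    ORDER BY ID
    """
).fetchall()

# Same idea with ROW_NUMBER: the row with the highest ID per customer gets number 1.
row_number_rows = conn.execute(
    """
    SELECT ID, CustomerID,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY ID DESC) = 1
                THEN Weight END AS Weight
    FROM Orders
    ORDER BY ID
    """
).fetchall()

for row in max_over_rows:
    print(row)
```

Both queries should blank out Weight on every row except IDs 3 and 6. A CASE without an ELSE branch yields NULL, so the explicit ELSE NULL is optional.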

Related

Snowflake : IN operator

I want something like the following in my query:
select * from table a
where a.id in(select id, max(date) from table a group by id)
I am getting an error here, because IN compares a single value, like =, while the subquery returns two columns.
How can I do this?
Example:
id  date
1   2022-31-01
1   2022-21-03
2   2022-01-01
2   2022-02-01
I need to get only one record per id, based on max(date). The table has more columns than just id and date,
so I need something like this in Snowflake:
select * from table a
where id in(select id,max(date) from table a group by id)
-----------------------
All of the solutions work if I select from the table,
but I have a CASE expression in a view that returns duplicate records.
Example:
create or replace view v_test
as
select * from
(
select id,lastdatetime,*,
case when start_date < timestamp and timestamp < end
and move_date = '9999-12-31' then 'Y'
else 'N' end as IND
from table a
) a
So if anyone selects from the view where IND = 'Y', more than one record is returned per ID.
What I want is to select only the latest record per ID, that is, the row with IND = 'Y' and max(lastdatetime).
How can I incorporate this logic in the view?
I think you are trying to get the latest record for each id?
select *
from table a
qualify row_number() over (partition by id order by date desc) = 1
If we look at your sub-select, using this "data" CTE for the examples:
with data (id, _date) as (
select column1, to_date(column2, 'yyyy-dd-mm') from values
(1, '2022-31-01'),
(1, '2022-21-03'),
(2, '2022-01-01'),
(2, '2022-02-01')
)
select id, max(_date)
from data
group by 1;
it gives:
ID  MAX(_DATE)
1   2022-03-21
2   2022-01-02
which makes it seem you want "the last date per id",
which can classically (ANSI SQL) be written:
select d.*
from data as d
join (
select
id,
max(_date) as max_date
from data
group by 1
) as c
on d.id = c.id and d._date = c.max_date
;
ID  _DATE
1   2022-03-21
2   2022-01-02
which gives you all of the columns for those rows. BUT if several rows share the same last date, you will get all of them in the output.
Another method is to use ROW_NUMBER to pick one and only one row per id, which is the style of answer Mike has given:
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify row_number() over (partition by id order by _date desc) =1 ;
gives:
ID  _DATE       EXTRA
1   2022-03-21  extra_b_double_a
2   2022-01-02  extra_d
Now, if you want "all rows of the last day", your method works, although the QUALIFY approach is faster. Instead of ROW_NUMBER you can use RANK (or DENSE_RANK, as below), which keeps ties:
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify dense_rank() over (partition by id order by _date desc) =1 ;
ID  _DATE       EXTRA
1   2022-03-21  extra_b_double_a
1   2022-03-21  extra_b_double_b
2   2022-01-02  extra_d
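The difference between ROW_NUMBER and DENSE_RANK on tied dates can be reproduced outside Snowflake. This is a minimal sketch using Python's sqlite3 module (an assumption; SQLite has no QUALIFY, so the filter moves into an outer WHERE on a subquery). The helper name latest_rows is hypothetical, and the dates are stored as ISO strings so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, _date TEXT, extra TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        (1, "2022-01-31", "extra_a"),
        (1, "2022-03-21", "extra_b_double_a"),
        (1, "2022-03-21", "extra_b_double_b"),
        (2, "2022-01-01", "extra_c"),
        (2, "2022-01-02", "extra_d"),
    ],
)


def latest_rows(ranking_function):
    """Return the rows ranked first per id; only called with fixed function names."""
    sql = f"""
        SELECT id, _date, extra
        FROM (
            SELECT d.*,
                   {ranking_function}() OVER (PARTITION BY id ORDER BY _date DESC) AS rn
            FROM data AS d
        )
        WHERE rn = 1
        ORDER BY id, extra
    """
    return conn.execute(sql).fetchall()


one_per_id = latest_rows("ROW_NUMBER")  # exactly one row per id; ties broken arbitrarily
with_ties = latest_rows("DENSE_RANK")   # every row that shares the latest date

print(one_per_id)
print(with_ties)
```

ROW_NUMBER returns one of the two tied rows for id 1, and which one is not defined unless the ORDER BY includes a tie-breaker. DENSE_RANK returns both.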
Now, the last thing it almost seems you are asking for is "how do I find the ID with the most recent date (here 1) and get all rows for that ID":
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify id = last_value(id) over (order by _date);
Here is an example of how to use the in operator with a subquery:
select * from table1 t1 where t1.id in (select t2.id from table2 t2);
IN can also match on both columns at once:
select *
from tab AS a
where (a.id, a.date) in (select id, max(date) from tab group by id);
For sample data:
CREATE TABLE tab (id, date)
AS
SELECT column1, to_date(column2, 'yyyy-dd-mm')
FROM VALUES
(1, '2022-31-01'),
(1, '2022-21-03'),
(2, '2022-01-01'),
(2, '2022-02-01');
Output:
ID  DATE
1   2022-03-21
2   2022-01-02
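For this sample data the tuple form of IN should return one row per id, the one carrying the latest date. A minimal sketch that checks this with Python's sqlite3 module follows (an assumption made for a self-contained example; SQLite supports row values in IN since version 3.15, and the dates are stored here as ISO strings).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(1, "2022-01-31"), (1, "2022-03-21"), (2, "2022-01-01"), (2, "2022-01-02")],
)

# Compare the (id, date) pair against the (id, max date) pairs from the subquery.
latest = conn.execute(
    """
    SELECT a.id, a.date
    FROM tab AS a
    WHERE (a.id, a.date) IN (SELECT id, MAX(date) FROM tab GROUP BY id)
    ORDER BY a.id
    """
).fetchall()

print(latest)
```

As with the join version, rows that tie on the latest date would all be returned.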

TSQL: Group by one column, count all rows and keep value on second column based on row_number

I have a query that returns an Id, a Name and the Row_Number() based on some rules.
The query looks like this:
SELECT
tm.id AS Id,
pn.Name AS Name,
ROW_NUMBER() OVER(PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Row
FROM
#tempTable AS tm
LEFT JOIN
names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
The output of the above query looks like the table below with the dummy data
CREATE TABLE people
(
id int,
name varchar(55),
row int
);
INSERT INTO people
VALUES (1, 'John', 1), (1, 'John', 2), (2, 'Mary', 1),
(3, 'Jeff', 1), (4, 'Bill', 1), (4, 'Bill', 2),
(4, 'Bill', 3), (4, 'Billy', 4), (5, 'Bobby', 1),
(5, 'Bob', 2), (5, 'Bob' , 3), (5, 'Bob' , 4);
What I am trying to do is group by the id field and count all rows, but for the name, use the one with row = 1.
My attempt looks like this but, obviously, I get extra rows because I include x.name in the GROUP BY.
SELECT
x.id,
x.name,
COUNT(*) AS Value
FROM
(SELECT
tm.id AS Id,
pn.Name AS Name,
ROW_NUMBER() OVER(PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Row
FROM
#tempTable AS tm
LEFT JOIN
names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
) x
GROUP BY
x.id, x.name
ORDER BY
COUNT(*) DESC
The desired results from the dummy data are:
id name count
------------------
1 John 2
2 Mary 1
3 Jeff 1
4 Bill 4
5 Bobby 4
You can use FIRST_VALUE() window function to get the name of the row with row number = 1 and with the keyword DISTINCT there is no need to GROUP BY:
SELECT DISTINCT tm.id AS Id
, FIRST_VALUE(pn.Name) OVER (PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Name
, COUNT(*) OVER (PARTITION BY tm.id) AS counter
FROM #tempTable AS tm
LEFT JOIN names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
If you can't use FIRST_VALUE() then you can do it with conditional aggregation:
SELECT id,
MAX(CASE WHEN Row = 1 THEN Name END) AS NAME,
COUNT(*) AS Counter
FROM (
SELECT tm.id AS Id
, pn.Name AS Name
, ROW_NUMBER() OVER(PARTITION BY tm.id ORDER BY tm.CreatedDate ASC) AS Row
FROM #tempTable AS tm
LEFT JOIN names pn WITH (NOLOCK) ON tm.nameId = pn.NameId
WHERE ....
) t
GROUP BY id
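The conditional aggregation can be tried directly on the dummy people table from the question. This is a minimal sketch with Python's sqlite3 module (an assumption; the column is named row_num here instead of row, purely to avoid any keyword clash in SQLite).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# row_num stands in for the Row column produced by ROW_NUMBER() in the original query.
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, row_num INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "John", 1), (1, "John", 2), (2, "Mary", 1),
        (3, "Jeff", 1), (4, "Bill", 1), (4, "Bill", 2),
        (4, "Bill", 3), (4, "Billy", 4), (5, "Bobby", 1),
        (5, "Bob", 2), (5, "Bob", 3), (5, "Bob", 4),
    ],
)

# MAX ignores the NULLs produced for rows where row_num <> 1,
# so only the name of the first row survives per id.
summary = conn.execute(
    """
    SELECT id,
           MAX(CASE WHEN row_num = 1 THEN name END) AS name,
           COUNT(*) AS counter
    FROM people
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in summary:
    print(row)
```

The result matches the desired output in the question: id 4 keeps 'Bill' and id 5 keeps 'Bobby', each with a count of 4.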
This could be one solution to your problem: group on both id and the target name (case when p.row = 1 then p.name end) for the counting. Adding WITH ROLLUP to the grouping "rolls up" the count aggregations. A second aggregation on just id can then be used to merge the row values from the intermediate data set.
with cte as
(
select p.id,
case when p.row = 1 then p.name end as name,
count(1) as cnt
from people p
group by p.id, case when p.row = 1 then p.name end with rollup
having grouping(p.id) = 0
)
select cte.id,
max(cte.name) as name,
max(cte.cnt) as [count]
from cte
group by cte.id;
This would be another solution: do a regular count query with grouping on id and fetch the required name afterwards with a cross apply.
with cte as
(
select p.id,
count(1) as cnt
from people p
group by p.id
)
select cte.id,
n.name,
cte.cnt as [count]
from cte
cross apply ( select p.name
from people p
where p.id = cte.id
and p.row = 1 ) n;

Query to get date rows older than a start date (not a simple WHERE)

I have a feeling this is quite simple, but I can't put my finger on the query. I'm trying to list all of an employee's activities together with the location whose start date was in effect at the time of each activity.
create table Locations (EmployeeID int, LocationID int, StartDate date);
create table Activities (EmployeeID int, ActivityID int, [Date] date);
insert into Locations values
(1, 10, '01-01-2010')
, (1, 11, '01-01-2012')
, (1, 11, '01-01-2013');
insert into Activities values
(1, 1, '02-01-2010')
, (1, 2, '04-01-2010')
, (1, 3, '06-06-2014');
Expected result:
EmployeeID LocationID StartDate EmployeeID ActivityID Date
1 10 '01-01-2010' 1 1 '02-01-2010'
1 10 '01-01-2010' 1 2 '04-01-2010'
1 11 '01-01-2013' 1 3 '06-06-2014'
So far I have this, but it's not quite giving me the result I was hoping for. I need to reference only the most recent Location as of each activity; the condition la.StartDate <= a.Date does not filter out older locations, so they are included as well.
select *
from Locations la
inner join Activities a on la.EmployeeID = a.EmployeeID
and la.StartDate <= a.Date
Give this one a try:
with Locations as (
select
EmployeeID, LocationID, cast(StartDate as date) as StartDate -- compare as dates, not as strings
from (values
(1, 10, '01-01-2010')
, (1, 11, '01-01-2012')
, (1, 11, '01-01-2013')
) la (EmployeeID, LocationID, StartDate)
),
Activities as (
select
EmployeeID, ActivityID, cast([Date] as date) as [Date] -- compare as dates, not as strings
from (
values
(1, 1, '02-01-2010')
, (1, 2, '04-01-2010')
, (1, 3, '06-06-2014')
) a (EmployeeID, ActivityID, [Date])
)
select
la.*,
a.*
from Activities a
cross apply (
select
*
from (
select
la.*,
ROW_NUMBER() OVER (
PARTITION BY
la.EmployeeID
ORDER BY
la.StartDate DESC
) seqnum
from Locations la
where
la.EmployeeID = a.EmployeeID and
la.StartDate <= a.Date
) la
where
la.seqnum = 1
) la
Thank you all, but I managed to find the answer:
select *
from Locations la
inner join Activities a on la.EmployeeID = a.EmployeeID
and la.StartDate = (select max(StartDate) from Locations where EmployeeID = la.EmployeeID and StartDate >= la.StartDate and StartDate <= a.Date)
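The "latest location as of each activity" rule can be sketched on the sample data as follows. This uses Python's sqlite3 module as a stand-in for SQL Server (an assumption), stores dates as ISO strings, reads the question's dates as month-day-year (also an assumption), and renames the Date column to ActivityDate for clarity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Locations (EmployeeID INTEGER, LocationID INTEGER, StartDate TEXT)")
conn.execute("CREATE TABLE Activities (EmployeeID INTEGER, ActivityID INTEGER, ActivityDate TEXT)")
conn.executemany(
    "INSERT INTO Locations VALUES (?, ?, ?)",
    [(1, 10, "2010-01-01"), (1, 11, "2012-01-01"), (1, 11, "2013-01-01")],
)
conn.executemany(
    "INSERT INTO Activities VALUES (?, ?, ?)",
    [(1, 1, "2010-02-01"), (1, 2, "2010-04-01"), (1, 3, "2014-06-06")],
)

# For each activity, join only the location row whose StartDate is the latest
# one on or before the activity date, for the same employee.
matched = conn.execute(
    """
    SELECT la.LocationID, la.StartDate, a.ActivityID, a.ActivityDate
    FROM Activities AS a
    JOIN Locations AS la
      ON la.EmployeeID = a.EmployeeID
     AND la.StartDate = (SELECT MAX(l2.StartDate)
                         FROM Locations AS l2
                         WHERE l2.EmployeeID = a.EmployeeID
                           AND l2.StartDate <= a.ActivityDate)
    ORDER BY a.ActivityID
    """
).fetchall()

for row in matched:
    print(row)
```

Activities 1 and 2 pair with location 10, and activity 3 pairs with the 2013 row of location 11, which matches the expected result in the question.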

Sort by most recent but keep together by another ID column

I am trying to sort rows while keeping related rows together (not really grouping).
In my sample data I would like to keep rows with the same DealerID together, sorted by IsPrimaryDealer DESC within each dealer, and order the dealers (OK, maybe it is grouping) so that the one with the most recent entry comes first.
Result set 2 is the closest, but Grant and his brother should be displayed as the first two rows, in that order. (Grant should be row 1 and Grants Brother row 2, because Grants Brother was the most recently added.)
CREATE TABLE #temp (
DealerPK int not null IDENTITY(1,1), DealerID int,
IsPrimaryDealer bit, DealerName varchar(50), DateAdded datetime
)
INSERT INTO #temp VALUES
(1, 1, 'Bob', GETDATE() - 7),
(2, 1, 'Robert', GETDATE() - 7),
(3, 1, 'Grant', GETDATE() - 7),
(3, 0, 'Grants Brother', GETDATE() - 1),
(2, 0, 'Roberts Nephew', GETDATE() - 2),
(1, 0, 'Bobs Cousin', GETDATE() - 3)
-- Data As Entered
SELECT * FROM #temp
-- Data Attempt at Row Numbering
SELECT *, intPosition =
ROW_NUMBER() OVER (PARTITION BY IsPrimaryDealer ORDER BY DealerID, IsPrimaryDealer DESC)
FROM #temp
ORDER BY DateAdded DESC
-- Data Attempt By DateAdded
SELECT *, intPosition =
ROW_NUMBER() OVER (PARTITION BY DealerID ORDER BY DateAdded DESC)
FROM #temp
ORDER BY intPosition, DateAdded
Expected Result
PK DID IsPr Name DateAdded
3 3 1 Grant 2015-10-08 17:14:26.497
4 3 0 Grants Brother 2015-10-14 17:14:26.497
2 2 1 Robert 2015-10-08 17:14:26.497
5 2 0 Roberts Nephew 2015-10-13 17:14:26.497
1 1 1 Bob 2015-10-08 17:14:26.497
6 1 0 Bobs Cousin 2015-10-12 17:14:26.497
As requested by OP:
;WITH Cte AS(
SELECT *,
mx = MAX(DateAdded) OVER(PARTITION BY DealerID) FROM #temp
)
SELECT *
FROM Cte
ORDER BY mx DESC, DealerID, IsPrimaryDealer DESC
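The ordering by a windowed MAX can be checked with fixed dates taken from the expected result. This is a minimal sketch using Python's sqlite3 module (an assumption; the table is named dealers here, and DealerPK is inserted explicitly instead of relying on IDENTITY).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dealers (DealerPK INTEGER, DealerID INTEGER, "
    "IsPrimaryDealer INTEGER, DealerName TEXT, DateAdded TEXT)"
)
conn.executemany(
    "INSERT INTO dealers VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, "Bob", "2015-10-08"),
        (2, 2, 1, "Robert", "2015-10-08"),
        (3, 3, 1, "Grant", "2015-10-08"),
        (4, 3, 0, "Grants Brother", "2015-10-14"),
        (5, 2, 0, "Roberts Nephew", "2015-10-13"),
        (6, 1, 0, "Bobs Cousin", "2015-10-12"),
    ],
)

# mx is the newest DateAdded within each dealer; sorting by it first keeps each
# dealer's rows together while putting the most recently touched dealer on top.
ordered = conn.execute(
    """
    SELECT DealerPK, DealerID, IsPrimaryDealer, DealerName
    FROM (
        SELECT *, MAX(DateAdded) OVER (PARTITION BY DealerID) AS mx
        FROM dealers
    )
    ORDER BY mx DESC, DealerID, IsPrimaryDealer DESC
    """
).fetchall()

names = [row[3] for row in ordered]
print(names)
```

DealerID stays in the ORDER BY as a tie-breaker, so two dealers with the same newest date do not interleave their rows.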
I hope I understood your question.
This query returns the expected output:
SELECT Row_number()
OVER (
PARTITION BY DealerID
ORDER BY DealerPK)RN,
DealerPK,
DealerID,
IsPrimaryDealer,
DealerName,
DateAdded
FROM #temp
ORDER BY DealerID DESC

SQL - Filter on dates X number of days apart from the previous

I have a table containing orders. I would like to select those orders that are a certain number of days apart for a specific client. For example, in the table below I would like to select all of the orders for CustomerID = 10 that are at least 30 days apart from the previous instance. With the starting point to be the first occurrence (07/05/2014 in this data).
OrderID | CustomerID | OrderDate
==========================================
1 10 07/05/2014
2 10 07/15/2014
3 11 07/20/2014
4 11 08/20/2014
5 11 09/21/2014
6 10 09/23/2014
7 10 10/15/2014
8 10 10/30/2014
I would want to select OrderIDs (1,6,8) since they are 30 days apart from each other and all from CustomerID = 10. OrderIDs 2 and 7 would not be included as they are within 30 days of the previous order for that customer.
What confuses me is how to set the "checkpoint" to the last valid date. Here is a little "pseudo" SQL.
SELECT OrderID
FROM Orders
WHERE CustomerID = 10
AND OrderDate > LastValidOrderDate + 30
I came here and saw that @SveinFidjestøl had already posted an answer, but after working on this for a while I could not stop myself from posting mine.
With the help of LAG and LEAD we can compare values of the same column across rows.
Per your question you are looking for orders 1, 6 and 8, so this might be helpful.
SQL Server 2012 and later
create table #temp
(orderid int,
customerid int,
orderDate date
);
insert into #temp values (1, 10, '07/05/2014')
insert into #temp values (2, 10, '07/15/2014')
insert into #temp values (3, 11, '07/20/2014')
insert into #temp values (4, 11, '08/20/2014')
insert into #temp values (5, 11, '09/21/2014')
insert into #temp values (6, 10, '09/23/2014')
insert into #temp values (7, 10, '10/15/2014')
insert into #temp values (8, 10, '10/30/2014');
with cte as
(SELECT orderid,customerid,orderDate,
LAG(orderDate) OVER (ORDER BY orderid ) PreviousValue,
LEAD(orderDate) OVER (ORDER BY orderid) NextValue,
rownum = ROW_NUMBER() OVER (ORDER BY orderid)
FROM #temp
WHERE customerid = 10)
select orderid,customerid,orderDate from cte
where DATEDIFF ( day , PreviousValue , orderDate) > 30
or PreviousValue is null or NextValue is null
SQL Server 2005 and later
WITH CTE AS (
SELECT
rownum = ROW_NUMBER() OVER (ORDER BY p.orderid),
p.orderid,
p.customerid,
p.orderDate
FROM #temp p
where p.customerid = 10)
SELECT CTE.orderid,CTE.customerid,CTE.orderDate,
prev.orderDate PreviousValue,
nex.orderDate NextValue
FROM CTE
LEFT JOIN CTE prev ON prev.rownum = CTE.rownum - 1
LEFT JOIN CTE nex ON nex.rownum = CTE.rownum + 1
where CTE.customerid = 10
and
DATEDIFF ( day , prev.orderDate , CTE.orderDate) > 30
or prev.orderDate is null or nex.orderDate is null
GO
You can use the LAG() function, available in SQL Server 2012, together with a Common Table Expression. You calculate the days between the customer's current order and the customer's previous order and then query the Common Table Expression using the filter >= 30
with cte as
(select OrderId
,CustomerId
,datediff(d
,lag(orderdate) over (partition by CustomerId order by OrderDate)
,OrderDate) DaysSinceLastOrder
from Orders)
select OrderId, CustomerId, DaysSinceLastOrder
from cte
where DaysSinceLastOrder >= 30 or DaysSinceLastOrder is null
Results:
OrderId CustomerId DaysSinceLastOrder
1 10 NULL
6 10 70
3 11 NULL
4 11 31
5 11 32
(Note that 1970-01-01 is chosen arbitrarily, you may choose any date)
Update
A slightly more reliable way of doing it involves a temporary table. The original table tbl can be left unchanged. See here:
CREATE TABLE #tmp (id int); -- set-up temp table
INSERT INTO #tmp VALUES (1); -- plant "seed": first oid
WHILE (@@ROWCOUNT>0)
INSERT INTO #tmp (id)
SELECT TOP 1 OrderId FROM tbl
WHERE OrderId>0 AND CustomerId=10
AND OrderDate>(SELECT max(OrderDate)+30 FROM tbl INNER JOIN #tmp ON id=OrderId)
ORDER BY OrderDate;
-- now list all found entries of tbl:
SELECT * FROM tbl WHERE EXISTS (SELECT 1 FROM #tmp WHERE id=OrderId)
@tinka shows how to use CTEs to do the trick, and the window functions (for 2012 and later) are probably the best answer. There is also the option, assuming you do not have a very large data set, to use a recursive CTE.
Example:
declare @customerid int = 10;
declare @temp table
(orderid int,
customerid int,
orderDate date
);
insert into @temp values (1, 10, '07/05/2014')
insert into @temp values (2, 10, '07/15/2014')
insert into @temp values (3, 11, '07/20/2014')
insert into @temp values (4, 11, '08/20/2014')
insert into @temp values (5, 11, '09/21/2014')
insert into @temp values (6, 10, '09/23/2014')
insert into @temp values (7, 10, '10/15/2014')
insert into @temp values (8, 10, '10/30/2014');
with datefilter AS
(
SELECT row_number() OVER(PARTITION BY CustomerId ORDER BY OrderDate) as RowId,
OrderId,
CustomerId,
OrderDate,
DATEADD(day, 30, OrderDate) as FilterDate
from @temp
WHERE CustomerId = @customerid
)
)
, firstdate as
(
SELECT RowId, OrderId, CustomerId, OrderDate, FilterDate
FROM datefilter
WHERE rowId = 1
union all
SELECT datefilter.RowId, datefilter.OrderId, datefilter.CustomerId,
datefilter.OrderDate, datefilter.FilterDate
FROM datefilter
join firstdate
on datefilter.CustomerId = firstdate.CustomerId
and datefilter.OrderDate > firstdate.FilterDate
WHERE NOT EXISTS
(
SELECT 1 FROM datefilter betweens
WHERE betweens.CustomerId = firstdate.CustomerId
AND betweens.orderdate > firstdate.FilterDate
AND datefilter.orderdate > betweens.orderdate
)
)
SELECT * FROM firstdate
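The "checkpoint" logic the question asks about is a running comparison against the last order that was kept, not against the immediately preceding order. The following minimal sketch makes that explicit by reading the rows in date order and looping in Python. It is not a translation of any of the queries above: sqlite3 stands in for SQL Server, the function name orders_spaced_apart is hypothetical, and "at least 30 days" is taken as a gap of 30 days or more.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER, OrderDate TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (1, 10, "2014-07-05"), (2, 10, "2014-07-15"), (3, 11, "2014-07-20"),
        (4, 11, "2014-08-20"), (5, 11, "2014-09-21"), (6, 10, "2014-09-23"),
        (7, 10, "2014-10-15"), (8, 10, "2014-10-30"),
    ],
)


def orders_spaced_apart(customer_id, min_gap_days=30):
    """Keep the first order, then each order at least min_gap_days after the last kept one."""
    rows = conn.execute(
        "SELECT OrderID, OrderDate FROM Orders "
        "WHERE CustomerID = ? ORDER BY OrderDate, OrderID",
        (customer_id,),
    ).fetchall()
    kept = []
    checkpoint = None  # date of the last order that was kept
    for order_id, order_date in rows:
        current = date.fromisoformat(order_date)
        if checkpoint is None or current - checkpoint >= timedelta(days=min_gap_days):
            kept.append(order_id)
            checkpoint = current  # move the checkpoint only when an order is kept
    return kept


selected = orders_spaced_apart(10)
print(selected)
```

For CustomerID 10 this keeps orders 1, 6 and 8. Order 7 is skipped because it is only 22 days after order 6, and order 8 is then measured against order 6 rather than against order 7.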

Resources