SQL Query Distinct two columns, Max(date) and retrieve ID - sql-server

I'm having trouble figuring out how to make this query work. I've tried everything under the sun to avoid looping.
The table has ID (pk), UserID, BookID, BookDate (datetime), and SellerID. There are duplicate combinations of UserID and BookID.
I am trying to retrieve distinct records by UserID and BookID that have the most recent BookDate. That's easy enough (below), but I also need to retrieve the ID and SellerID columns for the returned record. That's where I'm having trouble...
Select Distinct
UserID, CourseID, MAX(AssignedON)
From
AssignmentS
Group By
UserID, CourseID
Every time I add a join I get all records. I've tried rowover, exists and nothing seems to work. Any help would be greatly appreciated!

select userid,courseid,bookdate,sellerid from
(select userid,courseid,bookdate,sellerid,
row_number() over (partition by userid,courseid
order by bookdate desc) as RNUM
from yourtable where yourwhere) as t
where rnum = 1;

This blog post describes in detail how to do this with multiple tables: http://coding.feron.it/2012/08/mssql-having-maxid-id-problem-row_number-partition/

Figured it out. I just had to move things around a bit and this is working perfectly!
select userid,courseid,bookdate,sellerid from (
select *, row_number() over(partition by userid,courseid order by bookdate desc) as RNUM
from yourtable where yourwhere) as t
where rnum = 1;
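A minimal runnable sketch of the ROW_NUMBER approach above, written in Python against an in-memory SQLite database as a stand-in for SQL Server (this assumes SQLite 3.25 or later, which supports window functions). The table name Assignments and the sample rows are hypothetical; the column names follow the question's description. Unlike the queries above, it also returns the ID column the question asks for.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Assignments (
        ID INTEGER PRIMARY KEY, UserID INTEGER, BookID INTEGER,
        BookDate TEXT, SellerID INTEGER)
""")
# Hypothetical sample rows with duplicate (UserID, BookID) combinations.
conn.executemany(
    "INSERT INTO Assignments VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 100, "2020-01-01", 7),
        (2, 10, 100, "2020-03-01", 8),
        (3, 10, 200, "2020-02-01", 7),
        (4, 20, 100, "2020-01-15", 9),
        (5, 20, 100, "2020-01-10", 7),
    ],
)

# Number the rows inside each (UserID, BookID) group, newest first,
# then keep only row 1 of each group.
latest = conn.execute("""
    SELECT ID, UserID, BookID, BookDate, SellerID
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY UserID, BookID
                                    ORDER BY BookDate DESC) AS rnum
          FROM Assignments) AS ranked
    WHERE rnum = 1
    ORDER BY UserID, BookID
""").fetchall()

for row in latest:
    print(row)
```

Because the whole row is numbered inside the derived table, any column (ID, SellerID, and so on) can be selected from the winning row without a further join.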

Related

Create a temporary table showing the most eventful country for each year in SQL Server

I have an exercise in SQL Server: I have two tables Country and Events.
The Events table holds the event details including the city where an event happens. The table Events has a foreign key CountryID (CountryID is the primary key in table Country).
I need to create a temporary table showing the most eventful country for each year.
Any help would be appreciated
Thanks
You weren't far off with your attempt, but you need to use a CTE to aggregate your data first. I've assumed that the final order of your data is important, so I used a second CTE, rather than a TOP 1 WITH TIES, to get the final result:
WITH CTE AS(
SELECT YEAR(e.EventDate) AS YearOfEvent,
c.CountryName,
COUNT(e.CountryID) AS NumberOfEvents
FROM [dbo].[tblEvent] AS e
INNER JOIN tblCountry AS c ON e.CountryID = c.CountryID
GROUP BY e.CountryId,
c.CountryName,
YEAR(e.EventDate)),
RNs AS(
SELECT YearOfEvent,
CountryName,
NumberOfEvents,
ROW_NUMBER() OVER (PARTITION BY YearOfEvent ORDER BY CTE.NumberOfEvents DESC) AS RN
FROM CTE)
SELECT YearOfEvent,
CountryName,
NumberOfEvents
FROM RNs
WHERE RN = 1
ORDER BY RNs.YearOfEvent ASC;
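The query above returns the result set but does not store it in a temporary table, which the exercise asks for. In SQL Server that is usually done by adding INTO #SomeName after the final select list. The sketch below shows the same two-CTE pattern end to end in Python with an in-memory SQLite database as a stand-in (an assumption: SQLite 3.25 or later). SQLite has no YEAR() function or #temp tables, so strftime and CREATE TEMP TABLE ... AS are used instead. The table name MostEventful and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCountry (CountryID INTEGER PRIMARY KEY, CountryName TEXT);
    CREATE TABLE tblEvent (EventID INTEGER PRIMARY KEY, EventDate TEXT,
                           CountryID INTEGER REFERENCES tblCountry(CountryID));
    -- Hypothetical sample data.
    INSERT INTO tblCountry VALUES (1, 'France'), (2, 'Japan');
    INSERT INTO tblEvent VALUES
        (1, '2001-03-01', 1), (2, '2001-06-01', 1), (3, '2001-09-01', 2),
        (4, '2002-01-01', 2), (5, '2002-05-01', 2), (6, '2002-07-01', 1);

    -- Aggregate per year and country, rank within each year,
    -- and store the top-ranked rows in a temporary table.
    CREATE TEMP TABLE MostEventful AS
    WITH counts AS (
        SELECT strftime('%Y', e.EventDate) AS YearOfEvent,
               c.CountryName,
               COUNT(*) AS NumberOfEvents
        FROM tblEvent AS e
        JOIN tblCountry AS c ON e.CountryID = c.CountryID
        GROUP BY strftime('%Y', e.EventDate), c.CountryName
    ), ranked AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY YearOfEvent
                                     ORDER BY NumberOfEvents DESC) AS rn
        FROM counts
    )
    SELECT YearOfEvent, CountryName, NumberOfEvents
    FROM ranked
    WHERE rn = 1;
""")

most_eventful = conn.execute(
    "SELECT * FROM MostEventful ORDER BY YearOfEvent").fetchall()
print(most_eventful)
```

ROW_NUMBER picks one arbitrary country when two are tied in a year; RANK in place of ROW_NUMBER would keep all tied countries.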

StackExchange Query Help t-sql

Would anybody be able to help me with this exercise? I am used to querying PostgreSQL, not T-SQL, and I am running into trouble with how some of my data aggregates.
My assignment requires me to:
Create a query that returns the number of comments made on each day for each post from the top 50 most commented on posts in the past year.
For example, this query below is giving me a non aggregated result set:
select cast(creationdate as date),
postid,
count(id)
from comments
where postid = 17654496
group by creationdate, postid
The schema is all here
https://data.stackexchange.com/stackoverflow/query/edit/898297
You can use a CTE to get the comment count per post per day,
then use the ROW_NUMBER window function to number the rows in descending order of that count.
;with CTE as (
select cast(creationdate as date) dt,
postid,
count(id) cnt
from comments
WHERE creationdate between dateadd(year,-1,getdate()) and getdate()
group by cast(creationdate as date), postid
), CTE2 AS (
select *,ROW_NUMBER() OVER (order by cnt desc) rn
from CTE
)
SELECT *
FROM CTE2
WHERE rn <=50
https://data.stackexchange.com/stackoverflow/query/898322/test

Get the id of the row with the max value with two grouping

We have a data structure with four columns:
ContractoreName, ProjectCode, InvoiceID, OrderID
We want to group the data by both ContractoreName and ProjectCode columns, and then get the InvoiceID of the row for each group with MAX(OrderID).
You could use ROW_NUMBER:
SELECT ContractorName, ProjectName, OrderId, InvoiceId
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY ContractorName, ProjectName
ORDER BY OrderId DESC) AS rn
FROM tab
) AS sub
WHERE rn = 1;
ROW_NUMBER() is what I would call the canonical solution. In many cases, an old-fashioned solution has better performance:
select t.*
from t
where t.orderid = (select max(t2.orderid)
from t t2
where t2.contractorname = t.contractorname and
t2.projectname = t.projectname
);
This is especially true if there is an index on (contractorname, projectname, orderid).
Why is this faster? Basically, SQL Server can scan the table doing a lookup in an index. The lookup is really fast because the index is designed for it, so the scan is just a little faster than a full table scan.
When using row_number(), SQL Server has to scan the table to calculate the row number (and that can use the index, so it might be fast). But then it has to go back to the table to fetch the columns and apply the where clause. So, even if it uses an index, it is doing more work.
EDIT:
I should also point out that this can be done without a subquery:
select distinct contractorname, projectname,
max(orderid) over (partition by contractorname, projectname) as latest_order,
first_value(invoiceid) over (partition by contractorname, projectname order by orderid desc) as latest_invoice
from t;
Unfortunately, SQL Server doesn't offer first_value() as an aggregation function, but you can use select distinct and get the same effect.
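To check that the correlated-subquery form and the SELECT DISTINCT window form agree, here is a small sketch in Python using an in-memory SQLite database as a stand-in for SQL Server (an assumption: SQLite 3.25 or later). The sample rows and the index name ix_t are hypothetical. It says nothing about relative performance, which would have to be measured on the real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE t (contractorname TEXT, projectname TEXT,
                                invoiceid INTEGER, orderid INTEGER)""")
# Hypothetical sample rows.
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("Acme", "P1", 501, 1),
    ("Acme", "P1", 502, 3),
    ("Acme", "P2", 503, 2),
    ("Bolt", "P1", 504, 5),
    ("Bolt", "P1", 505, 4),
])
# The index the answer recommends for the correlated subquery.
conn.execute("CREATE INDEX ix_t ON t (contractorname, projectname, orderid)")

# Form 1: keep the rows whose orderid equals the maximum of their group.
correlated = conn.execute("""
    SELECT o.contractorname, o.projectname, o.orderid, o.invoiceid
    FROM t AS o
    WHERE o.orderid = (SELECT MAX(t2.orderid)
                       FROM t AS t2
                       WHERE t2.contractorname = o.contractorname
                         AND t2.projectname = o.projectname)
    ORDER BY o.contractorname, o.projectname
""").fetchall()

# Form 2: window functions plus DISTINCT, with no subquery.
windowed = conn.execute("""
    SELECT DISTINCT contractorname, projectname,
           MAX(orderid) OVER (PARTITION BY contractorname, projectname)
               AS latest_order,
           FIRST_VALUE(invoiceid) OVER (PARTITION BY contractorname, projectname
                                        ORDER BY orderid DESC)
               AS latest_invoice
    FROM t
    ORDER BY contractorname, projectname
""").fetchall()

print(correlated)
print(windowed)
```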

Unique vs MAX in SQL statement

I have a table with three columns:
PERSON
VISITOR
DATE
The table is basically a transactional table. The following is true:
There are multiple rows per person
There are multiple rows per visitor
There are multiple rows of a given person/visitor combination.
Assumed unique person/date combination
What I need is
I want visitor for each Person's MAX Date.
I cannot have multiple persons in the output.
Person must be unique.
visitor may repeat.
I have tried:
SELECT
ROW_NUMBER() OVER (PARTITION BY PERSON, VISITOR ORDER BY Date DESC) row_num,
PERSON,
VISITOR as VISITOR
FROM
`TABLE`
ORDER BY
PERSON
Maybe this; I'm not sure I fully understand the question. Sample data and expected results would help.
You said you want only one row per person, holding the visitor from that person's max date. The row with row_num = 1 is the record with the max date, and since we partition by person only, it does not matter if person A had 3 visitors: only the person and their most recent visitor will be listed.
WITH cte as (
SELECT ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY Date DESC) row_num
, PERSON
, VISITOR as VISITOR
FROM `TABLE`)
SELECT *
FROM cte
WHERE row_Num = 1
I think this can be done with a CROSS APPLY too, though I'm not as good at using them yet:
SELECT P.Person, C.Visitor, C.Date
FROM (SELECT DISTINCT Person FROM TABLE) P
CROSS APPLY (SELECT TOP 1 B.Visitor, B.Date
FROM TABLE B
WHERE B.Person = P.Person
ORDER BY B.Date DESC) C
Essentially, the inner query runs once for each person from the outer query and returns only its top record, which is the one with the newest date.
select a.* from myTable as a inner join (
SELECT person, max(date) as maxDate from myTable group by person
) as b
on a.date = b.maxDate
and a.person = b.person;
I am weak in reading and writing English.
In my opinion the answer may be:
SELECT `PERSON`, `VISITOR`, MAX(`DATE`) AS `DATE`
FROM `TABLE`
GROUP BY `PERSON`, `VISITOR`;
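The answers above do not all return the same thing. The sketch below, in Python with an in-memory SQLite database (an assumed stand-in for the asker's engine), contrasts the join on MAX(date) per person with the GROUP BY on person and visitor. The table name visits, the column name visit_date, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (person TEXT, visitor TEXT, visit_date TEXT)")
# Hypothetical sample rows; each (person, visit_date) pair is unique.
conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", [
    ("A", "x", "2021-01-01"),
    ("A", "y", "2021-02-01"),
    ("A", "x", "2021-03-01"),
    ("B", "y", "2021-01-05"),
    ("B", "z", "2021-01-02"),
])

# Join each row to its person's latest date: one row per person.
per_person = conn.execute("""
    SELECT a.person, a.visitor, a.visit_date
    FROM visits AS a
    JOIN (SELECT person, MAX(visit_date) AS max_date
          FROM visits
          GROUP BY person) AS b
      ON a.person = b.person AND a.visit_date = b.max_date
    ORDER BY a.person
""").fetchall()

# Grouping by person AND visitor: one row per pair, so persons repeat.
per_pair = conn.execute("""
    SELECT person, visitor, MAX(visit_date) AS visit_date
    FROM visits
    GROUP BY person, visitor
    ORDER BY person, visitor
""").fetchall()

print(per_person)
print(per_pair)
```

With this data the join returns one row per person, while the GROUP BY on both columns returns one row per person/visitor pair, so it does not satisfy the "Person must be unique" requirement.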

Selecting first record for each FK provided in where clause

To make things a bit simpler I have created a hypothetical scenario around my actual problem. Take the following schema for example:
Let's say I am wanting to do all of the following in order, in one query
Order all audit records by DateUpdated DESC
SELECT only the first record (so as to get the most recent audit)
(Here is the catch) select the first record for each ClientId supplied
This is pretty simple if I were trying to get the first record for a single client. Then it would simply be something like:
SELECT TOP 1 Id, ClientId, Data, DateUpdated
FROM AuditRecord
WHERE ClientId = xxx
ORDER BY DateUpdated DESC
However, my where clause is actually going to be an IN statement
SELECT TOP 1 Id, ClientId, Data, DateUpdated
FROM AuditRecord
WHERE ClientId IN (a,b,c,d,e,f,g,h,i,j,k,l)
ORDER BY DateUpdated DESC
How can I select the first (most recent) record for each client id supplied in my IN clause without using a loop?
As @APH pointed out, you can make use of window functions, but this is a perfect application for a cross/outer apply operation as well:
SELECT t.Id, t.ClientId, t.Data, t.DateUpdated
FROM AuditRecord t
cross apply (
select top 1 c.Id
from AuditRecord c
where c.ClientId = t.ClientId
order by c.DateUpdated desc
) latest
WHERE t.ClientId IN (a,b,c,d,e,f,g,h,i,j,k,l)
AND t.Id = latest.Id
You'd have to test in your environment to see which makes more sense/is more efficient, but in most scenarios, an APPLY operation will outperform a window function used for this same scenario.
Here's an example using a partition and subquery. Row_number will assign sequential integers to each row, starting with the highest DateUpdated, and starting over at 1 for each new ClientID. You could also do this with a CTE or temp table if you prefer.
Select Id, ClientId, Data, DateUpdated from
(SELECT Id, ClientId, Data, DateUpdated
, row_number() over (partition by ClientID Order by DateUpdated Desc) as RN
FROM AuditRecord WHERE ClientId IN (a,b,c,d,e,f,g,h,i,j,k,l)) a
where RN = 1
Something like SELECT MAX(DateUpdated), ClientId from AuditRecord Group by ClientID will give you the latest DateUpdated for each ClientID.
This gives you every client record with its most recent audit data field (or nulls if none exist).
select *
from Client
outer apply
(
select top 1 AuditRecord.*
from AuditRecord
where AuditRecord.ClientID = Client.ID
order by DateUpdated desc
) TopAuditRecord
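SQLite has no APPLY operator, so the following Python sketch imitates the OUTER APPLY result with a LEFT JOIN onto a ROW_NUMBER-ranked derived table, and passes the client ids for the IN list as parameters. It is an approximation under stated assumptions, not the SQL Server query itself: the Client table's columns and all sample rows are hypothetical, and SQLite 3.25 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Client (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE AuditRecord (Id INTEGER PRIMARY KEY, ClientId INTEGER,
                              Data TEXT, DateUpdated TEXT);
    -- Hypothetical sample data; client 3 has no audit records.
    INSERT INTO Client VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO AuditRecord VALUES
        (1, 1, 'old',  '2022-01-01'),
        (2, 1, 'new',  '2022-02-01'),
        (3, 2, 'only', '2022-01-15');
""")

client_ids = [1, 2, 3]
placeholders = ", ".join("?" for _ in client_ids)  # one "?" per id

# LEFT JOIN keeps clients without audits (NULL columns), like OUTER APPLY.
rows = conn.execute(f"""
    SELECT c.Id, r.Id, r.Data, r.DateUpdated
    FROM Client AS c
    LEFT JOIN (SELECT *,
                      ROW_NUMBER() OVER (PARTITION BY ClientId
                                         ORDER BY DateUpdated DESC) AS rn
               FROM AuditRecord) AS r
      ON r.ClientId = c.Id AND r.rn = 1
    WHERE c.Id IN ({placeholders})
    ORDER BY c.Id
""", client_ids).fetchall()

for row in rows:
    print(row)
```

Changing LEFT JOIN to an inner JOIN drops clients without audit records, which corresponds to CROSS APPLY rather than OUTER APPLY.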
