Unique vs MAX in SQL statement - sql-server

I have a table with three columns:
PERSON
VISITOR
DATE
The table is basically a transactional table. The following is true:
There are multiple rows per person
There are multiple rows per visitor
There are multiple rows of a given person/visitor combination.
Each person/date combination is assumed to be unique.
What I need is
I want the visitor for each person's MAX date.
I cannot have multiple persons in the output.
Person must be unique.
visitor may repeat.
I have tried:
SELECT
ROW_NUMBER() OVER (PARTITION BY PERSON, VISITOR ORDER BY Date DESC) row_num,
PERSON,
VISITOR as VISITOR
FROM
`TABLE`
ORDER BY
PERSON

Maybe this? I'm not sure I fully understand the question; sample data and expected results would help.
You said you want one row per person, showing the visitor on that person's max date, so the row with row_num = 1 is the record with the max date. Because we partition by PERSON only, it doesn't matter if person A had 3 visitors: only each person and their most recent visitor are listed.
WITH cte as (
SELECT ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY Date DESC) row_num
, PERSON
, VISITOR as VISITOR
FROM [TABLE])
SELECT *
FROM cte
WHERE row_Num = 1
I think this can be done with a CROSS APPLY too, though I'm not as good at using them yet:
SELECT C.Person, C.Visitor, C.Date
FROM (SELECT DISTINCT Person FROM [TABLE]) A
CROSS APPLY (SELECT TOP 1 *
FROM [TABLE] B
WHERE B.Person = A.Person
ORDER BY B.Date DESC) C
Essentially, the inner query runs once for each row of the outer query and, because of TOP 1 with ORDER BY Date DESC, returns only the top row, i.e. the one with the newest date.
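To check the ROW_NUMBER() approach above on actual rows, here is a minimal sketch that runs the same CTE against an in-memory SQLite database from Python. SQLite is used instead of SQL Server purely so the example is self-contained; the table name visits, the column visit_date and the sample rows are made up for illustration.

```python
import sqlite3

def latest_visitor_per_person(rows):
    """Return (person, visitor, visit_date) for each person's most recent row."""
    con = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the PERSON / VISITOR / DATE table in the question.
    con.execute("CREATE TABLE visits (person TEXT, visitor TEXT, visit_date TEXT)")
    con.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    # Window functions need SQLite 3.25 or newer.
    query = """
        WITH cte AS (
            SELECT person, visitor, visit_date,
                   ROW_NUMBER() OVER (PARTITION BY person
                                      ORDER BY visit_date DESC) AS row_num
            FROM visits
        )
        SELECT person, visitor, visit_date
        FROM cte
        WHERE row_num = 1
        ORDER BY person
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Made-up sample: person A has three visits by two visitors, person B has one.
sample = [
    ("A", "x", "2020-01-01"),
    ("A", "y", "2020-01-03"),
    ("A", "x", "2020-01-02"),
    ("B", "x", "2020-01-05"),
]
print(latest_visitor_per_person(sample))
```

Each person appears exactly once, and a visitor (here x) can still repeat across persons.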

select a.* from myTable as a inner join (
SELECT person, max(date) as maxDate from myTable group by person
) as b
on a.date = b.maxDate
and a.person = b.person;

I am weak in reading and writing English.
In my opinion the answer may be:
SELECT `PERSON`, `VISITOR`, MAX(`DATE`) AS `DATE`
FROM `TABLE`
GROUP BY `PERSON`, `VISITOR`;
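Note that grouping by both PERSON and VISITOR produces one row per person/visitor pair, so a person with several visitors appears more than once, which conflicts with the requirement that Person be unique. The sketch below counts the output rows per person for each grouping, assuming made-up data in an in-memory SQLite database (the table visits and the column visit_date are hypothetical names):

```python
import sqlite3

# Made-up sample: person A was visited by x and y, person B only by x.
SAMPLE_VISITS = [
    ("A", "x", "2020-01-01"),
    ("A", "y", "2020-01-03"),
    ("A", "x", "2020-01-02"),
    ("B", "x", "2020-01-05"),
]

def output_rows_per_person(group_by):
    """Count how many result rows each person gets for a given GROUP BY list."""
    # Only two fixed groupings are allowed because the text is put into the SQL.
    if group_by not in ("person", "person, visitor"):
        raise ValueError("unsupported grouping")
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE visits (person TEXT, visitor TEXT, visit_date TEXT)")
    con.executemany("INSERT INTO visits VALUES (?, ?, ?)", SAMPLE_VISITS)
    query = f"""
        SELECT person, COUNT(*) AS n
        FROM (SELECT {group_by}, MAX(visit_date) AS visit_date
              FROM visits
              GROUP BY {group_by}) AS grouped
        GROUP BY person
        ORDER BY person
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

print(output_rows_per_person("person, visitor"))  # person A shows up twice
print(output_rows_per_person("person"))           # one row per person
```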

Related

SQL Project using a where clause

So this is what I am working with. I'm new to SQL and still learning, and I've been stuck on this for a few days now. Any advice would be appreciated. I attached an image of the goal I'm trying to achieve.
OrderItem And Product Table
Order And OrderItem Table(https://i.stack.imgur.com/pdbMT.png)
Scenario: Our boss would like to see the OrderNumber, OrderDate, ProductName, UnitPrice and Quantity for products that have TotalAmounts larger than the average.
Create a query with a subquery in the WHERE clause. OrderNumber, OrderDate and TotalAmount come from the Order table. ProductName comes from the Product table. UnitPrice and Quantity come from the OrderItem table.
This is the code I came up with but it causes product name to run endlessly and displays wrong info.
USE TestCorp;
SELECT DISTINCT OrderNumber,
OrderDate,
ProductName,
i.UnitPrice,
Quantity,
TotalAmount
FROM [Order], Product
JOIN OrderItem i ON Product.UnitPrice = i.UnitPrice
WHERE TotalAmount < ( SELECT AVG(TotalAmount)
FROM [Order]
)
ORDER BY TotalAmount DESC;
Best guess, assuming the joins and fields that were not provided:
SELECT O.OrderNumber, O.orderDate, P.ProductName, OI.UnitPrice, OI.Quantity, O.TotalAmount
FROM [Order] O
INNER JOIN OrderItem OI
on O.ID = OI.orderID
INNER JOIN Product P
on P.ID= OI.ProductID
CROSS JOIN (SELECT avg(TotalAmount) AvgTotalAmount FROM [Order]) z
WHERE O.TotalAmount > z.AvgTotalAmount
Notes:
You're mixing join notations. Don't use the comma (,) and INNER JOIN together: that mixes the older comma-join syntax (ANSI-89) with the explicit JOIN syntax (ANSI-92).
I'm not sure why you have a cross join to Product to begin with.
You don't specify how to join Order to OrderItem.
It seems very odd to be joining on price. Join on order ID or product ID, maybe?
You could cross join to an "average" result so it's available on every record. (I aliased this inline view "z" in my attempt.)
So what the above does is include all orders. For each order, an order item must be associated for it to be included. Then, for each order item, a ProductID must be present and related to a record in Product. If for some reason an order item record doesn't have a related entry in the Product table, it gets excluded.
I use a cross join to get the average because it's executed once and applied/joined to every record.
If we put the query in the WHERE clause, it may be executed once for EVERY record (unless the DB engine's optimizer figures it out and generates a better plan).
I assume:
Order.ID relates to OrderItem.OrderID
OrderItem.productID relates to Product.ID
Order.TotalAmount is what we are wanting to "Average" and compare against
Every Order has an Order Item entry
Every Order Item entry has a related product.
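Since the exercise explicitly asks for a subquery in the WHERE clause, the same joins can also be written that way. The following is a minimal sketch under the same assumptions as the list above, run against an in-memory SQLite database from Python so it is self-contained. The snake_case table and column names (orders instead of [Order], order_item, product) and all sample rows are made up for illustration.

```python
import sqlite3

def make_sample_db():
    """Build a tiny in-memory database with hypothetical order data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, order_number TEXT,
                             order_date TEXT, total_amount REAL);
        CREATE TABLE product (id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE order_item (order_id INTEGER, product_id INTEGER,
                                 unit_price REAL, quantity INTEGER);
        INSERT INTO orders VALUES (1, 'O-1', '2020-01-01', 100),
                                  (2, 'O-2', '2020-01-02', 300),
                                  (3, 'O-3', '2020-01-03', 200);
        INSERT INTO product VALUES (10, 'Widget'), (11, 'Gadget');
        INSERT INTO order_item VALUES (1, 10, 50, 2),
                                      (2, 10, 50, 2),
                                      (2, 11, 50, 4),
                                      (3, 11, 50, 4);
    """)
    return con

def orders_above_average(con):
    """List order lines whose order total is above the average order total."""
    query = """
        SELECT o.order_number, o.order_date, p.product_name,
               oi.unit_price, oi.quantity, o.total_amount
        FROM orders AS o
        INNER JOIN order_item AS oi ON oi.order_id = o.id
        INNER JOIN product AS p ON p.id = oi.product_id
        -- Uncorrelated scalar subquery: the average over all orders.
        WHERE o.total_amount > (SELECT AVG(total_amount) FROM orders)
        ORDER BY o.order_number, p.product_name
    """
    return con.execute(query).fetchall()

for row in orders_above_average(make_sample_db()):
    print(row)
```

With these rows the average total is 200, so only order O-2 qualifies, and it is listed once per order item.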

Why is Rank() OVER PARTITION BY returning too many results

I want the results of my query to be the top 3 newest, distinct Campaign Names for each Campaign Type.
My query at the moment is:
DECLARE @currentRecord varchar(160);
SET @currentRecord = '316827D2-B522-E811-816A-0050569FE3BD';
SELECT DISTINCT
rs.CampaignName,
rs.CampaignType,
rs.receivedon,
rs.Rank
FROM
(SELECT
fs_retentioncontact,
receivedon,
regardingobjectidname AS CampaignName,
fs_campaignresponsetypename AS CampaignType,
RANK() OVER (PARTITION BY fs_campaignresponsetypename, regardingobjectidname
ORDER BY receivedon DESC) AS Rank
FROM
dbo.FilteredCampaignResponse) rs
INNER JOIN
dbo.FilteredContact ON rs.fs_retentioncontact = dbo.FilteredContact.contactid
WHERE
(dbo.FilteredContact.parentcustomerid IN (@currentRecord))
AND Rank <= 3
ORDER BY
CampaignType, receivedon DESC;
There may be multiple results for each campaign name as well as campaign response because they are linked to individual contacts but I only want to see the 3 latest unique campaigns for each campaign type.
My query is not partitioning by each individual campaign response type (there are 6 different ones) as I was expecting. If I remove the regardingobjectidname from the PARTITION BY I only get a single row in the results when I should be getting 18 rows. This particular company has over 700 campaign responses across the 6 campaign types.
My query is returning 102 rows so it seems to be removing duplicates on campaign name which is part of what I need but not the whole story.
I have read quite a few posts regarding rank() on here e.g.
how-to-use-rank-in-sql-server
using-sql-rank-for-overall-rank-and-rank-within-a-group
but I am not able to work out what I am doing wrong from their examples. Could it be the positioning of 'receivedon' in the ORDER BY, or something else?
I have finally worked out from reading a post on another site how to get the top 3 of each group. I shall post my answer in case it helps anyone else.
I had to use ROW_NUMBER() OVER (PARTITION BY instead of RANK() OVER (PARTITION BY and I also moved the INNER JOIN and WHERE clause (to filter for the correct company) from the outer query to the inner query.
DECLARE @currentRecord varchar(160)
SET @currentRecord='316827D2-B522-E811-816A-0050569FE3BD'
SELECT distinct rs.CampaignName
,rs.CampaignType
, rs.receivedon
,RowNum
FROM(
SELECT fs_retentioncontact
, receivedon
, regardingobjectidname AS CampaignName
,fs_campaignresponsetypename as CampaignType
,ROW_NUMBER() OVER (PARTITION BY fs_campaignresponsetypename ORDER BY fs_campaignresponsetypename, receivedon DESC) AS RowNum
FROM FilteredCampaignResponse
INNER JOIN dbo.FilteredContact ON fs_retentioncontact = dbo.FilteredContact.contactid
WHERE(dbo.FilteredContact.parentcustomerid IN (@currentRecord)))rs
WHERE RowNum <=3
ORDER BY CampaignType,receivedon DESC;
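For context: with PARTITION BY on both the type and the campaign name, the ranking restarts for every campaign, so up to three rows per campaign survive the Rank <= 3 filter. One possible reading of "top 3 newest, distinct campaign names per type" is to collapse the responses to one row per campaign first and number those rows within each type. The sketch below illustrates that reading on an in-memory SQLite database from Python; the table responses, its columns and the sample rows are hypothetical, not the CRM views from the question.

```python
import sqlite3

def newest_campaigns_per_type(rows, top_n=3):
    """Return the top_n most recently answered distinct campaigns per type."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE responses "
                "(campaign_type TEXT, campaign_name TEXT, received_on TEXT)")
    con.executemany("INSERT INTO responses VALUES (?, ?, ?)", rows)
    query = """
        WITH latest AS (
            -- One row per campaign, carrying its newest response date.
            SELECT campaign_type, campaign_name,
                   MAX(received_on) AS last_received
            FROM responses
            GROUP BY campaign_type, campaign_name
        ),
        ranked AS (
            SELECT campaign_type, campaign_name, last_received,
                   ROW_NUMBER() OVER (PARTITION BY campaign_type
                                      ORDER BY last_received DESC) AS rn
            FROM latest
        )
        SELECT campaign_type, campaign_name, last_received
        FROM ranked
        WHERE rn <= ?
        ORDER BY campaign_type, last_received DESC
    """
    result = con.execute(query, (top_n,)).fetchall()
    con.close()
    return result

# Made-up sample: campaign c1 has two responses, Email has four campaigns.
sample_responses = [
    ("Email", "c1", "2020-01-05"),
    ("Email", "c1", "2020-01-04"),
    ("Email", "c2", "2020-01-03"),
    ("Email", "c3", "2020-01-02"),
    ("Email", "c4", "2020-01-01"),
    ("Phone", "p1", "2020-02-01"),
    ("Phone", "p1", "2020-01-20"),
]
for row in newest_campaigns_per_type(sample_responses):
    print(row)
```

Campaigns that share the same newest date within a type would be ordered arbitrarily by ROW_NUMBER(); a tie-breaker column in the ORDER BY would make that deterministic.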

Create a temporary table showing the most eventful country for each year in SQL Server

I have an exercise in SQL Server: I have two tables Country and Events.
The Events table holds the event details including the city where an event happens. The table Events has a foreign key CountryID (CountryID is the primary key in table Country).
I need to create a temporary table showing the most eventful country for each year.
Any help would be appreciated
Thanks
You weren't far off with your attempt, but you need to use a CTE to aggregate your data first. I've assumed that the final order of your data is important, so I used a second CTE, rather than a TOP 1 WITH TIES, to get the final result:
WITH CTE AS(
SELECT YEAR(e.EventDate) AS YearOfEvent,
c.CountryName,
COUNT(e.CountryID) AS NumberOfEvents
FROM [dbo].[tblEvent] AS e
INNER JOIN tblCountry AS c ON e.CountryID = c.CountryID
GROUP BY e.CountryId,
c.CountryName,
YEAR(e.EventDate)),
RNs AS(
SELECT YearOfEvent,
CountryName,
NumberOfEvents,
ROW_NUMBER() OVER (PARTITION BY YearOfEvent ORDER BY CTE.NumberOfEvents DESC) AS RN
FROM CTE)
SELECT YearOfEvent,
CountryName,
NumberOfEvents
FROM RNs
WHERE RN = 1
ORDER BY RNs.YearOfEvent ASC;
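The query above returns the rows but does not store them in a temporary table. In SQL Server the usual way to do that is SELECT ... INTO #SomeTempTable. As a self-contained sketch of the same idea, the example below uses SQLite from Python, where the equivalent is CREATE TEMP TABLE ... AS and the year is extracted with strftime instead of YEAR(). The temp table name MostEventful, the reduced column lists and the sample rows are assumptions made for illustration.

```python
import sqlite3

def most_eventful_country_per_year(countries, events):
    """Fill a temp table with the top country per year and return its rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tblCountry "
                "(CountryID INTEGER PRIMARY KEY, CountryName TEXT)")
    con.execute("CREATE TABLE tblEvent (EventDate TEXT, CountryID INTEGER)")
    con.executemany("INSERT INTO tblCountry VALUES (?, ?)", countries)
    con.executemany("INSERT INTO tblEvent VALUES (?, ?)", events)
    con.execute("""
        CREATE TEMP TABLE MostEventful AS
        WITH counts AS (
            SELECT strftime('%Y', e.EventDate) AS YearOfEvent,
                   c.CountryName,
                   COUNT(*) AS NumberOfEvents
            FROM tblEvent AS e
            INNER JOIN tblCountry AS c ON e.CountryID = c.CountryID
            GROUP BY strftime('%Y', e.EventDate), c.CountryID, c.CountryName
        ),
        ranked AS (
            SELECT YearOfEvent, CountryName, NumberOfEvents,
                   ROW_NUMBER() OVER (PARTITION BY YearOfEvent
                                      ORDER BY NumberOfEvents DESC) AS RN
            FROM counts
        )
        SELECT YearOfEvent, CountryName, NumberOfEvents
        FROM ranked
        WHERE RN = 1
    """)
    # The temp table lives as long as this connection does.
    result = con.execute(
        "SELECT YearOfEvent, CountryName, NumberOfEvents "
        "FROM MostEventful ORDER BY YearOfEvent"
    ).fetchall()
    con.close()
    return result

# Made-up sample data.
sample_countries = [(1, "UK"), (2, "France")]
sample_events = [
    ("2019-03-01", 1), ("2019-06-01", 1), ("2019-07-01", 2),
    ("2020-01-10", 2), ("2020-02-10", 2), ("2020-03-10", 1),
]
print(most_eventful_country_per_year(sample_countries, sample_events))
```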

T-SQL: GROUP BY, but while keeping a non-grouped column (or re-joining it)?

I'm on SQL Server 2008, and having trouble querying an audit table the way I want to.
The table shows every time a new ID comes in, as well as every time an ID's Type changes:
Record #  ID     Type  Date
1         ae08k  M     2017-01-02:12:03
2         liei0  A     2017-01-02:12:04
3         ae08k  C     2017-01-02:13:05
4         we808  A     2017-01-03:20:05
I'd kinda like to produce a snapshot of the status for each ID, at a certain date. My thought was something like this:
SELECT
ID
,max(date) AS Max
FROM
Table
WHERE
Date < 'whatever-my-cutoff-date-is-here'
GROUP BY
ID
But that loses the Type column. If I add the Type column to my GROUP BY, then I'd naturally get duplicate rows per ID, for all the types it had before the date.
So I was thinking of running a second version of the table (via a common table expression), and left joining that in to get the Type.
On my query above, all I have to join to are the ID & Date. Somehow if the dates are too close together, I end up with duplicate results (like say above, ae08k would show up once for each Type). That or I'm just super confused.
Basically all I ever do in SQL are left joins, group bys, and common table expressions (to then left join). What am I missing that I'd need in this situation...?
Use row_number()
select *
from ( select *
, row_number() over (partition by id order by date desc) as rn
from table
WHERE Date < 'whatever-my-cutoff-date-is-here'
) tt
where tt.rn = 1
I'd kinda like know how many IDs are of each type, at a certain date.
Well, for that you use COUNT and GROUP BY on Type:
SELECT Type, COUNT(ID)
FROM Table
WHERE Date < 'whatever-your-cutoff-date-is-here'
GROUP BY Type
Based on your comment under Zohar Peled's answer, you are probably looking for something like this:
; with cte as (select distinct ID from Table where Date < '$param')
select [data].*, [data2].[count]
from cte
cross apply
( select top 1 *
from Table
where Table.ID = cte.ID
and Table.Date < '$param'
order by Table.Date desc
) as [data]
cross apply
( select count(1) as [count]
from Table
where Table.ID = cte.ID
and Table.Date < '$param'
) as [data2]
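To tie the answers together, here is a small sketch that loads the four sample rows from the question into an in-memory SQLite database and computes both the snapshot (latest Type per ID before a cutoff) and the number of IDs per Type in that snapshot. It is run from Python so it is self-contained. The table name audit, the column names and the cutoff values are assumptions, and the dates are rewritten as 'YYYY-MM-DD HH:MM' strings so they sort correctly as text.

```python
import sqlite3

# The sample rows from the question, with normalised date strings.
AUDIT_ROWS = [
    (1, "ae08k", "M", "2017-01-02 12:03"),
    (2, "liei0", "A", "2017-01-02 12:04"),
    (3, "ae08k", "C", "2017-01-02 13:05"),
    (4, "we808", "A", "2017-01-03 20:05"),
]

# Latest row per id strictly before the cutoff (window functions: SQLite 3.25+).
SNAPSHOT_SQL = """
    SELECT id, type, changed_at
    FROM (SELECT id, type, changed_at,
                 ROW_NUMBER() OVER (PARTITION BY id
                                    ORDER BY changed_at DESC) AS rn
          FROM audit
          WHERE changed_at < ?) AS t
    WHERE rn = 1
"""

def _connect(rows):
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE audit "
                "(record_no INTEGER, id TEXT, type TEXT, changed_at TEXT)")
    con.executemany("INSERT INTO audit VALUES (?, ?, ?, ?)", rows)
    return con

def snapshot(rows, cutoff):
    """Status of every ID as of the cutoff: (id, type, changed_at)."""
    con = _connect(rows)
    result = con.execute(SNAPSHOT_SQL + " ORDER BY id", (cutoff,)).fetchall()
    con.close()
    return result

def ids_per_type(rows, cutoff):
    """How many IDs have each Type as of the cutoff."""
    con = _connect(rows)
    query = ("SELECT type, COUNT(*) FROM (" + SNAPSHOT_SQL + ") AS s "
             "GROUP BY type ORDER BY type")
    result = con.execute(query, (cutoff,)).fetchall()
    con.close()
    return result

print(snapshot(AUDIT_ROWS, "2017-01-03 00:00"))
print(ids_per_type(AUDIT_ROWS, "2017-01-03 00:00"))
```

Counting on top of the snapshot, rather than on the raw audit rows, means an ID such as ae08k is counted only under its latest Type (C), not once for every Type it has had.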

One to many join to last modified record in the most efficient way

I realise that variations of this question have been asked before but I'd like to know the most efficient solution to my particular issue.
I have two tables...
Event (event_id, customer_email...)
Customer (customer_email, last_modified...)
I'm joining these two tables and only want the customer row with the greatest last_modified date. The Customer table is absolutely huge, so I was wondering what the best way to go about this is.
Setting aside indices, this is the query you could use:
SELECT <columns you want>
FROM Event AS E
JOIN Customer AS C
ON C.Customer_Email = E.Customer_Email
JOIN ( SELECT C1.Customer_Email, MAX(C1.Last_Modified) AS LastModified
FROM Customer AS C1
GROUP BY C1.Customer_Email
) AS C2
ON C2.Customer_Email = C.Customer_Email
AND C2.LastModified = C.Last_Modified
Use row_number
select *
from
(
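-- list the columns explicitly: Customer_Email exists in both tables, and a derived table cannot expose the same column name twice
select Event.*, Customer.Last_Modified, Row_number() over (partition by Event_ID order by Last_Modified desc) rn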
from Event
inner join Customer
on Event.Customer_Email = Customer.Customer_Email
) v
where rn = 1
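A variation on the row_number idea is to reduce Customer to its latest row per email first and only then join to Event, so the window function runs once per customer rather than once per joined row. Whether this is actually faster depends on the indexes and on the plan SQL Server picks, so treat it as something to measure rather than a guaranteed win. The sketch below illustrates it on an in-memory SQLite database from Python; the note column and the sample rows are made up, since the question elides the remaining columns.

```python
import sqlite3

def events_with_latest_customer(events, customers):
    """Join each event to the most recently modified row of its customer."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE event (event_id INTEGER, customer_email TEXT)")
    # 'note' is a hypothetical payload column used to tell versions apart.
    con.execute("CREATE TABLE customer "
                "(customer_email TEXT, last_modified TEXT, note TEXT)")
    con.executemany("INSERT INTO event VALUES (?, ?)", events)
    con.executemany("INSERT INTO customer VALUES (?, ?, ?)", customers)
    query = """
        WITH latest_customer AS (
            SELECT customer_email, last_modified, note,
                   ROW_NUMBER() OVER (PARTITION BY customer_email
                                      ORDER BY last_modified DESC) AS rn
            FROM customer
        )
        SELECT e.event_id, e.customer_email, c.last_modified, c.note
        FROM event AS e
        INNER JOIN latest_customer AS c
                ON c.customer_email = e.customer_email
               AND c.rn = 1
        ORDER BY e.event_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Made-up sample: a@x.com has two versions, b@x.com has one.
sample_events = [(1, "a@x.com"), (2, "b@x.com"), (3, "a@x.com")]
sample_customers = [
    ("a@x.com", "2020-01-01", "old"),
    ("a@x.com", "2020-02-01", "new"),
    ("b@x.com", "2020-01-15", "only"),
]
for row in events_with_latest_customer(sample_events, sample_customers):
    print(row)
```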
