I have 2 tables like the ones below:
table 1 = items => id, name
table 2 = item_price => item_id, date, price
I need to
select items.name from items
INNER JOIN item_price ON items.id = item_price.item_id
where item_price.date is the last date
items
id  name
1   item name 1
2   item name 2
3   item name 3
item_price
item_id  date        price
1        2022-01-13  5
2        2022-01-10  7
1        2022-01-10  4
1        2022-01-09  9
I need to build a query to get this result:
item name    price
item name 1  5
item name 2  7
thanks a lot
This can be solved by selecting the last row for each item using the ROW_NUMBER window function.
WITH last_item_price (item_id, price, reverse_order)
AS
(SELECT item_id
      , price
      , ROW_NUMBER() OVER
          (PARTITION BY item_id
           ORDER BY date DESC) AS reverse_order
 FROM item_price)
SELECT i.name
     , p.price
FROM items i
JOIN last_item_price p
  ON i.id = p.item_id
 AND p.reverse_order = 1;
Unless the primary key of item_price is defined as the composite key for item_id and date, there is a risk you will have more than one possible price. For instance, if the price changes three times on a single day, then how do you know which is the correct value? I would recommend using a DATETIME or DATETIME2 field to help pick the correct price in that scenario. Alternatively, you can define the primary key as the composite of the two fields and only allow a single price each day.
Another way to avoid the duplication issue is to add an auto-incrementing identity column. This will not be useful for joins and I still strongly recommend using a composite key for the item_id and date fields, but it is a valid alternative. In that case, you can modify the partition function's ORDER BY clause to:
ORDER BY date DESC, item_price_id DESC
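To illustrate the tie-breaker, here is a minimal sketch that runs the same pattern from Python against an in-memory SQLite database. SQLite only stands in for SQL Server here, and the item_price_id column and the extra same-day price row for item 2 are assumptions added for the demonstration, not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_price (
        item_price_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- hypothetical identity column
        item_id INTEGER,
        date TEXT,
        price INTEGER
    );
    INSERT INTO items VALUES
        (1, 'item name 1'), (2, 'item name 2'), (3, 'item name 3');
    INSERT INTO item_price (item_id, date, price) VALUES
        (1, '2022-01-13', 5),
        (2, '2022-01-10', 7),
        (1, '2022-01-10', 4),
        (1, '2022-01-09', 9),
        (2, '2022-01-10', 8);  -- assumed second price change on the same day
""")

# Latest price per item; ties on date are broken by the highest item_price_id.
latest_prices = conn.execute("""
    WITH last_item_price AS (
        SELECT item_id,
               price,
               ROW_NUMBER() OVER (
                   PARTITION BY item_id
                   ORDER BY date DESC, item_price_id DESC
               ) AS reverse_order
        FROM item_price
    )
    SELECT i.name, p.price
    FROM items i
    JOIN last_item_price p
      ON i.id = p.item_id
     AND p.reverse_order = 1
    ORDER BY i.id
""").fetchall()

print(latest_prices)
```

With the tie-breaker in place, item 2 resolves to the row inserted last on 2022-01-10. Item 3 has no prices, so the inner join leaves it out.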
In general, I would not recommend naming a field date. Since SQL Server has a data type called DATE, this can cause issues with linters. In some cases, you may be required to use brackets so the field is [date] to disambiguate your intent. A better name is probably price_date or price_change_date.
Imagine the table below, where:
ID is the primary key and is an auto-incrementing column
ItemType is a foreign key
OrderID is the order number within each ItemType value
ID  ItemType  OrderID  Col1
==  ========  =======  ====
1   1         1        ABCD
2   1         2        XYZT
3   2         1        BDKL
4   1         3        XXXX
5   1         4        TYTY
6   2         2        ABCD
7   1         5        XYZZ
8   3         1        ABCD
9   3         2        ABCD
10  1         6        XYZT
11  2         3        ABCD
As you can see, there may be more than one ItemType (the values come from another table), and each ItemType has a sequential OrderID that starts from 1 and increases by 1 for every record.
My question is:
What is the best practice for maintaining a column that keeps the OrderID information correct?
Assuming that the ID values are always increasing, such that a subsequent order's ID value is always greater than an earlier order's ID value, we could just use ROW_NUMBER here and not store the value in a column at all:
SELECT
    ID,
    ItemType,
    ROW_NUMBER() OVER (PARTITION BY ItemType ORDER BY ID) AS OrderID,
    Col1
FROM yourTable
ORDER BY
    ID;
If my assumption about the ID column does not always hold, then I suggest adding a new timestamp column that records when each order actually happened. Then use an approach similar to the one above, but order by that timestamp.
You do not need to do this. It would be difficult to implement, and you could face performance issues if batches of orders are created at the same time. As there is no built-in per-group identity (nothing like IDENTITY OVER (PARTITION BY ...)), you would need to get the maximum value for each inserted type yourself, and that has to happen inside a transaction, which blocks other inserts.
So just have a normal identity column to guarantee the uniqueness of each order, and use ROW_NUMBER in the presentation layer to get an OrderID that increments per type.
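As a rough check of that idea, the sketch below loads the question's sample rows (without a stored OrderID column) into an in-memory SQLite database from Python and derives OrderID at query time. SQLite is only a stand-in for SQL Server, and yourTable is the placeholder name used in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER PRIMARY KEY, ItemType INTEGER, Col1 TEXT)"
)
# Rows are inserted in the question's order, so ID is assigned 1..11.
conn.executemany(
    "INSERT INTO yourTable (ItemType, Col1) VALUES (?, ?)",
    [
        (1, "ABCD"), (1, "XYZT"), (2, "BDKL"), (1, "XXXX"),
        (1, "TYTY"), (2, "ABCD"), (1, "XYZZ"), (3, "ABCD"),
        (3, "ABCD"), (1, "XYZT"), (2, "ABCD"),
    ],
)

rows = conn.execute("""
    SELECT ID,
           ItemType,
           ROW_NUMBER() OVER (PARTITION BY ItemType ORDER BY ID) AS OrderID,
           Col1
    FROM yourTable
    ORDER BY ID
""").fetchall()

# Collect just the computed OrderID values, in ID order.
order_ids = [row[2] for row in rows]
print(order_ids)
```

The computed values match the OrderID column shown in the question, so nothing has to be maintained on insert.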
Please suggest an SQL query to find duplicate customers across different stores, e.g. customer table has id, name, phone, storeid in it, I need to write queries for the following:
Duplicate customers within a store
Duplicate customers across different stores
Table data:
id  name  phone  storeid
-----------------------------------
1   abc   123    4
2   abc   123    4
3   abc   123    5
The first query should show only first 2 records, and the second query should show all 3 records.
You can do something like the following:
SELECT Name, Phone, COUNT(Id) AS NumberOfTimes, StoreID
FROM Customers
GROUP BY Name, Phone, StoreID
HAVING COUNT(Id) > 1
ORDER BY StoreID
Hope this helps.
Solution
You can try this for the first query:
SELECT *
FROM customer c
WHERE 1 < (
    SELECT COUNT(*)
    FROM customer d
    WHERE d.name = c.name
) AND
1 < (
    SELECT COUNT(*)
    FROM customer d
    WHERE d.name = c.name
      AND d.storeid = c.storeid
);
Now, for the second query, use the above one, but remove everything after and including the AND.
Explanation
Let's look at the query step-by-step:
SELECT *
FROM customer c
This states that you want all the columns from the customer table. The alias c lets the subqueries refer back to the current row.
WHERE 1 < (
    SELECT COUNT(*)
    FROM customer d
    WHERE d.name = c.name
)
This is a fairly long condition, so let's look at it from the inside outward.
    WHERE d.name = c.name
This ties the subquery to the outer query: it keeps only the rows of customer (aliased as d) whose name matches the name of the current outer row.
    SELECT COUNT(*)
    FROM customer d
This counts how many rows in the customer table satisfy that WHERE clause, i.e. how many times the current name appears.
WHERE 1 < (
    ....
)
Here, we compare the result of the subquery (the number of rows with the same name) and check whether it is greater than 1 (i.e., there is a duplicate).
AND
.....
The AND keyword indicates that this second condition must be true in addition to the previous one. The second subquery is the same count, restricted to rows that also share the current row's storeid.
The full query returns all entries where the name is duplicated within the same store; if you remove everything including and after the AND, the result is all entries that share a name, regardless of store.
Notes
The other two answers are suggesting grouping duplicated data, but in your particular case, I think you do want the duplicated entries as per your expected results (albeit you should add more expected output info than that).
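If the duplicated rows themselves are wanted, one alternative is a windowed COUNT(*), which is not what any of the answers here use. The following is a minimal sketch under the assumption that a "duplicate" means the same name and phone; it runs from Python against an in-memory SQLite database as a stand-in for the real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY, name TEXT, phone TEXT, storeid INTEGER
    );
    INSERT INTO customer VALUES
        (1, 'abc', '123', 4),
        (2, 'abc', '123', 4),
        (3, 'abc', '123', 5);
""")

def duplicate_ids(partition_columns):
    """Return ids of rows that share the given columns with another row."""
    sql = f"""
        SELECT id
        FROM (
            SELECT id,
                   COUNT(*) OVER (PARTITION BY {partition_columns}) AS copies
            FROM customer
        ) AS counted
        WHERE copies > 1
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]

# Duplicates within a store: same name, phone and storeid.
within_store = duplicate_ids("name, phone, storeid")
# Duplicates regardless of store: same name and phone.
across_stores = duplicate_ids("name, phone")

print(within_store, across_stores)
```

On the sample data the first call returns rows 1 and 2 and the second returns all three rows, which is the output the question asks for.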
SELECT storeName, customerName FROM customer
WHERE storeid IN (
    SELECT c.storeid
    FROM customer c
    RIGHT JOIN store s ON (c.storeid = s.id)
    GROUP BY c.storeid
    HAVING COUNT(*) > 1
)
Basically, we are grouping by storeid, which allows us to count how many times each one occurs in the customer table. We take the storeids that occur more than once, and we select the storeName and customerName from the customer rows whose storeid matches one returned by the inner query.
Apologies for goofy title. I am not sure how to describe the problem.
I have a table in SQL Server with this structure;
ID varchar(15)
ProdDate datetime
Value double
For each ID there can be hundreds of rows, each with its own ProdDate. ID and ProdDate form the unique key for the table.
What I need to do is find the maximum Value for each ID based upon the first 12 samples, ordered by ProdDate ascending.
Said another way: for each ID I need to find the 12 earliest dates for that ID (the sampling for each ID will start at different dates) and then find the maximum Value for those 12 samples.
Any idea of how to do this without multiple queries and temporary tables?
You can use a common table expression and ROW_NUMBER to logically define the TOP 12 per Id then MAX ... GROUP BY on that.
;WITH T
     AS (SELECT *,
                ROW_NUMBER() OVER (PARTITION BY Id ORDER BY ProdDate) AS RN
         FROM   YourTable)
SELECT Id,
       MAX(Value) AS Value
FROM   T
WHERE  RN <= 12
GROUP  BY Id
I have three tables:
Table 1: Items
ItemID | DaysLastSold
Table 2: Listings
ItemID | ListingID
Table 3: Sales
ListingID | DateItemClosed
I got this query to work:
SELECT min(DATEDIFF(day, DateItemClosed, getdate())) as DaysLastSold
from Sales
where QtySold > 0
  and ListingID in (SELECT ListingID from Listings where ItemID = 8101)
What I'm trying to do is basically place this query into the DaysLastSold column in the Items table, so that whenever the column is selected it recalculates DaysLastSold using the ItemID in the neighboring column.
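If you want to persist that information you could create an indexed view keyed on ItemID. Bear in mind that an indexed view cannot use MIN or nondeterministic functions such as GETDATE(), so it can hold the underlying sale data but the day difference itself still has to be calculated at query time. Obviously this would not be a column in your original table though. You could then join to this view when you need the information.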
Personally I would probably just do it inline when you need it. If you are concerned about performance, post the execution plan here and we may be able to make some suggestions.
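As a sketch of the inline approach, the query below computes DaysLastSold for every item at once with a join and GROUP BY instead of storing it. It runs from Python against an in-memory SQLite database, so julianday() replaces DATEDIFF and a fixed reference date replaces getdate(). The sample rows and the second item 8102 are made up for the illustration, and QtySold is taken from the asker's query even though it is not in the listed columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (ItemID INTEGER PRIMARY KEY);
    CREATE TABLE Listings (ItemID INTEGER, ListingID INTEGER PRIMARY KEY);
    CREATE TABLE Sales (ListingID INTEGER, DateItemClosed TEXT, QtySold INTEGER);
    INSERT INTO Items VALUES (8101), (8102);
    INSERT INTO Listings VALUES (8101, 1), (8101, 2), (8102, 3);
    INSERT INTO Sales VALUES
        (1, '2022-01-01', 1),
        (2, '2022-01-10', 2),
        (2, '2022-01-14', 0),  -- closed without a sale, so it is ignored
        (3, '2022-01-05', 1);
""")

# Fixed "today" so the result is reproducible; a real query would use the
# current date instead.
today = "2022-01-15"

days_last_sold = conn.execute("""
    SELECT i.ItemID,
           MIN(CAST(julianday(?) - julianday(s.DateItemClosed) AS INTEGER))
               AS DaysLastSold
    FROM Items i
    JOIN Listings l ON l.ItemID = i.ItemID
    JOIN Sales s ON s.ListingID = l.ListingID
    WHERE s.QtySold > 0
    GROUP BY i.ItemID
    ORDER BY i.ItemID
""", (today,)).fetchall()

print(days_last_sold)
```

Because the value depends on the current date, computing it in the query (or in an ordinary view wrapping it) keeps it correct without any refresh step.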
If I have a table of data like this:
tableid  author  book                pubdate
1        1       The Hobbit          1923
2        1       Fellowship          1925
3        2       Foundation Trilogy  1947
4        2       I Robot             1942
5        3       Frankenstein        1889
6        3       Frankenstein 2      1894
Is there a query that would get me the following without having to use a temp table, table variable or cte?
tableid  author  book          pubdate
1        1       The Hobbit    1923
4        2       I Robot       1942
5        3       Frankenstein  1889
So I want min(ranking) grouping by person and ending up with book for that min(ranking) value.
OK, the data I gave initially was flawed. Instead of a ranking column I'll have a date column. I need the book published earliest by author.
Missed that a CTE was not valid (but not sure why). How about as a subquery?
SELECT tableid, author, book, pubdate
FROM
(
    SELECT
        tableid, author, book, pubdate,
        rn = ROW_NUMBER() OVER
        (
            PARTITION BY author
            ORDER BY pubdate
        )
    FROM dbo.src -- replace this with the real table name
) AS x
WHERE rn = 1
ORDER BY tableid;
Original:
;WITH x AS
(
    SELECT
        tableid, author, book, pubdate,
        rn = ROW_NUMBER() OVER
        (
            PARTITION BY author
            ORDER BY pubdate
        )
    FROM dbo.src -- replace this with the real table name
)
SELECT tableid, author, book, pubdate
FROM x
WHERE rn = 1
ORDER BY tableid;
If you want to return multiple rows when there is a tie for earliest book, use RANK() in place of ROW_NUMBER(). If there is a tie and you only want to return one row, you need to add tie-breaker columns to the ORDER BY within OVER().
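The difference is easy to see with a tie in the data. The sketch below runs from Python against an in-memory SQLite database standing in for SQL Server. Row 7 is a made-up book that ties with Frankenstein in 1889, and tableid is used as the tie-breaker purely as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (
        tableid INTEGER PRIMARY KEY, author INTEGER, book TEXT, pubdate INTEGER
    );
    INSERT INTO src VALUES
        (1, 1, 'The Hobbit', 1923),
        (2, 1, 'Fellowship', 1925),
        (3, 2, 'Foundation Trilogy', 1947),
        (4, 2, 'I Robot', 1942),
        (5, 3, 'Frankenstein', 1889),
        (6, 3, 'Frankenstein 2', 1894),
        (7, 3, 'Hypothetical Tie', 1889);  -- assumed row that creates a tie
""")

def earliest_ids(window_function, order_by):
    """Return tableids ranked first per author by the given window function."""
    sql = f"""
        SELECT tableid
        FROM (
            SELECT tableid,
                   {window_function}() OVER (
                       PARTITION BY author
                       ORDER BY {order_by}
                   ) AS rn
            FROM src
        ) AS x
        WHERE rn = 1
        ORDER BY tableid
    """
    return [row[0] for row in conn.execute(sql)]

# RANK keeps every row tied for the earliest year.
with_rank = earliest_ids("RANK", "pubdate")
# ROW_NUMBER with a tie-breaker column keeps exactly one row per author.
with_row_number = earliest_ids("ROW_NUMBER", "pubdate, tableid")

print(with_rank, with_row_number)
```

RANK returns both 1889 books for author 3, while ROW_NUMBER with the extra ORDER BY column deterministically keeps only the one with the lower tableid.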
select * from table where ranking = 1
EDIT
Are you looking for this query to work in situations where there is no value of ranking = 1 for a given table and person? In that case, try this:
select * from (
    select *, RANK() OVER (Partition By tableid, personid order by ranking asc) as sqlrank
    from table
) as ranked
where sqlrank = 1
EDIT OF MY EDIT:
This will work for the earliest pub date:
select * from (
    select *, RANK() OVER (Partition By author order by pubdate asc) as sqlrank
    from table
) as ranked
where sqlrank = 1
SELECT tableid,author,book,pubdate FROM my_table as my_table1 WHERE pubdate =
(SELECT MIN(pubdate) FROM my_table as my_table2 WHERE my_table1.author = my_table2.author);
WITH min_table as
(
    SELECT author, min(pubdate) as min_pubdate
    FROM table
    GROUP BY author
)
SELECT t.tableid, t.author, t.book, t.pubdate
FROM table t INNER JOIN min_table mt on t.author = mt.author and t.pubdate = mt.min_pubdate
Your sample data may be overly simplistic. You talk about 'min(ranking)', but in all your examples the minimum ranking for each personid is 1, so the answers you have received so far short-circuit the issue and simply select for ranking = 1. You don't state it in your "requirements", but it sounds like the minimum rank value for any particular personid may not necessarily be 1, correct? Also, you don't mention whether a person can rank two or more books with the same minimum rank, so answers will be incomplete due to this missing requirement.
If my psychic abilities are accurate, then you might want to try something like this (untested obviously):
SELECT UNKTBL.tableid, UNKTBL.personid, UNKTBL.book, UNKTBL.ranking
FROM UnknownTable UNKTBL INNER JOIN
    (SELECT personid, min(ranking) as ranking
     FROM UnknownTable GROUP BY personid) MINRANK
ON UNKTBL.personid = MINRANK.personid AND UNKTBL.ranking = MINRANK.ranking
This will return all the rows for each person where the ranking value is the minimum value for that person. So if the minimum ranking for person 6 is 2, and there are two books for that person with that ranking, then both book rows will be returned.
If these are not, in fact, your requirements, then please edit your question with more details/example data. Thanks!
Edit
Based on your change in requirements/example data, the SQL above should still work, if you change the column names appropriately. You still don't mention if an author can have two books in the same year (i.e. a prolific author such as Stephen King), so the SQL I have here will give multiple rows if the same author publishes two books in the same year, and that year is the earliest year of publication for that author.
SELECT * FROM my_table WHERE ranking = 1
ZING!
Seriously though I don't follow your question - can you provide a more elaborate or complicated example? I think I'm missing something obvious.