SQL Server problems when using ORDER BY and DISTINCT together - sql-server

I have two problems. The first is with the two COUNTs at the start of the query. GroupID is a string (Name_Year) that keeps variants of a product together: same product, different sizes.
If I have three reviews in tblReview that all have the same GroupID, I want to get 3 back. But if three Products with different ProductIDs share that GroupID and I add three reviews for it, I get 9 (3*3). With only one Product for that GroupID and three reviews it works (1*3 = 3).
The second problem is that if I use ORDER BY CASE on Price, I have to add Price to the GROUP BY as well, and then I lose the DISTINCT effect I want, which is to show only products with a unique GroupID.
Here's the query; I hope somebody can help me with this.
ALTER PROCEDURE GetFilterdProducts
@CategoryID INT, @ColumnName varchar(100)
AS
SELECT COUNT(tblReview.GroupID) AS ReviewCount,
COUNT(tblComment.GroupID) AS CommentCount,
Product.ProductID,
Product.Name,
Product.Year,
Product.Price,
Product.BrandID,
Product.GroupID,
AVG(tblReview.Grade) AS Grade
FROM Product LEFT JOIN
tblComment ON Product.GroupID = tblComment.GroupID LEFT JOIN
tblReview ON Product.GroupID = tblReview.GroupID
WHERE (Product.CategoryID = @CategoryID)
GROUP BY Product.ProductID, Product.BrandID, Product.GroupID, Product.Name, Product.Year, Product.Price
HAVING COUNT(distinct Product.GroupID) = 1
ORDER BY
CASE
WHEN @ColumnName='Name' THEN Name
WHEN @ColumnName='Year' THEN Year
WHEN @ColumnName='Price' THEN Price
END
My tables:
Product:
ProductID, Name, Year, Price, BrandID, GroupID
tblReview:
ReviewID, Description, Grade, ProductID, GroupID
tblComment:
CommentID, Description, ProductID, GroupID
I think my problem is this: if three rows in Product share the same GroupID, e.g. Nike_2010, and tblReview has three reviews for it, the query counts the reviews for Nike_2010 once for the first Product row, then again for the second row, and again for the third, which adds up to 9. How do I avoid that?

For starters, because you're joining on multiple tables, you're going to end up with the cross product of the matching rows from each of them. Your counts will then return the number of joined rows containing data in that column. Consider the following example:
- PRODUCTS - -- COMMENTS -- --- REVIEWS ---
Key | Name Key | Comment Key | Review
1 | A 1 | Foo 1 | Great
2 | B 1 | Bar 1 | Wonderful
The query
SELECT PRODUCTS.Key, PRODUCTS.Name, COMMENTS.Comment, REVIEWS.Review
FROM PRODUCTS
LEFT OUTER JOIN COMMENTS ON PRODUCTS.KEY = COMMENTS.KEY
LEFT OUTER JOIN REVIEWS ON PRODUCTS.KEY = REVIEWS.KEY
will result in the following data:
Key | Name | Comment | Review
1 | A | Foo | Great
1 | A | Foo | Wonderful
1 | A | Bar | Great
1 | A | Bar | Wonderful
2 | B | NULL | NULL
Thus, counting in this format
SELECT PRODUCTS.Key, PRODUCTS.Name, COUNT(COMMENTS.Comment), COUNT(REVIEWS.Review)
FROM PRODUCTS
LEFT OUTER JOIN COMMENTS ON PRODUCTS.KEY = COMMENTS.KEY
LEFT OUTER JOIN REVIEWS ON PRODUCTS.KEY = REVIEWS.KEY
GROUP BY PRODUCTS.Key, PRODUCTS.Name
will give you
Key | Name | Count1 | Count2
1 | A | 4 | 4
2 | B | 0 | 0
because it's counting each row in the table produced by the join!
Instead, you want to count each table separately in a subquery before joining it back like the following:
SELECT PRODUCTS.Key, PRODUCTS.Name, ISNULL(CommentCount.NumComments, 0),
ISNULL(ReviewCount.NumReviews, 0)
FROM PRODUCTS
LEFT OUTER JOIN (SELECT Key, COUNT(*) as NumComments
FROM COMMENTS
GROUP BY Key) CommentCount on PRODUCTS.Key = CommentCount.Key
LEFT OUTER JOIN (SELECT Key, COUNT(*) as NumReviews
FROM REVIEWS
GROUP BY Key) ReviewCount on PRODUCTS.Key = ReviewCount.Key
which will produce the following
Key | Name | NumComments | NumReviews
1 | A | 2 | 2
2 | B | 0 | 0
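Applied to the tables from the question, the same pattern might look like the sketch below. It is only an illustration: the table variables, the sample rows and the int type of Grade are assumptions made so the script runs on its own, and the counts are grouped by GroupID because that is the column the question joins on.

```sql
-- Hypothetical sample data: three products sharing one GroupID, three reviews, one comment.
DECLARE @Product    TABLE (ProductID int, Name varchar(50), GroupID varchar(50));
DECLARE @tblReview  TABLE (ReviewID int, Grade int, GroupID varchar(50));
DECLARE @tblComment TABLE (CommentID int, GroupID varchar(50));

INSERT INTO @Product    VALUES (1, 'Nike S', 'Nike_2010'), (2, 'Nike M', 'Nike_2010'), (3, 'Nike L', 'Nike_2010');
INSERT INTO @tblReview  VALUES (1, 4, 'Nike_2010'), (2, 5, 'Nike_2010'), (3, 3, 'Nike_2010');
INSERT INTO @tblComment VALUES (1, 'Nike_2010');

-- Each child table is aggregated per GroupID first, so the joins cannot multiply rows.
SELECT p.ProductID, p.Name, p.GroupID,
       ISNULL(rv.ReviewCount, 0)  AS ReviewCount,   -- 3 for every Nike_2010 product
       ISNULL(cm.CommentCount, 0) AS CommentCount,  -- 1 for every Nike_2010 product
       rv.Grade
FROM @Product p
LEFT JOIN (SELECT GroupID,
                  COUNT(*) AS ReviewCount,
                  AVG(CAST(Grade AS decimal(4, 2))) AS Grade  -- cast avoids integer averaging
           FROM @tblReview
           GROUP BY GroupID) rv ON rv.GroupID = p.GroupID
LEFT JOIN (SELECT GroupID, COUNT(*) AS CommentCount
           FROM @tblComment
           GROUP BY GroupID) cm ON cm.GroupID = p.GroupID;
```

Because no aggregate is left in the outer query, the outer GROUP BY and HAVING from the original procedure are no longer needed for the counts.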
As for the "DISTINCT effect" you refer to, I'm not exactly sure I follow. Could you elaborate a bit?

About the second problem: can't you group by the same CASE expression? You shouldn't have the Price field in the result list then, though.
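Another way to approach the second problem, sketched here under assumptions, is to pick one row per GroupID with ROW_NUMBER and to give each sortable column its own CASE expression. A single CASE has to return one data type, so mixing a varchar Name with a numeric Price in the same CASE tends to fail with a conversion error. The sample rows, the column types and the choice of "lowest ProductID wins" are hypothetical.

```sql
DECLARE @ColumnName varchar(100) = 'Price';  -- stands in for the procedure parameter

DECLARE @Product TABLE (ProductID int, Name varchar(50), [Year] int,
                        Price decimal(10, 2), GroupID varchar(50));
INSERT INTO @Product VALUES
    (1, 'Nike',   2010, 90.00, 'Nike_2010'),
    (2, 'Nike',   2010, 95.00, 'Nike_2010'),
    (3, 'Adidas', 2011, 70.00, 'Adidas_2011');

WITH OnePerGroup AS (
    -- Number the products inside each GroupID; rn = 1 is the row that gets shown.
    SELECT ProductID, Name, [Year], Price, GroupID,
           ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY ProductID) AS rn
    FROM @Product
)
SELECT ProductID, Name, [Year], Price, GroupID
FROM OnePerGroup
WHERE rn = 1
ORDER BY
    -- One CASE per column: the branches that do not match yield NULL and do not affect the order.
    CASE WHEN @ColumnName = 'Name'  THEN Name   END,
    CASE WHEN @ColumnName = 'Year'  THEN [Year] END,
    CASE WHEN @ColumnName = 'Price' THEN Price  END;
```

Since nothing is aggregated here, Price can be selected and sorted on without appearing in a GROUP BY.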

Related

Join with a max and nulls

I have 2 tables:
People:
ID | Name
----------
1 | John
2 | David
3 | Jennifer
and another which has a simple FK to the first:
Note:
ID | People_ID | Note
----------------------
1 | 1 | A note
2 | 1 | Another note
3 | 3 | Jen's note
I want to get the note associated with the max(ID) from Note for each person, or a null if no notes, so the desired result is:
People_ID | Name | Note
----------------------------
1 |John | Another Note
2 |David | NULL
3 |Jennifer| Jen's Note
I can perform a join, but can't include David because the max criteria doesn't bring back the null column. Any help please?
That's a left join - and I would recommend pre-aggregating the notes in a subquery:
select p.*, n.*
from people p
left join (
select people_id, max(id) max_note_id
from note
group by people_id
) n on n.people_id = p.id
There are situations where a lateral join would be more efficient:
select p.*, n.*
from people p
outer apply (
select top(1) id max_note_id
from note n
where n.people_id = p.id
order by id desc
) n
The nice thing about the lateral join is that you can easily bring in more columns from the top matching record in the note table if you want to (such as the text of the note, or anything else).
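As a rough illustration of that last point, the sketch below loads the sample rows from the question into table variables (an assumption made only to keep it self-contained) and uses OUTER APPLY to return the note text itself rather than just the id.

```sql
DECLARE @People TABLE (ID int, Name varchar(20));
DECLARE @Note   TABLE (ID int, People_ID int, Note varchar(50));

INSERT INTO @People VALUES (1, 'John'), (2, 'David'), (3, 'Jennifer');
INSERT INTO @Note   VALUES (1, 1, 'A note'), (2, 1, 'Another note'), (3, 3, 'Jen''s note');

SELECT p.ID AS People_ID, p.Name, n.Note
FROM @People p
OUTER APPLY (
    -- Latest note for this person; OUTER APPLY keeps people without notes (Note is NULL).
    SELECT TOP (1) nt.Note
    FROM @Note nt
    WHERE nt.People_ID = p.ID
    ORDER BY nt.ID DESC
) n
ORDER BY p.ID;
```

With this data, John gets 'Another note', David gets NULL and Jennifer gets 'Jen''s note', which matches the desired result in the question.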
You can use the query below:
SELECT A.NAME, A.ID, MAX(B.ID) FROM PEOPLE A LEFT OUTER JOIN NOTE B
ON (A.ID = B.PEOPLE_ID) GROUP BY A.NAME, A.ID;

Joining two tables and need to have MAX aggregate function in ON clause

This is my code. I want to pass a part ID and a purchase order ID to my report and have it return all the related information for them. The important thing is that if several rows share the same purchase order ID and part ID, the query should return the one with the highest transaction ID. The following code does not give what I expected. Could you please help me?
SELECT MAX(INVENTORY_TRANS.TRANSACTION_ID), INVENTORY_TRANS.PART_ID
, INVENTORY_TRANS.PURC_ORDER_ID, TRACE_INV_TRANS.QTY, TRACE_INV_TRANS.CREATE_DATE, TRACE_INV_TRANS.TRACE_ID
FROM INVENTORY_TRANS
JOIN TRACE_INV_TRANS ON INVENTORY_TRANS.TRANSACTION_ID = TRACE_INV_TRANS.TRANSACTION_ID
WHERE INVENTORY_TRANS.PART_ID = @PartID
AND INVENTORY_TRANS.PURC_ORDER_ID = @PurchaseOrderID
GROUP BY TRACE_INV_TRANS.QTY, TRACE_INV_TRANS.CREATE_DATE, TRACE_INV_TRANS.TRACE_ID, INVENTORY_TRANS.PART_ID
, INVENTORY_TRANS.PURC_ORDER_ID
The sample of trace_inventory_trans table is :
part_id trace_id transaction id qty create_date
x 1 10
x 2 11
x 3 12
the sample of inventory_trans table is :
transaction_id part_id purc_order_id
11 x p20
12 x p20
I wanted the result for the biggest transaction, which is transaction 12, but it shows me transaction 11.
I would use a sub-query to find the MAX value, then join that result to the other table.
The ORDER BY + TOP (1) returns the MAX value for transaction_id.
SELECT
inv.transaction_id
,inv.part_id
,inv.purc_order_id
,tr.qty
,tr.create_date
,tr.trace_id
FROM
(
SELECT TOP (1)
transaction_id,
part_id,
purc_order_id
FROM
INVENTORY_TRANS
WHERE
part_id = @PartID
AND
purc_order_id = @PurchaseOrderID
ORDER BY
transaction_id DESC
) AS inv
JOIN
TRACE_INV_TRANS AS tr
ON inv.transaction_id = tr.transaction_id;
Results:
+----------------+---------+---------------+------+-------------+----------+
| transaction_id | part_id | purc_order_id | qty | create_date | trace_id |
+----------------+---------+---------------+------+-------------+----------+
| 12 | x | p20 | NULL | NULL | 3 |
+----------------+---------+---------------+------+-------------+----------+
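If the report ever needs the highest transaction for several part/order pairs at once, TOP (1) no longer fits, and ROW_NUMBER is one possible alternative. The sketch below is an assumption-laden illustration: it loads the sample rows from the question into table variables and leaves out qty and create_date because the question shows no values for them.

```sql
DECLARE @INVENTORY_TRANS TABLE (transaction_id int, part_id varchar(10), purc_order_id varchar(10));
DECLARE @TRACE_INV_TRANS TABLE (part_id varchar(10), trace_id int, transaction_id int);

INSERT INTO @INVENTORY_TRANS VALUES (11, 'x', 'p20'), (12, 'x', 'p20');
INSERT INTO @TRACE_INV_TRANS VALUES ('x', 1, 10), ('x', 2, 11), ('x', 3, 12);

WITH ranked AS (
    -- rn = 1 marks the highest transaction_id within each part/order pair.
    SELECT inv.transaction_id, inv.part_id, inv.purc_order_id,
           ROW_NUMBER() OVER (PARTITION BY inv.part_id, inv.purc_order_id
                              ORDER BY inv.transaction_id DESC) AS rn
    FROM @INVENTORY_TRANS inv
)
SELECT r.transaction_id, r.part_id, r.purc_order_id, tr.trace_id
FROM ranked r
JOIN @TRACE_INV_TRANS tr ON tr.transaction_id = r.transaction_id
WHERE r.rn = 1;
```

For the sample data this returns transaction 12 with trace_id 3, the same row as the TOP (1) version.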

How To Avoid TempTable in Union All when queries contain DIFFERENT order by and inner join?

What I am trying to do is always send products with 0 quantity to the end of an already sorted temp table without losing the current sorting (as I described in the following question: How to send Zero Qty Products to the end of a PagedList<Products>?)
I have one sorted temp table that is already filled (it is sorted by what the user has selected, such as alphabetically, by price or by newest product; the sort order is captured by the identity Id):
CREATE TABLE #DisplayOrderTmp
(
[Id] int IDENTITY (1, 1) NOT NULL,
[ProductId] int NOT NULL
)
sorted #DisplayOrderTmp :
+------------+---------------+
| id | ProductId |
+------------+---------------+
| 1 | 66873 | // Qty is 0
| 2 | 70735 | // Qty is not 0
| 3 | 17121 | // Qty is not 0
| 4 | 48512 | // Qty is not 0
| 5 | 51213 | // Qty is 0
+------------+---------------+
I want to pass this data to a web page, but first I need to send the products with zero quantity to the end of the list without losing the current sorting.
My returned data should look like this (the sorting doesn't change; the 0 quantity products just go to the end of the list, in their existing order):
CREATE TABLE #DisplayOrderTmp4
(
[Id] int IDENTITY (1, 1) NOT NULL,
[ProductId] int NOT NULL
)
+------------+---------------+
| id | ProductId |
+------------+---------------+
| 1 | 70735 |
| 2 | 17121 |
| 3 | 48512 |
| 4 | 66873 |
| 5 | 51213 |
+------------+---------------+
P.S.: This is my Product table, which I have to inner join with the temp table to find the quantity of the products.
The Product table looks like this:
+------------+---------------+------------------+
| id | stockqty | DisableBuyButton |
+------------+---------------+------------------+
| 17121 | 1 | 0 |
| 48512 | 27 | 0 |
| 51213 | 0 | 1 |
| 66873 | 0 | 1 |
| 70735 | 11 | 0 |
+------------+---------------+------------------+
What I have tried so far is this (it works, but it is slow and has performance issues; I have almost 30k products):
INSERT INTO #DisplayOrderTmp2 ([ProductId])
SELECT p2.ProductId
FROM #DisplayOrderTmp p2 with (NOLOCK) -- it's already sorted table
INNER JOIN Product prd with (NOLOCK)
ON p2.ProductId=prd.Id
and prd.DisableBuyButton=0 -- to find product with qty more than 0
group by p2.ProductId order by min(p2.Id) -- to save current ordering
INSERT INTO #DisplayOrderTmp3 ([ProductId])
SELECT p2.ProductId
FROM #DisplayOrderTmp p2 with (NOLOCK) -- it's already sorted table
INNER JOIN Product prd with (NOLOCK)
ON p2.ProductId=prd.Id
and prd.DisableBuyButton=1 -- to find product with qty equal to 0
group by p2.ProductId order by min(p2.Id) -- to save current ordering
INSERT INTO #DisplayOrderTmp4 ([ProductId]) -- finally Union All this two data
SELECT p2.ProductId FROM
#DisplayOrderTmp2 p2 with (NOLOCK) -- More than 0 qty products with saved ordering
UNION ALL
SELECT p2.ProductId FROM
#DisplayOrderTmp3 p2 with (NOLOCK) -- 0 qty products with saved ordering
Is there any way to avoid creating temp tables in this query? I want to send the 0
quantity products of the first temp table to the end of the list without
creating three other temp tables and without losing the current ordering based on the identity Id.
My query has a performance problem.
I have to say again that the temp table has an identity Id column and is sorted according to the sort type the user passed to the stored procedure.
Thank you all :)
Make sure the temp table has an index or primary key with Id as the leading column. This will help avoid sort operators in the plan for the ordering:
CREATE TABLE #DisplayOrderTmp
(
[Id] int NOT NULL,
[ProductId] int NOT NULL
,PRIMARY KEY CLUSTERED(Id)
);
With that index, you should be able to get the result without additional temp tables with reasonable efficiency using a UNION ALL query, assuming ProductID is the Product table primary key:
WITH products AS (
SELECT p2.Id, p2.ProductId, prd.stockqty, 1 AS seq
FROM #DisplayOrderTmp p2
JOIN Product prd
ON p2.ProductId=prd.Id
WHERE prd.stockqty > 0
UNION ALL
SELECT p2.Id, p2.ProductId, prd.stockqty, 2 AS seq
FROM #DisplayOrderTmp p2
JOIN Product prd
ON p2.ProductId=prd.Id
WHERE prd.stockqty = 0
)
SELECT ProductId
FROM products
ORDER BY seq, Id;
You mentioned in comments that you ultimately want a paginated result. This can be done in T-SQL by adding OFFSET and FETCH to the ORDER BY clause as below. However, be aware that pagination over a large result set will become progressively slower the further into the result one queries.
WITH products AS (
SELECT p2.Id, p2.ProductId, prd.stockqty, 1 AS seq
FROM #DisplayOrderTmp p2
JOIN Product prd
ON p2.ProductId=prd.Id
WHERE prd.stockqty > 0
UNION ALL
SELECT p2.Id, p2.ProductId, prd.stockqty, 2 AS seq
FROM #DisplayOrderTmp p2
JOIN Product prd
ON p2.ProductId=prd.Id
WHERE prd.stockqty = 0
)
SELECT ProductId
FROM products
ORDER BY seq, Id
OFFSET @PageSize * (@PageNumber - 1) ROWS
FETCH NEXT @PageSize ROWS ONLY;
You could use ORDER BY without using UNION ALL:
SELECT p2.ProductId
FROM #DisplayOrderTmp p2
JOIN Product prd
ON p2.ProductId=prd.Id
ORDER BY prd.DisableBuyButton, p2.id;
Here DisableBuyButton = 0 means quantity > 0 and DisableBuyButton = 1 means quantity = 0, so the products that are in stock sort first.
Seems it only needs an extra something in the order by.
An IIF or CASE can be used to give a priority to the sorting.
SELECT tmp.ProductId
FROM #DisplayOrderTmp tmp
JOIN Product prd
ON prd.Id = tmp.ProductId
AND prd.DisableBuyButton IN (0,1)
ORDER BY IIF(prd.DisableBuyButton=0,1,2), tmp.id;
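To check the idea against the sample rows from the question, here is a small self-contained sketch. It swaps the temp table for table variables with explicit Id values and sorts on stockqty directly instead of DisableBuyButton; both choices are assumptions made for the illustration.

```sql
DECLARE @DisplayOrder TABLE (Id int PRIMARY KEY, ProductId int NOT NULL);
DECLARE @Product      TABLE (Id int PRIMARY KEY, stockqty int NOT NULL);

INSERT INTO @DisplayOrder VALUES (1, 66873), (2, 70735), (3, 17121), (4, 48512), (5, 51213);
INSERT INTO @Product      VALUES (17121, 1), (48512, 27), (51213, 0), (66873, 0), (70735, 11);

SELECT d.ProductId
FROM @DisplayOrder d
JOIN @Product p ON p.Id = d.ProductId
ORDER BY CASE WHEN p.stockqty > 0 THEN 0 ELSE 1 END,  -- in-stock products first
         d.Id;                                        -- then the user's original order
```

This returns 70735, 17121, 48512, 66873, 51213, which is the order the question asks for, without any intermediate temp table.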

SQL: doing several joins to evaluate different columns

I'm quite new to SQL but use it a lot in my work now (Microsoft SQL Server).
So the issue is this: I collect rows whose value in a certain column is atypical.
Let's say I have different burgers and each should have a standardized calories value. So I produced this with a query:
------------------------------------------
| Burger | calories | numBurgers | Rank |
------------------------------------------
| Chicken| 600 | 20 | 1 |
| Chicken| 400 | 3 | 2 |
| Beef | 700 | 35 | 1 |
| Beef | 850 | 4 | 2 |
-------------------------------------------
To get a list of all the "wrong" burgers I use a CTE and filter out the rows with Rank = 1:
USE database;
GO
WITH GapRanking AS
(
SELECT TOP 100 PERCENT Burger, calories, COUNT(calories),
ROW_NUMBER() OVER(PARTITION BY Burger ORDER BY COUNT(calories) DESC) AS Rank
)
SELECT * FROM GapRanking
WHERE Rank <> 1
...
I get all combinations of Burgers and calories that are not "standard"
Then I do an Inner Join with the original table and all columns on the one above.
SELECT * FROM BaseTable as base
INNER JOIN
(SELECT * FROM GapRanking
WHERE Rank <> 1) AS err
ON (base.Burgers = err.Burgers
AND base.calories = err.calories)
This way I get a table with complete information about the "not-standard" burgers. So far so good.
Now I want to add other rows where there is a deviation in another criterion, price for example, not just calories, and add them to the list if they are not already there.
So I thought of UNION or JOIN.
What is the best approach? UNION the above query with the same query on a different column (price instead of calories)?
Or do a JOIN with the same query on a different column (price instead of calories)?
The code gets quite "ugly" and I'm not sure if I'm taking the right approach here.
Also, because I'm using WITH, a UNION does not seem to be possible so easily.
I'd be really glad for any ideas here. Cheers
Use a sub-query and a join. The code below is just pseudo-code, not an actual query, but you can follow this pattern:
select t1.*, t2.required_colum
(SELECT TOP 100 PERCENT Burger, calories, COUNT(calories),
ROW_NUMBER() OVER(PARTITION BY Burger ORDER BY COUNT(calories) DESC) AS Rank
) as t1
join
(SELECT TOP 100 PERCENT Burger, calories, COUNT(calories),
ROW_NUMBER() OVER(PARTITION BY Burger ORDER BY COUNT(calories) DESC) AS Rank
) as t2
on t1.colname = t2.colname
where t1.Rank != 1 and t2.Rank != 1
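One way to combine both criteria without repeating rows, sketched under assumptions, is to define two CTEs in the same WITH clause and filter the base table with EXISTS. The table variable, its BurgerID and price columns and the sample rows are hypothetical; only Burger and calories come from the question.

```sql
DECLARE @BaseTable TABLE (BurgerID int, Burger varchar(20), calories int, price decimal(5, 2));
INSERT INTO @BaseTable VALUES
    (1, 'Chicken', 600, 5.00), (2, 'Chicken', 600, 5.00), (3, 'Chicken', 400, 5.00),
    (4, 'Beef',    700, 6.00), (5, 'Beef',    700, 6.50), (6, 'Beef',    700, 6.00);

WITH CalorieRank AS (
    -- rnk = 1 is the most frequent calories value per burger (ties are broken arbitrarily).
    SELECT Burger, calories,
           ROW_NUMBER() OVER (PARTITION BY Burger ORDER BY COUNT(*) DESC) AS rnk
    FROM @BaseTable
    GROUP BY Burger, calories
),
PriceRank AS (
    -- Same ranking, applied to price.
    SELECT Burger, price,
           ROW_NUMBER() OVER (PARTITION BY Burger ORDER BY COUNT(*) DESC) AS rnk
    FROM @BaseTable
    GROUP BY Burger, price
)
SELECT b.*
FROM @BaseTable b
WHERE EXISTS (SELECT 1 FROM CalorieRank c
              WHERE c.Burger = b.Burger AND c.calories = b.calories AND c.rnk <> 1)
   OR EXISTS (SELECT 1 FROM PriceRank p
              WHERE p.Burger = b.Burger AND p.price = b.price AND p.rnk <> 1);
```

With this data the query returns BurgerID 3 (unusual calories) and BurgerID 5 (unusual price). A row that deviates on both columns would still appear only once, because each base row is tested rather than unioned.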

sql server max record value grouped by id tuple from separate table

There are many questions already on SO asking how to do a general max value with group by some id. However my particular case is somewhat different.
What I have is a record with a value that links to an unknown number of profiles associated (as a "team") with that record. For simplicity, each team in the example has 2 profiles, but in the real data a team could have any size.
From these records I'm trying to create a leaderboard that shows the max record for each unique team formation, and it should show only one result even if the team scored the same max value more than once.
In this example the unique teams are (1, 2) and (2, 3).
EDIT: Unique team formation means that the leaderboard should consider all records with profiles (1, 2) to be the same unique formation of a team (as a unique id if that helps) even though the same team may have been formed multiple times for different records.
In this example team (1, 2) reaches its max value of 1 in two records; the duplicate should be ignored.
Let's say we have 3 users:
Table: profile
profileId | name
1 | John
2 | James
3 | Mark
Then let's say there are currently the following records:
Table: record
recordId | value
1 | 1
2 | 1
3 | 2
4 | 3
And finally each record is made of the following teams described by their members:
Table: member
recordId | profileId
1 | 1
1 | 2
2 | 1
2 | 2
3 | 2
3 | 3
4 | 3
4 | 2
The final output should look like:
recordId | profileId1 | profileId2 | value
4 | 2 | 3 | 3
1 (or 2) | 1 | 2 | 1
So far I've seen something like this to do it if the group id was part of the record:
SELECT *
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY profileId ORDER BY value DESC) N
FROM record
) M WHERE N = 1
And this to actually get the unique tuples:
select max(r.value) as value, p1.profileId as p1, p2.profileId as p2
from record r
inner join member p1 on p1.recordId = r.recordId
inner join member p2 on p2.recordId = r.recordId
where p1.profileId < p2.profileId
group by p1.profileId, p2.profileId
However, I don't know how to piece it together to get the max record for each tuple of profiles.
Also, the second query isn't very scalable for any unknown number of profiles and if there is a way to do it without self joining based on the number of profiles that would be a bonus!
If someone can help me build the right query for SQL Server that would be awesome.
Thanks!
After a bunch more research and trial and error I came to the following answer that solves my problem. This query will give the top scores from each team for a 2 person team leaderboard:
SELECT * FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY m1.profileId, m2.profileId ORDER BY r.value DESC) N,
m1.profileId AS m1, m2.profileId AS m2, r.value, r.recordId
FROM record r
INNER JOIN member m1 ON m1.recordId = r.recordId
INNER JOIN member m2 ON m2.recordId = r.recordId
WHERE m1.profileId < m2.profileId
) R
where N = 1
ORDER BY value DESC;
It works by running a partition to rank all the records by team and then plucks only the record ranked 1. The where m1.profileId < m2.profileId ensures that only 1 permutation of the team is used in the results.
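For teams of arbitrary size, one possible variation avoids the self joins by building a team key per record. The sketch below assumes SQL Server 2017 or later, because it relies on STRING_AGG; the teamKey name and the table variables holding the sample data are made up for the illustration.

```sql
DECLARE @record TABLE (recordId int, value int);
DECLARE @member TABLE (recordId int, profileId int);

INSERT INTO @record VALUES (1, 1), (2, 1), (3, 2), (4, 3);
INSERT INTO @member VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 2), (3, 3), (4, 3), (4, 2);

WITH team AS (
    -- One key per record: the member ids in ascending order, e.g. '2,3'.
    SELECT m.recordId,
           STRING_AGG(CAST(m.profileId AS varchar(20)), ',')
               WITHIN GROUP (ORDER BY m.profileId) AS teamKey
    FROM @member m
    GROUP BY m.recordId
),
ranked AS (
    -- Rank the records of each team by value; recordId breaks ties deterministically.
    SELECT r.recordId, t.teamKey, r.value,
           ROW_NUMBER() OVER (PARTITION BY t.teamKey
                              ORDER BY r.value DESC, r.recordId) AS n
    FROM @record r
    JOIN team t ON t.recordId = r.recordId
)
SELECT recordId, teamKey, value
FROM ranked
WHERE n = 1
ORDER BY value DESC;
```

With the sample data this yields record 4 for team '2,3' with value 3 and record 1 for team '1,2' with value 1. Sorting the ids inside the key is what makes (2, 3) and (3, 2) count as the same team.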

Resources