TSQL - Group by and sum not grouped column - sql-server

I have a table "Prices" with columns ProductId, ShopId, Date, Price.
The table contains the price history for products in different shops. Each product can appear in different shops with a different price and date.
I want to get the sum of the latest prices across all shops for each product.
| ProductId | ShopId | Date | Price |
|:---------:|:------:|:----------:|:------:|
| 1 | 1 | 2020.11.10 | 100 |
| 1 | 2 | 2020.11.10 | 120 |
| 2 | 3 | 2020.11.10 | 200 |
| 3 | 3 | 2020.10.05 | 170 |
| 4 | 4 | 2020.11.10 | 200 |
| 4 | 4 | 2019.09.05 | 250 |
The output I want to get is (ShopId and Date can be included in output):
| ProductId | PriceSum |
|:---------:|:--------:|
| 1 | 220 |
| 2 | 200 |
| 3 | 170 |
| 4 | 200 |
I have the following query:
SELECT ProductId, ShopId, MAX(Date) as MaxDate
FROM Prices
GROUP BY ShopId, ProductId
ORDER BY ProductId

I've found a solution; it is not the fastest, but it works.
select p1.ProductId, SUM(p1.Price) as PriceSum
from Prices as p1
cross apply
(
select ProductId, ShopId, MAX(Date) as MaxDate
from Prices
group by ShopId, ProductId
) as p2
where p2.MaxDate = p1.Date
and p2.ShopId = p1.ShopId
and p2.ProductId = p1.ProductId
group by p1.ProductId
order by p1.ProductId

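In your case the DENSE_RANK window function can help. Rows that tie on the ORDER BY value within a partition receive the same rank, so every shop whose price carries the product's most recent date gets rank 1. Note that a shop whose latest price is older than that date is left out; partition by ProductId, ShopId if each shop's own latest price should count.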
WITH LatestPricesCTE AS
(
SELECT *, DENSE_RANK() OVER (PARTITION BY ProductID ORDER BY Date DESC) Rank
FROM Prices
)
SELECT ProductId, SUM(Price) PriceSum
FROM LatestPricesCTE
WHERE Rank = 1
GROUP BY ProductId
For more information:
https://learn.microsoft.com/en-us/sql/t-sql/functions/dense-rank-transact-sql?view=sql-server-ver15

Use a window function to identify the latest row per product and shop, then filter out the older records:
;WITH dat AS
(
SELECT ProductId, ShopId, Date, Price
, ROW_NUMBER() OVER (PARTITION BY ProductId, ShopId ORDER BY Date DESC) AS rid
FROM Prices
)
SELECT ProductId
, SUM(Price) AS PriceSum
FROM dat
WHERE rid = 1
GROUP BY ProductId;
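To check the idea against the sample data, here is a minimal sketch that runs the per-shop ROW_NUMBER filter through Python's built-in sqlite3 module instead of SQL Server. That swap is an assumption made only so the example is self-contained (it needs SQLite 3.25 or later for window functions); the dates are rewritten in ISO format so they sort correctly as text, and the helper name latest_price_sums is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (ProductId, ShopId, Date, Price).
# Dates are rewritten as ISO strings (an assumption) so text ordering matches date ordering.
ROWS = [
    (1, 1, "2020-11-10", 100),
    (1, 2, "2020-11-10", 120),
    (2, 3, "2020-11-10", 200),
    (3, 3, "2020-10-05", 170),
    (4, 4, "2020-11-10", 200),
    (4, 4, "2019-09-05", 250),
]

# Number the rows of each (product, shop) pair from newest to oldest,
# keep only the newest one, then sum per product.
QUERY = """
WITH ranked AS (
    SELECT ProductId, ShopId, Price,
           ROW_NUMBER() OVER (PARTITION BY ProductId, ShopId
                              ORDER BY Date DESC) AS rn
    FROM Prices
)
SELECT ProductId, SUM(Price) AS PriceSum
FROM ranked
WHERE rn = 1
GROUP BY ProductId
ORDER BY ProductId
"""


def latest_price_sums(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Prices (ProductId INT, ShopId INT, Date TEXT, Price INT)"
    )
    conn.executemany("INSERT INTO Prices VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sums = latest_price_sums(ROWS)
print(sums)
```

Partitioning by both ProductId and ShopId means a shop whose latest price is older than another shop's still contributes its own most recent price to the sum.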

Related

I have a table where customer ID are being duplicated because of their reactivation date. I need to pivot the reactivation date per CustomerID

I have the following table:
I need to pivot the table and have it like the table below:
How can I have the unique customer ID in a column and all the reactivation dates pivoted like in the above picture?
To attribute a numeric sequence to the reactivation dates, use row_number() over() and then you can pivot that sequence from rows to columns:
select
customer_id
, activation_date
, [1] as reactivation_dt_1
, [2] as reactivation_dt_2
, [3] as reactivation_dt_3
, [4] as reactivation_dt_4
from (
select
customer_id, activation_date, reactivation_date
, row_number() over(partition by customer_id
order by reactivation_date ASC) as pivcol
from mytable
) as d
pivot (
max(reactivation_date)
for pivcol in ([1],[2],[3],[4])
) as p
order by
customer_id
Result:
+-------------+-----------------+-------------------+-------------------+-------------------+-------------------+
| customer_id | activation_date | reactivation_dt_1 | reactivation_dt_2 | reactivation_dt_3 | reactivation_dt_4 |
+-------------+-----------------+-------------------+-------------------+-------------------+-------------------+
| 1 | 2010-01-01 | 2012-02-01 | 2015-03-01 | 2017-07-01 | 2022-07-01 |
| 2 | 2011-12-03 | 2013-05-01 | 2014-08-10 | 2015-12-09 | |
+-------------+-----------------+-------------------+-------------------+-------------------+-------------------+
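The numbering step can be tried outside SQL Server as well. The sketch below is an assumption-laden check in Python's sqlite3: SQLite has no PIVOT operator, so the pivot is swapped for conditional aggregation (MAX over a CASE expression), which yields the same shape. The input rows are reconstructed from the result table above, and the helper name pivot_reactivations is hypothetical.

```python
import sqlite3

# (customer_id, activation_date, reactivation_date), reconstructed from the result table.
ROWS = [
    (1, "2010-01-01", "2012-02-01"),
    (1, "2010-01-01", "2015-03-01"),
    (1, "2010-01-01", "2017-07-01"),
    (1, "2010-01-01", "2022-07-01"),
    (2, "2011-12-03", "2013-05-01"),
    (2, "2011-12-03", "2014-08-10"),
    (2, "2011-12-03", "2015-12-09"),
]

# ROW_NUMBER assigns 1, 2, 3, ... to each customer's reactivation dates;
# MAX(CASE ...) then moves each numbered date into its own column.
QUERY = """
SELECT customer_id, activation_date,
       MAX(CASE WHEN pivcol = 1 THEN reactivation_date END) AS reactivation_dt_1,
       MAX(CASE WHEN pivcol = 2 THEN reactivation_date END) AS reactivation_dt_2,
       MAX(CASE WHEN pivcol = 3 THEN reactivation_date END) AS reactivation_dt_3,
       MAX(CASE WHEN pivcol = 4 THEN reactivation_date END) AS reactivation_dt_4
FROM (
    SELECT customer_id, activation_date, reactivation_date,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY reactivation_date ASC) AS pivcol
    FROM mytable
) AS d
GROUP BY customer_id, activation_date
ORDER BY customer_id
"""


def pivot_reactivations(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable "
        "(customer_id INT, activation_date TEXT, reactivation_date TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


pivoted = pivot_reactivations(ROWS)
for row in pivoted:
    print(row)
```

A customer with fewer than four reactivations gets NULL (None in Python) in the unused columns, matching the blank cell in the result table.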

Group by a value if it exists otherwise group by another value of the same column

I have a table like this
| Id | ExternalId | Type | Date | StatusCode |
-------------------------------------------------------
| 1 | 123 | 25 | 2020-01-01 | A |
| 2 | 123 | 25 | 2020-01-02 | A |
| 5 | 125 | 25 | 2020-01-01 | A |
| 6 | 125 | 25 | 2020-01-02 | B |
| 3 | 124 | 25 | 2020-01-01 | B |
| 4 | 124 | 25 | 2020-01-02 | A |
I need to take just one row for each ExternalId having the Max(Date) and having the StatusCode = B if B exists, otherwise the StatusCode = A
So, the expected result is
| Id | ExternalId | Type | Date | StatusCode |
-------------------------------------------------------
| 2 | 123 | 25 | 2020-01-02 | A | <--I take Max Date and the StatusCode of the same row
| 6 | 125 | 25 | 2020-01-02 | B | <--I take Max Date and the StatusCode of the same row
| 3 | 124 | 25 | 2020-01-02 | B | <--I take Max Date and B, even if the Status code of the Max Date is A
Here the query I have tried to write:
SELECT ExternalId, Type, EntityType, Max(Date) as Date
From MyTable
group by ExternalId, Type, EntityType
But I cannot finish it.
If I understand your requirements correctly, this could be what you want:
SELECT ExternalId, Type, MAX(Date) AS Date, MAX(StatusCode) AS StatusCode
FROM MyTable
GROUP BY ExternalId, Type
Explanation:
You want the MAX of StatusCode, because 'B' is greater than 'A'. You want the MAX of Date, no matter which StatusCode is shown. And you want both for each ExternalId, so you have to group by ExternalId.
Furthermore, you also need Type in the output, and since it is not an aggregate, the query has to group by Type as well. That is not a problem, though, because Type depends on ExternalId (at least in your example data).
As far as I understand from your SQL, you also need to group by Type and EntityType. If that is correct, you can take one MAX restricted to the 'B' rows and another MAX over all rows, and combine them with ISNULL or COALESCE like this:
Select
t.ExternalId
,t.Type
,t.EntityType
,isnull(
max(iif(t.StatusCode='B', t.Date, null))
,max(t.Date)
) as Date
From MyTable t
Group by
t.ExternalId
,t.Type
,t.EntityType
You want to filter instead of aggregate. One solution is to use row_number():
select *
from (
select
t.*,
row_number() over(partition by ExternalId order by StatusCode desc, Date desc) rn
from mytable t
) t
where rn = 1
The order by clause of row_number() puts rows with StatusCode = 'B' first, and then orders by descending date.
This works because StatusCode has only two values, and because 'B'> 'A'. If your real data has different values (or more than 2 values), then you would need something more explicit, like:
order by case when StatusCode = 'B' then 0 else 1 end, Date desc
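As a quick check of the explicit ordering, the sketch below runs the row_number() filter with the CASE expression over the question's sample rows. It uses Python's sqlite3 rather than SQL Server, which is an assumption made for the sake of a self-contained example; the helper name pick_rows is hypothetical.

```python
import sqlite3

# (Id, ExternalId, Type, Date, StatusCode) from the question.
ROWS = [
    (1, 123, 25, "2020-01-01", "A"),
    (2, 123, 25, "2020-01-02", "A"),
    (5, 125, 25, "2020-01-01", "A"),
    (6, 125, 25, "2020-01-02", "B"),
    (3, 124, 25, "2020-01-01", "B"),
    (4, 124, 25, "2020-01-02", "A"),
]

# Within each ExternalId, rows with StatusCode 'B' sort first, then newest date first.
QUERY = """
SELECT Id, ExternalId, Type, Date, StatusCode
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY ExternalId
               ORDER BY CASE WHEN StatusCode = 'B' THEN 0 ELSE 1 END,
                        Date DESC
           ) AS rn
    FROM MyTable t
) ranked
WHERE rn = 1
ORDER BY ExternalId
"""


def pick_rows(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable "
        "(Id INT, ExternalId INT, Type INT, Date TEXT, StatusCode TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


picked = pick_rows(ROWS)
for row in picked:
    print(row)
```

Because a whole row is kept, ExternalId 124 comes back as Id 3 with that row's own date (2020-01-01), rather than combining the maximum date with another row's status.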
Here is a query that can help you:
SELECT Externalid, MAX([Date]) as 'Date', MAX(StatusCode) 'StatusCode' from MyTable Group by Externalid
In your expected result you included the Id column, which cannot be added here because the values come from multiple rows.
The result will be:
|123|2020-01-02|A|
|124|2020-01-02|B|
|125|2020-01-02|B|

Rank by top customers within each separate month

I am having trouble ranking top customers by month. I created a new Rank column, but how do I break it up by month? Any help is appreciated. Code and tables below:
The logic for ranking is to select the top two customers per month from the table. The code also attempts to rename the date field and set it to the end-of-month date only.
SELECT * FROM table1;
UPDATE table1
SET DATE=EOMONTH(DATE) AS MO_END;
ALTER TABLE table1
ADD COLUMN RANK INT AFTER SALES;
UPDATE table1
SET RANK=
RANK() OVER(PARTITION BY cust ORDER BY sales DESC);
LIMIT 2
Starting with
+------+----------+-------+--+
| CUST | DATE | SALES | |
+------+----------+-------+--+
| 36 | 3-5-2018 | 50 | |
| 37 | 3-15-18 | 100 | |
| 38 | 3-25-18 | 65 | |
| 37 | 4-5-18 | 95 | |
| 39 | 4-21-18 | 500 | |
| 40 | 4-45-18 | 199 | |
+------+----------+-------+--+
desired end result
+------+---------+-------+------+--+
| CUST | MO_END | SALES | RANK | |
+------+---------+-------+------+--+
| 37 | 3-31-18 | 100 | 1 | |
| 38 | 3-25-18 | 65 | 2 | |
| 39 | 4-30-18 | 500 | 1 | |
| 40 | 4-45-18 | 199 | 2 | |
+------+---------+-------+------+--+
As a simple selection:
select *
from (
select
table1.*
, DENSE_RANK() OVER(PARTITION BY EOMONTH(DATE) ORDER BY sales DESC) as ranking
from table1
) as d
where ranking < 3
;
If storing is important: I would not use [rank] as a column name as I avoid any words that are used in SQL, maybe [sales_rank] or similar.
with cte as (
select
sales_rank -- the column being updated must be exposed by the CTE
, DENSE_RANK() OVER(PARTITION BY EOMONTH(DATE) ORDER BY sales DESC) as ranking
from table1
)
update cte
set sales_rank = ranking
where ranking < 3
;
There is really no reason to store the end of month, just use that function within the partition of the over() clause.
LIMIT 2 is not something that can be used in SQL Server by the way, and it sure can't be used "per grouping". When you use a "window function" such as rank() or dense_rank() you can use the output of those in the where clause of the next "layer". i.e. use those functions in a subquery (or cte) and then use a where clause to filter rows by the calculated values.
Also note I used dense_rank() to guarantee that no rank numbers are skipped, so that the subsequent where clause will be effective.
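The "rank in a subquery, filter in the outer layer" pattern can be sketched as follows. This is a check in Python's sqlite3, not SQL Server, so several things are assumptions: SQLite has no EOMONTH, so strftime('%Y-%m', ...) stands in as the month key; the date column is renamed sale_date; the dates are rewritten in ISO format; and the last row's day is invented because the source shows the invalid date "4-45-18".

```python
import sqlite3

# (cust, sale_date, sales); ISO dates are an assumption about the intended values.
ROWS = [
    (36, "2018-03-05", 50),
    (37, "2018-03-15", 100),
    (38, "2018-03-25", 65),
    (37, "2018-04-05", 95),
    (39, "2018-04-21", 500),
    (40, "2018-04-25", 199),  # source shows "4-45-18"; the day here is assumed
]

# Rank customers within each month by sales, then keep ranks 1 and 2
# in the outer query, since a window function cannot be filtered in its own WHERE.
QUERY = """
SELECT cust, month_key, sales, ranking
FROM (
    SELECT cust,
           strftime('%Y-%m', sale_date) AS month_key,
           sales,
           DENSE_RANK() OVER (PARTITION BY strftime('%Y-%m', sale_date)
                              ORDER BY sales DESC) AS ranking
    FROM table1
) AS d
WHERE ranking < 3
ORDER BY month_key, ranking
"""


def top_two_per_month(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (cust INT, sale_date TEXT, sales INT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


top_two = top_two_per_month(ROWS)
for row in top_two:
    print(row)
```

The partition contains only the month key, not the customer, so customers compete against each other within a month.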

Second level Top N SQL server

I have a table with Names, Countries and Status. I want to get totals grouped by Name and Status, but only for the top 3 countries.
My table:
+------+---------+--------+
| Name | Country | Status |
+------+---------+--------+
| ABC | US | Open |
| ABC | US | Closed |
| ABC | US | Open |
| ABC | Japan | Open |
| ABC | Japan | Closed |
| ABC | China | Open |
| ABC | China | Closed |
| ABC | Italy | Open |
| DEF | US | Open |
| DEF | US | Closed |
| DEF | Japan | Open |
| DEF | Japan | Closed |
| DEF | China | Open |
| DEF | China | Closed |
| DEF | China | Closed |
| DEF | Italy | Open |
+------+---------+--------+
Desired output:
+------+---------+--------+-------+
| Name | Country | Status | Total |
+------+---------+--------+-------+
| ABC | US | Open | 2 |
| ABC | US | Closed | 1 |
| ABC | Japan | Open | 1 |
| ABC | Japan | Closed | 1 |
| ABC | China | Open | 1 |
| ABC | China | Closed | 1 |
| DEF | US | Open | 1 |
| DEF | US | Closed | 1 |
| DEF | Japan | Open | 1 |
| DEF | Japan | Closed | 1 |
| DEF | China | Open | 1 |
| DEF | China | Closed | 2 |
+------+---------+--------+-------+
I tried the following query but it didn't give me result I am looking for.
Select rs.Name, rs.Country, rs.Status, Count(*) as total from (
SELECT Name, Country, Status, Rank()
over (Partition BY Name
ORDER BY Country DESC ) AS Rank
FROM table1 ) rs WHERE Rank <= 3
You can use the following query:
;With CTE AS (
SELECT Name, Country, Status,
COUNT(*) OVER (PARTITION BY Name, Country) AS cnt
FROM mytable
), CTE2 AS (
SELECT Name, Country, Status,
DENSE_RANK() OVER (PARTITION BY Name ORDER BY cnt DESC, Country) AS seq
FROM CTE
)
SELECT Name, Country, Status, COUNT(*) AS Total
FROM CTE2
WHERE seq <= 3
GROUP BY Name, Country, Status
ORDER BY Name, Country
In case of ties, the query picks the country whose name sorts first alphabetically.
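To see the two-step ranking at work on the question's data, the sketch below runs the same CTE structure through Python's sqlite3. Running it outside SQL Server is an assumption made for a self-contained example; the helper name top_countries is hypothetical, and Status is added to the final ORDER BY only to make the output order deterministic.

```python
import sqlite3

# (Name, Country, Status) rows from the question.
ROWS = [
    ("ABC", "US", "Open"), ("ABC", "US", "Closed"), ("ABC", "US", "Open"),
    ("ABC", "Japan", "Open"), ("ABC", "Japan", "Closed"),
    ("ABC", "China", "Open"), ("ABC", "China", "Closed"),
    ("ABC", "Italy", "Open"),
    ("DEF", "US", "Open"), ("DEF", "US", "Closed"),
    ("DEF", "Japan", "Open"), ("DEF", "Japan", "Closed"),
    ("DEF", "China", "Open"), ("DEF", "China", "Closed"), ("DEF", "China", "Closed"),
    ("DEF", "Italy", "Open"),
]

# Step 1: count rows per (Name, Country).
# Step 2: rank countries within each Name by that count (ties broken by Country).
# Step 3: keep the top 3 countries and count per Status.
QUERY = """
WITH CTE AS (
    SELECT Name, Country, Status,
           COUNT(*) OVER (PARTITION BY Name, Country) AS cnt
    FROM mytable
), CTE2 AS (
    SELECT Name, Country, Status,
           DENSE_RANK() OVER (PARTITION BY Name ORDER BY cnt DESC, Country) AS seq
    FROM CTE
)
SELECT Name, Country, Status, COUNT(*) AS Total
FROM CTE2
WHERE seq <= 3
GROUP BY Name, Country, Status
ORDER BY Name, Country, Status
"""


def top_countries(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (Name TEXT, Country TEXT, Status TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


totals = top_countries(ROWS)
for row in totals:
    print(row)
```

All rows of one country share the same (cnt, Country) pair, so DENSE_RANK gives them the same sequence number and the filter keeps or drops a country as a whole.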
Your original query was definitely in the right direction (I even used it to figure out what output you wanted). However, your desired output is the result of several aggregations, not just a single analytic function. In the query below I first aggregate to get totals, then use rank to retain the first 3 groups. In case of ties this query picks the country which comes alphabetically first.
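SELECT Name, Country, Status, Total
FROM
(
    -- rank countries within each Name by their overall row count
    SELECT t.Name, t.Country, t.Status, t.Total,
           DENSE_RANK() OVER (PARTITION BY t.Name
                              ORDER BY t.CountryTotal DESC, t.Country) AS rn
    FROM
    (
        SELECT Name, Country, Status, COUNT(*) AS Total,
               SUM(COUNT(*)) OVER (PARTITION BY Name, Country) AS CountryTotal
        FROM table1
        GROUP BY Name, Country, Status
    ) t
) ranked
-- a window function cannot be filtered at the level where it is computed
WHERE rn <= 3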
Try this one:
SELECT TOP 3 rs.Name, rs.Country, rs.Status, rs.total
FROM (
SELECT Name, Country, Status, COUNT(Status) AS total
FROM mytable
GROUP BY Name, Country, Status
) rs
ORDER BY rs.total DESC
How about:
select Top 3 with ties * FROM(
select Name, country, Status
, count(*) as total
, count(*) over (Partition BY Name, Country) as rank
from mytable
group by Name, Country, Status
) i
order by i.rank desc
Could you please try the SQL script below:
;with cte as (
select *, COUNT(*) over (partition by country) cnt
from table1
), t3 as (
select distinct top 3 country, cnt
from cte order by cnt desc
)
select distinct *
from cte
inner join t3 on cte.country = t3.country
Use the query below.
;WITH CTE
AS
(
SELECT NAME,COUNTRY,ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY COUNT(COUNTRY) DESC) ROWNO FROM TBLCOUNTRY
GROUP BY NAME,COUNTRY
)
SELECT C.NAME,C.COUNTRY,C.STATUS,COUNT(C.STATUS) TOTAL FROM TBLCOUNTRY C INNER JOIN CTE ON
C.NAME=CTE.NAME AND C.COUNTRY=CTE.COUNTRY AND CTE.ROWNO<=3
GROUP BY C.NAME,C.COUNTRY,C.STATUS
ORDER BY NAME,COUNTRY DESC, STATUS DESC

Complex SQL query, not sure where to start

I have a tough one here I think. I have the following tables:
[Assets]
AssetId | Name
1 | Acura NSX
2 | Dodge Ram
[Assignments]
AssignmentId | AssetId | StartMileage | EndMileage | StartDate | EndDate
1 | 1 | 8000 | 10000 | 4/1/2015 | 5/1/2015
2 | 1 | 10000 | 16000 | 9/15/2015 | 1/5/2016
3 | 2 | 51000 | NULL | 1/1/2016 | NULL
[Reminders]
ReminderId | AssetId | Name | Distance | Time | Active
1 | 1 | Oil Change | 3000 (miles)| 3 (months)| 1
2 | 1 | Tire Rotation | 5000 | 6 | 0
3 | 2 | Oil Change | 3000 | 3 | 1
4 | 2 | Air Filter | 50000 | 48 | 1
[Maintenance]
MaintenanceId | AssetId | ReminderId | Mileage | Date | Vendor
1 | 1 | 1 | 10000 | 5/1/2015 | Jiffy Lube
2 | 2 | 3 | 51000 | 6/1/2015 | Dealership
I need a query that will join these 4 tables and return something like the following.
Name | Name | Current Mileage | Last Mileage | Last Date
Acura NSX | Oil Change | 16000 | 10000 | 5/1/2015
Dodge RAM | Oil Change | 51000 | 51000 | 6/1/2015
Dodge RAM | Air Filter | 51000 | -- | --
I need to take the distance threshold from the Reminders table and add it to the mileage from the Maintenance table then compare it to the start and end mileage from the Assignments table. If the threshold is greater than the start or end mileage then select the asset name, the name of the reminder, the current mileage (start or end mileage from Assignments, whichever is greater), and mileage and date from the last maintenance for that reminder. I need to do the same for time threshold. Add it to the date from the Maintenance table then compare it to today's date. If it's greater then display the asset.
Can one of you SQL gurus help me with this please?
UPDATE:
SELECT
v.Name,
r.Name AS Reminder,
a.CurrentMileage,
i.MaintenanceMileage,
i.MaintenanceDate
FROM
Assets v
LEFT JOIN
(SELECT AssetId,
COALESCE(EndMileage, StartMileage) AS CurrentMileage,
ROW_NUMBER() OVER (PARTITION BY AssetId
ORDER BY AssignmentId DESC) AS window_id
FROM Assignments) a
ON v.AssetId = a.AssetId
AND a.window_id = 1
JOIN
Reminders r
ON v.AssetId = r.AssetId
AND r.ActiveFlag = 1
LEFT JOIN
(SELECT AssetId,
ReminderId,
MAX(Mileage) AS MaintenanceMileage,
MAX([Date]) AS MaintenanceDate
FROM Maintenances
GROUP BY AssetId, ReminderId) i
ON r.ReminderId = i.ReminderId
AND (a.CurrentMileage > (NULLIF(i.MaintenanceMileage, 0) + r.DistanceThreshold))
OR (GETDATE() > DATEADD(m, r.[TimeThreshold], i.MaintenanceDate))
Here is a starting point:
SELECT v.Name AS [Asset Name], r.Name AS Reminder, a.CurrentMileage,
m.Mileage + r.Distance AS [Last Mileage], m.[Date] AS [Last Date]
FROM Assets v
JOIN ( -- get the latest relevant row as window_id = 1
SELECT AssetId, COALESCE(EndMileage, StartMileage) AS CurrentMileage,
COALESCE(EndDate, StartDate) AS AssignDate,
ROW_NUMBER() OVER (partition by AssetId
order by COALESCE(EndDate, StartDate) DESC) AS window_id
FROM Assignments
) a
ON v.AssetId = a.AssetId
AND a.window_id = 1
JOIN Reminders r
ON v.AssetId = r.AssetId
AND r.Active = 1
LEFT JOIN Maintenance m
ON r.AssetId = m.AssetId
AND r.ReminderId = m.ReminderId
-- corrected
AND ((a.CurrentMileage > (NULLIF(m.Mileage, 0) + r.Distance))
-- slightly oversimplified
OR (GETDATE() > DATEADD(month, r.[Time], COALESCE(m.[Date], a.AssignDate))))
The date calculations are slightly oversimplified because they use the latest assignment dates. What you would really want is a column Assets.InServiceDate that would anchor the time before the first maintenance would be due. But this will get you started.
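The "latest relevant row as window_id = 1" step can be checked in isolation. The sketch below runs just that derived table over the Assignments sample rows using Python's sqlite3, which is an assumption made so the example is self-contained; the dates are rewritten in ISO format so they sort correctly as text, and the helper name current_mileage is hypothetical.

```python
import sqlite3

# (AssignmentId, AssetId, StartMileage, EndMileage, StartDate, EndDate) from the question.
# An open assignment has NULL (None) for EndMileage and EndDate.
ROWS = [
    (1, 1, 8000, 10000, "2015-04-01", "2015-05-01"),
    (2, 1, 10000, 16000, "2015-09-15", "2016-01-05"),
    (3, 2, 51000, None, "2016-01-01", None),
]

# For each asset, order assignments by their end date (or start date if still open),
# newest first, and keep only the first row. COALESCE falls back to the start values
# when the assignment has not ended yet.
QUERY = """
SELECT AssetId, CurrentMileage
FROM (
    SELECT AssetId,
           COALESCE(EndMileage, StartMileage) AS CurrentMileage,
           ROW_NUMBER() OVER (PARTITION BY AssetId
                              ORDER BY COALESCE(EndDate, StartDate) DESC) AS window_id
    FROM Assignments
) AS a
WHERE window_id = 1
ORDER BY AssetId
"""


def current_mileage(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Assignments (AssignmentId INT, AssetId INT, "
        "StartMileage INT, EndMileage INT, StartDate TEXT, EndDate TEXT)"
    )
    conn.executemany("INSERT INTO Assignments VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


mileage = current_mileage(ROWS)
print(mileage)
```

The two values match the Current Mileage column in the desired output (16000 for the Acura NSX and 51000 for the Dodge Ram).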

Resources