Using ROW_NUMBER instead of MAX()

My question is related to this one. I have the following query to get the mother tongue and the fluent languages for an employee:
SELECT
lan.AdminFileId
,MAX(lan.MotherTongue) AS MotherTongue
,MAX(lan.Fluent) AS Fluent
FROM (SELECT
al.AdminFileId
,MAX(CASE
WHEN al.LanguageLevelId = 4 THEN l.Label
END) AS MotherTongue
,MAX(CASE
WHEN al.LanguageLevelId = 2 THEN l.Label
END) AS Fluent
FROM AF_Language al
LEFT JOIN AF_AdminFile aaf ON aaf.AdminFileId=al.AdminFileId
INNER JOIN Employee e ON e.AdminFileId=aaf.AdminFileId
LEFT JOIN Language l ON al.LanguageId = l.ID
GROUP BY al.AdminFileId
,l.Label
,al.LanguageLevelId) AS lan
GROUP BY lan.AdminFileId
The output looks like this:
AdminFileId  MotherTongue  Fluent
45           English       French
67           Spanish       English
88           Arabic        English
How can I use ROW_NUMBER to get the same result?

I don't think you need ROW_NUMBER() here. If you put only AdminFileId in the GROUP BY clause, you get a single row per AdminFileId:
SELECT al.AdminFileId,
MAX(CASE WHEN al.LanguageLevelId = 4 THEN l.Label END) AS MotherTongue,
MAX(CASE WHEN al.LanguageLevelId = 2 THEN l.Label END) AS Fluent
FROM AF_Language al LEFT JOIN
AF_AdminFile aaf
ON aaf.AdminFileId = al.AdminFileId LEFT JOIN -- Used LEFT JOIN INSTEAD OF INNER
Employee e
ON e.AdminFileId = aaf.AdminFileId LEFT JOIN
Language l
ON al.LanguageId = l.ID
GROUP BY al.AdminFileId;
EDIT: Using ROW_NUMBER():
SELECT al.AdminFileId, l.Label,
ROW_NUMBER() OVER (PARTITION BY al.AdminFileId
ORDER BY (CASE WHEN al.LanguageLevelId = 4
THEN 1 ELSE 2
END)
) AS Seq
FROM AF_Language al LEFT JOIN
AF_AdminFile aaf
ON aaf.AdminFileId = al.AdminFileId LEFT JOIN -- Used LEFT JOIN INSTEAD OF INNER
Employee e
ON e.AdminFileId = aaf.AdminFileId LEFT JOIN
Language l
ON al.LanguageId = l.ID
WHERE al.LanguageLevelId IN (4, 2);
Then you can use that query as a subquery:
SELECT AdminFileId,
MAX(CASE WHEN Seq = 1 THEN Label END) AS MotherTongue,
MAX(CASE WHEN Seq = 2 THEN Label END) AS Fluent
FROM ( <Query>
) t
GROUP BY AdminFileId;
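
To check that the two approaches agree, here is a minimal sketch run against SQLite through Python's sqlite3 module (used only so the example is self-contained; window functions need SQLite 3.25 or later). The tables are cut down to the columns the queries use, and the sample rows are invented. It assumes each employee has exactly one level-4 and one level-2 language; without a mother tongue row, Seq = 1 would land on the fluent language instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Language (ID INTEGER PRIMARY KEY, Label TEXT);
    CREATE TABLE AF_Language (AdminFileId INTEGER, LanguageId INTEGER, LanguageLevelId INTEGER);
    INSERT INTO Language VALUES (1, 'English'), (2, 'French'), (3, 'Spanish');
    -- Invented rows: level 4 = mother tongue, level 2 = fluent
    INSERT INTO AF_Language VALUES (45, 1, 4), (45, 2, 2), (67, 3, 4), (67, 1, 2);
""")

# Approach 1: conditional aggregation, grouped by AdminFileId only
max_case = conn.execute("""
    SELECT al.AdminFileId,
           MAX(CASE WHEN al.LanguageLevelId = 4 THEN l.Label END) AS MotherTongue,
           MAX(CASE WHEN al.LanguageLevelId = 2 THEN l.Label END) AS Fluent
    FROM AF_Language al
    LEFT JOIN Language l ON al.LanguageId = l.ID
    GROUP BY al.AdminFileId
    ORDER BY al.AdminFileId
""").fetchall()

# Approach 2: number the rows per employee, then pick Seq 1 and Seq 2
row_number = conn.execute("""
    SELECT AdminFileId,
           MAX(CASE WHEN Seq = 1 THEN Label END) AS MotherTongue,
           MAX(CASE WHEN Seq = 2 THEN Label END) AS Fluent
    FROM (
        SELECT al.AdminFileId, l.Label,
               ROW_NUMBER() OVER (
                   PARTITION BY al.AdminFileId
                   ORDER BY CASE WHEN al.LanguageLevelId = 4 THEN 1 ELSE 2 END
               ) AS Seq
        FROM AF_Language al
        LEFT JOIN Language l ON al.LanguageId = l.ID
        WHERE al.LanguageLevelId IN (4, 2)
    ) t
    GROUP BY AdminFileId
    ORDER BY AdminFileId
""").fetchall()

print(max_case)
print(row_number)
```

On this sample both queries return the same rows, which suggests the ROW_NUMBER() version adds a step without changing the result.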

How to divide two queries and then group by?

I need to divide the results of two queries while keeping the GROUP BY categories. My query returns only the values, and as a Cartesian product.
Select m2.regionname, m2.indicatorname, CAST( m2.a2Value as float) /
m1.a1Value
from(
select r.name as regionname , ina.name as indicatorname, sum(a.value) as
a1Value
from Region as "r"
left join city_region as "cr" on r.region_id = cr.region_id
left join Office as "o" on cr.city_id = o.city_id
left join Assets as "a" on o.office_id = a.office_id
left join Indicators as "i" on a.indicator_id = i.indicator_id
left join IndicatorNames as "ina" on i.indicator_name_id =
ina.indicator__name_id
where a.month between '01-01-2019' and '31-01-2019'
group by r.name, ina.name
) m1 join (
select r.name as regionname , ina.name as indicatorname, sum(a.value) as
a2Value
from Region as "r"
left join city_region as "cr" on r.region_id = cr.region_id
left join Office as "o" on cr.city_id = o.city_id
left join Assets as "a" on o.office_id = a.office_id
left join Indicators as "i" on a.indicator_id = i.indicator_id
left join IndicatorNames as "ina" on i.indicator_name_id =
ina.indicator__name_id
where a.month between '01-02-2019' and '27-02-2019'
group by r.name, ina.name) m2 on m1.regionname = m2.regionname
I need to get 4 rows and 3 columns: region_name, indicator_name and a float value.
But I only get a table of values:
0,0482248520710059
0,0565972222222222
0,0665680473372781
0,078125
0,705627705627706
0,974025974025974
1,01875
1,03550295857988
1,18343195266272
1,21527777777778
1,38888888888889
1,40625
15,1515151515152
17,3160173160173
21,875
25
but that is wrong.

This condition in the ON clause:
on m1.regionname = m2.regionname
will join many unrelated rows, because every indicator in a region is matched with every other indicator in the same region.
You must add another condition, like:
on m1.regionname = m2.regionname and m1.indicatorname = m2.indicatorname
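
A small sketch of why the second condition matters, using SQLite through Python's sqlite3 module purely to make it runnable. Here m1 and m2 are plain tables standing in for the two grouped subqueries; the region names, indicator names and values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE m1 (regionname TEXT, indicatorname TEXT, a1Value REAL);
    CREATE TABLE m2 (regionname TEXT, indicatorname TEXT, a2Value REAL);
    INSERT INTO m1 VALUES ('North', 'Sales', 10), ('North', 'Costs', 4),
                          ('South', 'Sales', 20), ('South', 'Costs', 5);
    INSERT INTO m2 VALUES ('North', 'Sales', 5), ('North', 'Costs', 2),
                          ('South', 'Sales', 30), ('South', 'Costs', 10);
""")

# Joining on the region alone pairs every indicator with every other one
region_only = conn.execute("""
    SELECT COUNT(*) FROM m1 JOIN m2 ON m1.regionname = m2.regionname
""").fetchone()[0]

# Joining on both grouping columns gives one row per (region, indicator)
both_keys = conn.execute("""
    SELECT m2.regionname, m2.indicatorname, m2.a2Value / m1.a1Value AS ratio
    FROM m1
    JOIN m2 ON m1.regionname = m2.regionname
           AND m1.indicatorname = m2.indicatorname
    ORDER BY m2.regionname, m2.indicatorname
""").fetchall()

print(region_only)  # 8 rows from 4 + 4 inputs
print(both_keys)    # 4 rows, one per category pair
```

The general rule this illustrates: when two grouped subqueries are joined, the join condition should cover every column they were grouped by.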

Try something like this, which computes both sums in a single pass and guards against division by zero:
select *, case when a1Value=0 then null else cast(a2Value as float) / a1Value end Ratio
from (
select r.name as regionname , ina.name as indicatorname,
sum(case when a.month between '01-01-2019' and '31-01-2019' then a.value else 0 end) as a1Value,
sum(case when a.month between '01-02-2019' and '27-02-2019' then a.value else 0 end) as a2Value
from Region r
left join city_region cr on r.region_id = cr.region_id
left join Office o on cr.city_id = o.city_id
left join Assets a on o.office_id = a.office_id and a.month between '01-01-2019' and '27-02-2019'
left join Indicators i on a.indicator_id = i.indicator_id
left join IndicatorNames ina on i.indicator_name_id = ina.indicator__name_id
group by r.name, ina.name
) tmp

Count item corresponding listed Item

SELECT a.UPC,COUNT(*)
FROM StoreTransactions a WITH (NOLOCK)
JOIN StoreTransactions_Expanded_UOM c
ON a.StoreTransactionID = c.StoreTransactionID
LEFT JOIN ProductCatalog cat
ON a.ProductID = cat.ProductID
LEFT JOIN ProductCatalogBase base
ON cat.ProductCatalogID = base.ProductCatalogID
JOIN ProductIdentifiers d
ON cat.ProductID = d.ProductID AND d.ProductIdentifierTypeID = 2
GROUP BY a.UPC
, d.IdentifierValue, cat.PackDesc, a.ReportedCost,
base.ManualHigh, base.ManualLow,cat.DateTimeCreated,cat.DateTimeLastUpdate
ORDER BY count(*) desc
I want the count for each UPC, like below, but I am not getting the correct result.
UPC Count
071990316006 1463
026565245455 4530

You only need to group by a.UPC:
SELECT a.UPC,COUNT(*)
FROM StoreTransactions a WITH (NOLOCK)
JOIN StoreTransactions_Expanded_UOM c
ON a.StoreTransactionID = c.StoreTransactionID
LEFT JOIN ProductCatalog cat
ON a.ProductID = cat.ProductID
LEFT JOIN ProductCatalogBase base
ON cat.ProductCatalogID = base.ProductCatalogID
JOIN ProductIdentifiers d
ON cat.ProductID = d.ProductID AND d.ProductIdentifierTypeID = 2
GROUP BY a.UPC
ORDER BY count(*) desc
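
To illustrate the effect of the extra GROUP BY columns, here is a minimal sketch using SQLite through Python's sqlite3 module. It keeps a single table with invented rows and uses ReportedCost as the one extra grouping column; the joins from the original query are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StoreTransactions (StoreTransactionID INTEGER, UPC TEXT, ReportedCost REAL);
    INSERT INTO StoreTransactions VALUES
        (1, '071990316006', 1.50), (2, '071990316006', 1.50),
        (3, '071990316006', 1.75), (4, '026565245455', 2.00);
""")

# Extra grouping column: one row per distinct (UPC, ReportedCost) pair
split_counts = conn.execute("""
    SELECT UPC, COUNT(*) FROM StoreTransactions
    GROUP BY UPC, ReportedCost
    ORDER BY UPC, ReportedCost
""").fetchall()

# Grouping by UPC alone: one row and one total per UPC
per_upc = conn.execute("""
    SELECT UPC, COUNT(*) FROM StoreTransactions
    GROUP BY UPC
    ORDER BY COUNT(*) DESC
""").fetchall()

print(split_counts)
print(per_upc)
```

If the joins in the real query multiply rows, the per-UPC totals would still be inflated; counting distinct StoreTransactionID values might then be closer to what is wanted, but that depends on the data.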

Sql Error with Table pivot

I am trying to implement a pivoted table in sql but it is not working. What I currently have is the following:
WITH Pivoted
AS
(
select vg.ParentProductCategoryName, c.CompanyName, sd.LineTotal
FROM SalesLT.Product p join SalesLT.vGetAllCategories vg on p.ProductCategoryID = vg.ProductCategoryID
Join SalesLT.SalesOrderDetail sd on p.ProductID = sd.ProductID
JOIN SalesLT.SalesOrderHeader as soh ON sd.SalesOrderID = soh.SalesOrderID
JOIN SalesLT.Customer AS c ON soh.CustomerID = c.CustomerID
pivot(Sum([LineTotal]) for [ParentProductCategoryName] in (Accessories, Bikes, Clothing, Components)) AS sales
)
select * from Pivoted p;
;
I get the error:
The multi-part identifier "Column name" could not be bound.
If I remove the column names in the select part and use * instead, I get:
The column 'ProductCategoryID' was specified multiple times for...
What I want is a view of the total revenue (the sum of LineTotal in the SalesOrderDetail table) for each ParentProductCategoryName (in vGetAllCategories), pivoted as columns, with one row per CompanyName (in Customer). What is a better way to achieve this? Thanks.

I'm not sure why you'd need a CTE for this, but the fix is to put your joins in a derived table and pivot that derived table instead.
SELECT *
FROM (SELECT vg.ParentProductCategoryName,
c.CompanyName,
sd.LineTotal
FROM SalesLT.Product p
JOIN SalesLT.vGetAllCategories vg ON p.ProductCategoryID = vg.ProductCategoryID
JOIN SalesLT.SalesOrderDetail sd ON p.ProductID = sd.ProductID
JOIN SalesLT.SalesOrderHeader AS soh ON sd.SalesOrderID = soh.SalesOrderID
JOIN SalesLT.Customer AS c ON soh.CustomerID = c.CustomerID
) t
PIVOT( SUM([LineTotal])
FOR [ParentProductCategoryName] IN (Accessories,Bikes,Clothing,Components) ) AS sales
Or you could just use the SUM aggregate with CASE:
SELECT c.CompanyName,
Accessories = SUM(CASE WHEN vg.ParentProductCategoryName = 'Accessories' THEN sd.LineTotal END),
Bikes = SUM(CASE WHEN vg.ParentProductCategoryName = 'Bikes' THEN sd.LineTotal END),
Clothing = SUM(CASE WHEN vg.ParentProductCategoryName = 'Clothing' THEN sd.LineTotal END),
Components = SUM(CASE WHEN vg.ParentProductCategoryName = 'Components' THEN sd.LineTotal END)
FROM SalesLT.Product p
JOIN SalesLT.vGetAllCategories vg ON p.ProductCategoryID = vg.ProductCategoryID
JOIN SalesLT.SalesOrderDetail sd ON p.ProductID = sd.ProductID
JOIN SalesLT.SalesOrderHeader AS soh ON sd.SalesOrderID = soh.SalesOrderID
JOIN SalesLT.Customer AS c ON soh.CustomerID = c.CustomerID
GROUP BY c.CompanyName
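
A runnable sketch of the SUM with CASE form, using SQLite through Python's sqlite3 module (SQLite has no PIVOT operator, so only this variant is shown). The sales table is a flattened stand-in for the five-table join, the company names and amounts are invented, and the T-SQL `Alias = expression` syntax is written as `expression AS Alias`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flattened stand-in for the joined SalesLT tables
    CREATE TABLE sales (CompanyName TEXT, ParentProductCategoryName TEXT, LineTotal REAL);
    INSERT INTO sales VALUES
        ('Acme Cycles', 'Bikes', 1000.0), ('Acme Cycles', 'Bikes', 500.0),
        ('Acme Cycles', 'Clothing', 40.0), ('Bolt Sports', 'Accessories', 25.0);
""")

# One CASE per target column; rows of other categories contribute NULL
pivoted = conn.execute("""
    SELECT CompanyName,
           SUM(CASE WHEN ParentProductCategoryName = 'Accessories' THEN LineTotal END) AS Accessories,
           SUM(CASE WHEN ParentProductCategoryName = 'Bikes'       THEN LineTotal END) AS Bikes,
           SUM(CASE WHEN ParentProductCategoryName = 'Clothing'    THEN LineTotal END) AS Clothing,
           SUM(CASE WHEN ParentProductCategoryName = 'Components'  THEN LineTotal END) AS Components
    FROM sales
    GROUP BY CompanyName
    ORDER BY CompanyName
""").fetchall()

for row in pivoted:
    print(row)
```

A company with no sales in a category gets NULL (None in Python) in that column rather than 0; wrapping each SUM in COALESCE(..., 0) would change that if zeros are preferred.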

Top N percent Desc and Top M percent Asc

I am trying to get the top 5 customer types and show data for each of them, with the balance (which can be any amount) shown as "Other Customer Types". My issue is that the number of rows varies and is not evenly divisible, so values from the top 5 can be repeated in the "Other" group, which overstates the total sales.
The data is also being rendered in SSRS.
My code using TOP PERCENT:
select final.[description], sum(final.YTDSales$) as YTDSales$
FROM(
select top 25 percent pytd2.[Description], sum(pytd2.YTDSales$) as YTDSales$
FROM(
-- ytd sales
select re.SIC_Desc as [description], sum((ol.NetAmt - ol.WhlOrdDiscAmt) / @exrt) AS YTDSales$
from dbo.order_line_invoice ol
INNER JOIN dbo.Vendor vd ON ol.Cono = vd.Cono AND vd.VendId = ol.VendId
inner join Product_Warehouse pw on ol.ProdId = pw.prodid and ol.WhseId = pw.whseid and ol.cono = pw.cono
inner join Customer c on ol.custId = c.CustId and ol.Cono = c.Cono
left join MDData.dbo.RetailEnvironment re on c.SIC = re.SIC
where ol.InvoiceDate BETWEEN @FStartDate AND @EndDate AND ol.Cono = 1 and ol.VendId IN(@Vendid) and ol.prodcatid NOT LIKE 'GP%'
group by re.SIC_Desc
)PYTD2
group by pytd2.[description]
order by sum(pytd2.YTDSales$) DESC
UNION ALL
select top 75 percent 'Other' as 'description', sum(pytd.YTDSales$) as YTDSales$
FROM(
-- ytd sales
select re.SIC_Desc as [description], sum((ol.NetAmt - ol.WhlOrdDiscAmt) / @exrt) AS YTDSales$
from dbo.order_line_invoice ol
INNER JOIN dbo.Vendor vd ON ol.Cono = vd.Cono AND vd.VendId = ol.VendId
inner join Product_Warehouse pw on ol.ProdId = pw.prodid and ol.WhseId = pw.whseid and ol.cono = pw.cono
inner join Customer c on ol.custId = c.CustId and ol.Cono = c.Cono
left join MDData.dbo.RetailEnvironment re on c.SIC = re.SIC
where ol.InvoiceDate BETWEEN @FStartDate AND @EndDate AND ol.Cono = 1 and ol.VendId IN(@Vendid) and ol.prodcatid NOT LIKE 'GP%'
group by re.SIC_Desc
)PYTD
group by pytd.[description]
order by sum(pytd.YTDSales$)
)final
group by final.[Description]
order by sum(final.YTDSales$) DESC
My results:
As you can see, Large Independent and Other have the same figure of $2280.60 in YTDQty because the value is being repeated.

I was picturing something like this:
with data as (
-- your base query here grouped and summarized by customer type
), rankedData as (
select *, row_number() over (order by YTDSales$ desc) as CustTypeRank
from data
)
select
min(case when CustTypeRank <= 5 then "description" else 'Others' end) as "description",
sum(YTDSales$) as YTDSales$
from rankedData
group by case when CustTypeRank <= 5 then CustTypeRank else 999 end
order by case when CustTypeRank <= 5 then CustTypeRank else 999 end
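
Here is a runnable sketch of the rank-then-bucket idea, using SQLite through Python's sqlite3 module. The data table stands in for the already summarized base query, the customer types and amounts are invented, the column is named YTDSales rather than YTDSales$, and the cutoff is 2 instead of 5 to keep the sample short. The label is computed inside MIN(...) so that the select list contains only aggregates, which strict GROUP BY rules require.

```python
import sqlite3

TOP_N = 2  # the question uses 5; 2 keeps the sample data small

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the base query, one row per customer type
    CREATE TABLE data (description TEXT, YTDSales REAL);
    INSERT INTO data VALUES
        ('Large Independent', 500.0), ('Chain', 300.0),
        ('Online', 120.0), ('Small Independent', 80.0);
""")

# Rank every customer type once, then fold ranks beyond TOP_N into one bucket
rows = conn.execute("""
    WITH rankedData AS (
        SELECT description, YTDSales,
               ROW_NUMBER() OVER (ORDER BY YTDSales DESC) AS CustTypeRank
        FROM data
    )
    SELECT MIN(CASE WHEN CustTypeRank <= :n THEN description ELSE 'Others' END) AS description,
           SUM(YTDSales) AS YTDSales
    FROM rankedData
    GROUP BY CASE WHEN CustTypeRank <= :n THEN CustTypeRank ELSE 999 END
    ORDER BY CASE WHEN CustTypeRank <= :n THEN CustTypeRank ELSE 999 END
""", {"n": TOP_N}).fetchall()

for row in rows:
    print(row)
```

Because each row receives exactly one rank, it falls into exactly one group, so the grouped totals add up to the overall total and nothing is counted twice, unlike two independent TOP ... PERCENT queries.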

I actually used RANK instead, which worked great:
select 0 as rankytd, RANK() OVER(ORDER BY sum(ol.NetAmt - ol.WhlOrdDiscAmt) DESC) as rankpytd, re.sic, ol.VendId, vd.name, re.SIC_Desc As [description], 0 AS YTDQty, sum(ol.Quantity) AS PYTDQty
from dbo.order_line_invoice ol
INNER JOIN dbo.Vendor vd ON ol.Cono = vd.Cono AND vd.VendId = ol.VendId
inner join dbo.Product p on ol.Cono = p.Cono and ol.prodid = p.ProdId and p.ProdCatId in (@pcat)
inner join Product_Warehouse pw on ol.ProdId = pw.prodid and ol.WhseId = pw.whseid and ol.cono = pw.cono
inner join Customer c on ol.custId = c.CustId and ol.Cono = c.Cono
left join MDData.dbo.RetailEnvironment re on c.SIC = re.SIC
where ol.InvoiceDate BETWEEN DATEADD(YEAR, -1,@FStartDate) AND DATEADD(YEAR, -1, @EndDate) and ol.Cono = 1 and ol.VendId IN(@Vendid) and ol.prodcatid NOT LIKE 'GP%'
group by re.sic, ol.VendId, vd.Name, re.SIC_Desc

GROUP BY in SQL Server in complex query

I need to group this by T.TopicID so that I only receive the last result per topic.
Whatever I try, I get errors saying that the other T. columns aren't included in the GROUP BY or in an aggregate, etc.
ALTER PROCEDURE [dbo].[SPGetFollowingTopics]
@id int = null
,@UserGroupId int = null
,@lastvisit DateTime = null
AS
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY TopicOrder DESC,
(CASE
WHEN M.MessageCreationDate > T.TopicCreationDate
THEN M.MessageCreationDate
ELSE T.TopicCreationDate
END) DESC) AS RowNumber,
T.TopicId, T.TopicTitle, T.TopicShortName,
T.TopicDescription, T.TopicCreationDate, T.TopicViews,
T.TopicReplies, T.UserId, T.TopicTags, T.TopicIsClose,
T.TopicOrder, T.LastMessageId, U.UserName,
M.MessageCreationDate, T.ReadAccessGroupId,
T.PostAccessGroupId, TF.userid AS Expr1, U.UserGroupId,
U.UserPhoto, U.UserFullName, M.UserId AS MessageUserId,
MU.UserName AS MessageUserName
FROM
Topics AS T
LEFT OUTER JOIN
Messages AS M ON M.TopicId = T.TopicId AND M.Active = 1 AND M.MessageCreationDate < @lastvisit
INNER JOIN
topicfollows AS TF ON T.TopicId = TF.topicid
INNER JOIN
Users AS U ON U.UserId = T.UserId
LEFT JOIN
Users MU ON MU.UserId = M.UserId
WHERE
(TF.userid = @id)
) T

The requirement isn't entirely clear to me, but I think you are seeking the latest message, per topic, for a given user.
In this situation ROW_NUMBER() is a good option, but I believe you need to PARTITION the ROW_NUMBER() as well as order it.
SELECT
*
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY TF.userid, T.TopicId
ORDER BY
(CASE
WHEN M.MessageCreationDate > T.TopicCreationDate THEN M.MessageCreationDate
ELSE T.TopicCreationDate
END) DESC) AS ROWNUMBER
, T.TopicId, T.TopicTitle, T.TopicShortName, T.TopicDescription
, T.TopicCreationDate, T.TopicViews, T.TopicReplies, T.UserId
, T.TopicTags, T.TopicIsClose, T.TopicOrder, T.LastMessageId
, U.UserName, M.MessageCreationDate, T.ReadAccessGroupId
, T.PostAccessGroupId, TF.userid AS EXPR1
, U.UserGroupId, U.UserPhoto, U.UserFullName
, M.UserId AS MESSAGEUSERID, MU.UserName AS MESSAGEUSERNAME
FROM Topics AS T
LEFT OUTER JOIN Messages AS M ON M.TopicId = T.TopicId
AND M.Active = 1
AND M.MessageCreationDate < @lastvisit
INNER JOIN topicfollows AS TF ON T.TopicId = TF.topicid
INNER JOIN Users AS U ON U.UserId = T.UserId
LEFT JOIN Users MU ON MU.UserId = M.UserId
WHERE (TF.userid = @id)
) T
WHERE ROWNUMBER = 1

You could change your LEFT JOIN to an OUTER APPLY and add TOP 1:
SELECT ...
FROM
Topics AS T
OUTER APPLY
( SELECT TOP 1 M.MessageCreationDate, M.UserId
FROM Messages AS M
WHERE M.TopicId = T.TopicId
AND M.Active = 1
AND M.MessageCreationDate < @lastvisit
ORDER BY M.MessageCreationDate DESC
) AS m
This allows you to use TOP 1 and still get one row per TopicId.
Alternatively, you can use ROW_NUMBER() OVER(PARTITION BY m.TopicID ORDER BY M.MessageCreationDate DESC):
SELECT ...
FROM
Topics AS T
LEFT OUTER JOIN
( SELECT M.TopicId,
M.MessageCreationDate,
M.UserId,
RowNum = ROW_NUMBER() OVER(PARTITION BY m.TopicID ORDER BY M.MessageCreationDate DESC)
FROM Messages AS M
WHERE M.Active = 1
AND M.MessageCreationDate < @lastvisit
) AS m
ON M.TopicId = T.TopicId
AND m.RowNum = 1
I would test both methods and see which one works best for you.
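
For a quick check of the ROW_NUMBER() variant, here is a minimal sketch using SQLite through Python's sqlite3 module (SQLite has no OUTER APPLY, so only this form is shown). The tables are reduced to a few columns, the rows are invented, and dates are stored as ISO text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Topics (TopicId INTEGER PRIMARY KEY, TopicTitle TEXT);
    CREATE TABLE Messages (TopicId INTEGER, UserId INTEGER,
                           MessageCreationDate TEXT, Active INTEGER);
    INSERT INTO Topics VALUES (1, 'Indexes'), (2, 'Pivoting'), (3, 'No replies yet');
    INSERT INTO Messages VALUES
        (1, 10, '2019-01-05', 1), (1, 11, '2019-02-01', 1),
        (2, 12, '2019-01-20', 1), (2, 13, '2019-03-15', 0);
""")

# Number the messages within each topic, newest first, and keep only row 1
latest = conn.execute("""
    SELECT T.TopicId, T.TopicTitle, m.MessageCreationDate, m.UserId
    FROM Topics AS T
    LEFT JOIN (
        SELECT TopicId, MessageCreationDate, UserId,
               ROW_NUMBER() OVER (PARTITION BY TopicId
                                  ORDER BY MessageCreationDate DESC) AS RowNum
        FROM Messages
        WHERE Active = 1
          AND MessageCreationDate < :lastvisit
    ) AS m ON m.TopicId = T.TopicId AND m.RowNum = 1
    ORDER BY T.TopicId
""", {"lastvisit": "2019-04-01"}).fetchall()

for row in latest:
    print(row)
```

Keeping the RowNum = 1 test in the ON clause rather than in a WHERE clause is what preserves topics without any qualifying message; they come back with NULL message columns.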