Sql Group by and Ordering - sql-server

I have three tables: 1) a Category table, 2) a table of Reports belonging to different categories, and 3) an Events table which keeps track of reports being accessed. I'm writing a stored procedure to rank and display either the top ten or all of the reports, ordered by category, with the categories whose reports have received the most hits first. The trend is calculated by comparing the hits a report received in the current period to the hits it received in the previous period.
This is the simplified schema of the tables.
ReportCategory Table
ReportCategoryId CategoryName
ApplicationReport Table
ApplicationReportId ReportCategoryId ReportName
Events Table
EventId ReportId CreatedDate
If, during the period, reports belonging to Inventory get the maximum number of hits and Sales the second most, then the result should look like this:
Rank  Category   ReportName        Percentage_Change  Trend
1     Inventory  Inventory Turn    42%                Up
      Inventory  Inventory Stock   18%                Up
2     Sales      Discounted Sales  12%                Down
      Sales      Sales return      30%                Up
This is what I've come up with
SELECT TOP 10 T1.CategoryName AS Category,T1.ReportName, (ABS(100 * (T1.TotalCategoryHits - T2.TotalCategoryHits)) / T2.TotalCategoryHits)+'%' AS PercentageChange,
CASE When (T1.TotalCategoryHits - T2.TotalCategoryHits) > 0 Then 'Upward'
WHEN (T1.TotalCategoryHits - T2.TotalCategoryHits) < 0 THEN 'Downward'
ELSE 'No Change' END AS Trend
FROM
(SELECT COUNT(c.CategoryId) AS TotalCategoryHits,COUNT(r.ReportCategoryId) Total,c.CategoryName,r.ReportName FROM Event e
inner join Reports r ON r.ReportId = CAST(e.ReportId As INT)
inner join Category c ON r.CategoryId = c.CategoryId
WHERE e.CreatedDate >= #CurrentPeriodStartDate AND e.CreatedBy <= #CurrentPeriodEndDate
GROUP BY c.ReportCategoryName,r.ReportName,e.ReportId) T1
INNER JOIN
(SELECT COUNT(c.CategoryId) AS TotalCategoryHits,COUNT(r.ReportCategoryId) Total,c.CategoryName,r.ReportName FROM Event e
inner join Reports r ON r.ReportId = CAST(e.ReportId As INT)
inner join Category c ON r.CategoryId = c.CategoryId
where e.CreatedDate >= #PrevPeriodStartDate And e.CreatedDate <= #PrevPeriodEndDate
GROUP BY c.ReportCategoryName,r.ReportName,e.ReportId) T2 on T1.ReportId = T2.ReportId
ORDER BY T1.TotalCategoryHits DESC
and I'm nowhere near the desired result.
I want the rows to be ordered by the category which has gotten the maximum number of hits, but when I include ReportName in the GROUP BY to make it part of the result, I lose the ordering. I also don't know how to display the rank.
EDIT: Sql Fiddle
The column names are different from my description here because I have used the schema of the original database. For some unknown reason my query is not working, maybe because I have changed some column names and types. But the schema is there.
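One possible way to get the ordering and the rank described above is sketched here. It is not the asker's query: it assumes the simplified schema from the description, uses made-up rows and dates, runs on SQLite through Python's sqlite3 module purely so it can be executed (window functions need SQLite 3.25 or later), and leaves out the percentage and trend columns. The idea is to count hits per report first, attach the category total with SUM(...) OVER, and only then rank with DENSE_RANK.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ReportCategory (ReportCategoryId INTEGER, CategoryName TEXT);
    CREATE TABLE ApplicationReport (ApplicationReportId INTEGER,
                                    ReportCategoryId INTEGER, ReportName TEXT);
    CREATE TABLE Events (EventId INTEGER PRIMARY KEY, ReportId INTEGER,
                         CreatedDate TEXT);
    INSERT INTO ReportCategory VALUES (1, 'Inventory'), (2, 'Sales');
    INSERT INTO ApplicationReport VALUES
        (1, 1, 'Inventory Turn'), (2, 1, 'Inventory Stock'),
        (3, 2, 'Discounted Sales'), (4, 2, 'Sales return');
""")

# Made-up hits: report id -> number of events inside the period.
hits_in_period = {1: 3, 2: 2, 3: 1, 4: 3}
for report_id, n in hits_in_period.items():
    conn.executemany(
        "INSERT INTO Events (ReportId, CreatedDate) VALUES (?, '2014-01-15')",
        [(report_id,)] * n,
    )
# One event outside the period, which the date filter should ignore.
conn.execute("INSERT INTO Events (ReportId, CreatedDate) VALUES (3, '2013-12-31')")

ranked = conn.execute("""
    WITH report_hits AS (          -- hits per report in the period
        SELECT c.CategoryName, r.ReportName, COUNT(*) AS hits
        FROM Events e
        JOIN ApplicationReport r ON r.ApplicationReportId = e.ReportId
        JOIN ReportCategory c ON c.ReportCategoryId = r.ReportCategoryId
        WHERE e.CreatedDate >= :start AND e.CreatedDate < :end
        GROUP BY c.CategoryName, r.ReportName
    ),
    with_totals AS (               -- attach the category total to every row
        SELECT *, SUM(hits) OVER (PARTITION BY CategoryName) AS cat_hits
        FROM report_hits
    )
    SELECT DENSE_RANK() OVER (ORDER BY cat_hits DESC, CategoryName) AS rnk,
           CategoryName, ReportName, hits
    FROM with_totals
    ORDER BY rnk, hits DESC, ReportName
""", {"start": "2014-01-01", "end": "2014-02-01"}).fetchall()

for row in ranked:
    print(row)
```

Every report in a category shares the same cat_hits value, so all of them get the same rank; showing the rank only on the first row of each category would be a presentation step on top of this.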

Related

Find the average and total revenue by each sub-category for the categories which are among top 5 categories in terms of quantity sold? <SQL Server>

For this question, I have two tables, as follows:
prod_cat_info --- This table has the following columns:
prod_cat : It contains the products' category names
prod_cat_id : It contains the products' category ID. Note that every product category has been assigned a unique ID. For example, let's say I have the product categories Books, Sports and Electronics. These 3 product categories will be assigned the product category IDs 1, 2 and 3 respectively.
prod_subcat : It contains the products' subcategories
prod_subcat_id : It contains the products' subcategory IDs
Here is how the product subcategories are stored. For example, let's say that for product category "Books" I have 3 product subcategories: "Novels", "Schoolbooks" and "Fiction". In this case too, each product subcategory is assigned an ID like 1, 2, 3 and so on.
Transactions --- This is another table which has the following columns :
total_amt : It contains amount paid by customer when a transaction took place.
Qty : It contains quantities ordered by customer of a particular product.
prod_subcat_id : It contains products' subcategories ID
prod_cat_id : It contains the products' category ID.
Cust_ID : It contains customer ID [Irrelevant column in case of this question]
What I did was break this question into 2 parts and write 2 separate queries, given below. I am not able to figure out how to join these 2 queries in order to achieve the output.
In Query1, I have fetched all the product subcategories.
In Query2, I have fetched the top 5 product categories based on quantities sold.
Now I feel that Query2 can be used as a subquery in Query1 inside the WHERE clause.
But it may require some modifications, because as far as I know ORDER BY can't be used in a subquery, and the result of a subquery has to be a single value.
Therefore, I need some help on how I can combine or modify these queries in order to achieve the result.
**Query1**
select P.prod_subcat as Product_SubCategory,
AVG(cast(total_amt as float)) as Average_Revenue,
SUM(cast(total_amt as float)) as Total_Revenue
from Transactions as T
INNER JOIN prod_Cat_info as P
ON T.prod_cat_code = P.prod_cat_code AND T.prod_subcat_code =
P.prod_sub_cat_code
group by P.prod_subcat
**Query2**
select top 5 P.prod_cat, sum(Cast(Qty as int)) AS Quantities_sold from
prod_cat_info as P
inner join Transactions as T
ON P.prod_cat_code = T.prod_cat_code AND P.prod_sub_cat_code =
T.prod_subcat_code
group by P.prod_cat
order by sum(Cast(Qty as int)) desc
If you have a TOP operator with ORDER BY, which is exactly your case, then you can use ORDER BY in a subquery, because in this case the ORDER BY is used to determine the rows returned by the TOP clause.
And because the subquery returns multiple values, you can use the IN operator:
select P.prod_subcat as Product_SubCategory,
AVG(cast(total_amt as float)) as Average_Revenue,
SUM(cast(total_amt as float)) as Total_Revenue
from Transactions as T
INNER JOIN prod_Cat_info as P
ON T.prod_cat_code = P.prod_cat_code AND T.prod_subcat_code =
P.prod_sub_cat_code
WHERE P.prod_cat_code IN (
select top 5 P.prod_cat_code
from prod_cat_info as P
inner join Transactions as T
ON P.prod_cat_code = T.prod_cat_code AND P.prod_sub_cat_code =
T.prod_subcat_code
group by P.prod_cat_code
order by sum(Cast(Qty as int)) desc
)
group by P.prod_subcat
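The same "top N categories as an IN subquery" pattern can be tried on a toy data set. This is only a sketch under assumptions: the rows are invented, it takes the top 2 categories instead of the top 5, and it runs on SQLite through Python's sqlite3 module, where LIMIT plays the role that TOP plays in SQL Server. It also groups by category and subcategory together, so that two categories sharing a subcategory name would not be merged.

```python
import sqlite3

# In-memory SQLite database with invented rows (assumption, not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod_cat_info (prod_cat_code INTEGER, prod_cat TEXT,
                                prod_sub_cat_code INTEGER, prod_subcat TEXT);
    CREATE TABLE Transactions (prod_cat_code INTEGER, prod_subcat_code INTEGER,
                               Qty INTEGER, total_amt REAL);
    INSERT INTO prod_cat_info VALUES
        (1, 'Books', 1, 'Fiction'), (1, 'Books', 2, 'Comics'),
        (2, 'Sports', 1, 'Cricket'), (3, 'Electronics', 1, 'Audio');
    -- Quantities sold: Books 10, Sports 7, Electronics 2.
    INSERT INTO Transactions VALUES
        (1, 1, 5, 100.0), (1, 1, 1, 20.0), (1, 2, 4, 40.0),
        (2, 1, 7, 70.0), (3, 1, 2, 300.0);
""")

revenue = conn.execute("""
    SELECT P.prod_cat, P.prod_subcat,
           AVG(T.total_amt) AS Average_Revenue,
           SUM(T.total_amt) AS Total_Revenue
    FROM Transactions AS T
    JOIN prod_cat_info AS P
      ON T.prod_cat_code = P.prod_cat_code
     AND T.prod_subcat_code = P.prod_sub_cat_code
    WHERE T.prod_cat_code IN (
        -- ORDER BY is allowed here because it decides which rows LIMIT keeps
        SELECT prod_cat_code
        FROM Transactions
        GROUP BY prod_cat_code
        ORDER BY SUM(Qty) DESC
        LIMIT 2
    )
    GROUP BY P.prod_cat, P.prod_subcat
    ORDER BY P.prod_cat, P.prod_subcat
""").fetchall()

for row in revenue:
    print(row)
```

Electronics has the largest single transaction amount but the smallest quantity sold, so it is filtered out by the subquery.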
Select prod_cat, prod_subcat , avg(total_amt) as average_amount , sum(total_amt) as total_amount
From transactions as t
inner join prod_cat_info as p
on t.prod_subcat_code=p.prod_sub_cat_code and t.prod_cat_code = p.prod_cat_code
Where prod_cat in
(Select Top 5 prod_cat
From transactions as t
inner join prod_cat_info as p
on t.prod_subcat_code=p.prod_sub_cat_code and t.prod_cat_code = p.prod_cat_code
Where total_amt > 0 and qty > 0
Group by prod_cat
Order by count(qty) desc)
Group by prod_cat, prod_subcat
Order by prod_cat asc;

Getting a derived attribute through two different tables

I need to get the salary of an employee through two different tables: his gains and his discounts. The relation from the table employee to the two tables is a many to many relation. So I need to take the employee_id, get all the gain_id in the employee_gains table, add all of them and subtract with the analogue result in the discounts.
I tried this creating a view for the salary:
CREATE VIEW salary as
select ((select sum(value) from gains
where gain_id in (select gain_id from gain_employee where employee_id=2))
-
(select sum(value) from discount
where discount_id in (select gain_id from discount_employee where employee_id=2)));
However, this only (and successfully) gives me the salary for the employee with ID 2. But how can I make this generic? I want a view salary for all the employees.
I would suggest using two CTEs to calculate the gains and the discounts, and then doing a FULL OUTER JOIN on the two sets. This ensures that you get proper values, such as 0 for missing gains or discounts, for every employee_id. If you want to ignore such cases, just change it to a plain INNER JOIN.
CREATE OR REPLACE VIEW V_salary AS --give proper name to indicate it's a view
WITH ge
AS (SELECT e.employee_id,
SUM(g.value) AS gain_value
FROM gain_employee e
JOIN gains g --use left join if some employees don't
--have an entry in gains
ON e.gain_id = g.gain_id
GROUP BY e.employee_id),
de
AS (SELECT e.employee_id,
SUM(d.value) AS dis_value
FROM discount_employee e
JOIN discounts d --use left join if some employees don't
--have an entry in discount
ON e.discount_id = d.discount_id
GROUP BY e.employee_id)
SELECT COALESCE(ge.employee_id, de.employee_id) AS employee_id, --gets you at least one of
--them when one may be missing.
COALESCE(ge.gain_value, 0) - COALESCE(de.dis_value, 0) AS salary
FROM ge
FULL OUTER JOIN de -- to consider case where one of them is absent
ON ge.employee_id = de.employee_id;
This should do it -
CREATE VIEW salary as
select S1.employee_id, (S1.gains - S2.discounts) as salary
from (select ge.employee_id, sum(g.value) as gains
from gain_employee ge, gains g
where ge.gain_id = g.gain_id
group by ge.employee_id) S1,
(select de.employee_id, sum(d.value) as discounts
from discount_employee de, discounts d
where de.discount_id = d.discount_id
group by de.employee_id) S2
where S1.employee_id = S2.employee_id;
Then you can query this view with employee_id as the condition.
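The practical difference between joining the two aggregates with an inner join and keeping employees who lack a gain or a discount can be seen on a tiny data set. The sketch below is an illustration under assumptions: the rows are invented, it runs on SQLite through Python's sqlite3 module, and because older SQLite versions have no FULL OUTER JOIN it swaps in a different technique for the "keep everyone" case, namely a UNION ALL of signed amounts that is summed per employee. The view names salary_inner and salary_all are hypothetical.

```python
import sqlite3

# In-memory SQLite database with invented rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gains (gain_id INTEGER, value INTEGER);
    CREATE TABLE discounts (discount_id INTEGER, value INTEGER);
    CREATE TABLE gain_employee (employee_id INTEGER, gain_id INTEGER);
    CREATE TABLE discount_employee (employee_id INTEGER, discount_id INTEGER);
    INSERT INTO gains VALUES (1, 1000), (2, 200);
    INSERT INTO discounts VALUES (1, 150);
    -- Employee 1 has two gains and one discount; employee 2 has no discount.
    INSERT INTO gain_employee VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO discount_employee VALUES (1, 1);

    -- Inner join of the two aggregates: needs a row on both sides.
    CREATE VIEW salary_inner AS
    SELECT s1.employee_id, s1.gains - s2.discounts AS salary
    FROM (SELECT ge.employee_id, SUM(g.value) AS gains
          FROM gain_employee ge JOIN gains g ON ge.gain_id = g.gain_id
          GROUP BY ge.employee_id) s1
    JOIN (SELECT de.employee_id, SUM(d.value) AS discounts
          FROM discount_employee de
          JOIN discounts d ON de.discount_id = d.discount_id
          GROUP BY de.employee_id) s2
      ON s1.employee_id = s2.employee_id;

    -- Signed amounts stacked with UNION ALL, then summed per employee.
    CREATE VIEW salary_all AS
    SELECT employee_id, SUM(amount) AS salary
    FROM (SELECT ge.employee_id, g.value AS amount
          FROM gain_employee ge JOIN gains g ON ge.gain_id = g.gain_id
          UNION ALL
          SELECT de.employee_id, -d.value
          FROM discount_employee de
          JOIN discounts d ON de.discount_id = d.discount_id)
    GROUP BY employee_id;
""")

inner_rows = conn.execute(
    "SELECT * FROM salary_inner ORDER BY employee_id").fetchall()
all_rows = conn.execute(
    "SELECT * FROM salary_all ORDER BY employee_id").fetchall()

print(inner_rows)  # employee 2 is missing
print(all_rows)    # employee 2 is kept, with nothing subtracted
```

With the inner join, employee 2 drops out of the view entirely because there is no discount row to match, which is the case the FULL OUTER JOIN with COALESCE is meant to cover.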

SQL query select geography point

I'm new to SQL querying and I need to get the last position of the players that are active, meaning their Active value is equal to 1. I have to make a join between the table Events, where I have the players' activity, with columns:
ID: unique row id
Timestamp: time when the player changed his status from active to inactive or to active again
PlayerId: id of the player that this event is for
Active: 1-Active, 0-Inactive
with the table Positions where I have all players position at every 2-3 seconds with columns:
PlayerId
Timestamp: time when the position was received
Location: a geography point with the received position
in order to get the current latitude and longitude of the players that are active.
My current query:
select
e.PlayerId, e.Active,
p.Location.Lat, p.Location.Long,
max_date = max(e.Timestamp),
max_date2 = max(p.Timestamp)
from
Events e
inner join
Positions p on e.PlayerId = p.PlayerId
where
e.Active= 1
group by
e.PlayerId, e.Active, p.Location.Lat, p.Location.Long
but instead of returning 2 rows I get many more. I guess it's because of the p.Location.Lat and p.Location.Long fields in the GROUP BY clause, because the simpler query:
select
e.PlayerId, e.Active,
max_date = max(e.Timestamp),
max_date2 = max (p.Timestamp)
from
Events e
inner join
Positions p on e.PlayerId = p.PlayerId
where
e.Active = 1
group by
e.PlayerId, e.Active
returns the correct 2 rows but I need to also get the Lat-Long columns.
UPDATE
I found an issue in my query. When I ran it again for different values, I saw that it returns the position of every player who was active even once, including players who became inactive afterwards. But if the last value of Active (according to the maximum timestamp) for a player is 0, then the query should leave that player's location out of the result.
Is there any way that I can add those columns without getting more rows than needed?
You could wrap your current query in an outer query, then join to your positions table again. Something like this;
SELECT
base.PlayerId
,base.Active
,base.max_date
,base.max_date2
,p.Location.lat
,p.Location.long
FROM
(
SELECT a.PlayerId ,
a.Active,
max_date = max( a.Timestamp ),
max_date2 = max (b.Timestamp)
FROM Events a
INNER JOIN Positions b
ON a.PlayerId =b.PlayerId
WHERE a.Active= 1
GROUP BY a.PlayerId , a.Active
) base
JOIN Positions p
ON base.PlayerId = p.PlayerId
AND base.max_date2 = p.Timestamp
The reason your original query wasn't working is that grouping by lat & long gives you a row for each distinct point. The inner query here gives you the unique list that you're after, and the outer query then joins to Positions again to get the lat/long info for the latest position.
Edit: As per the comments, if you want to exclude anybody whose latest Active value is zero, then add this to the end of the code:
JOIN
(
SELECT
e.PlayerID
,MAX(e.Timestamp) Timestamp
FROM Events e
GROUP BY e.PlayerID
) latest
ON base.PlayerID = latest.PlayerID
JOIN Events e2
ON latest.PlayerID = e2.PlayerID
AND latest.Timestamp = e2.Timestamp
WHERE e2.Active <> 0
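The two steps above (find the latest row per player, then join back for the remaining columns) can be sketched on toy data. This is an illustration under assumptions rather than the answer's exact query: it runs on SQLite through Python's sqlite3 module, SQLite has no geography type so plain Lat and Long columns stand in for Location.Lat and Location.Long, and the rows are invented.

```python
import sqlite3

# In-memory SQLite database with invented rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Events (ID INTEGER PRIMARY KEY, Timestamp TEXT,
                         PlayerId INTEGER, Active INTEGER);
    CREATE TABLE Positions (PlayerId INTEGER, Timestamp TEXT,
                            Lat REAL, Long REAL);
    -- Player 1 was active, then became inactive; player 2 is still active.
    INSERT INTO Events (Timestamp, PlayerId, Active) VALUES
        ('2014-01-01 10:00:00', 1, 1),
        ('2014-01-01 10:20:00', 1, 0),
        ('2014-01-01 10:05:00', 2, 1);
    INSERT INTO Positions VALUES
        (1, '2014-01-01 10:21:00', 1.0, 1.0),
        (2, '2014-01-01 10:06:00', 2.0, 2.0),
        (2, '2014-01-01 10:09:00', 2.5, 2.5);
""")

active_positions = conn.execute("""
    SELECT p.PlayerId, p.Lat, p.Long
    FROM (SELECT PlayerId, MAX(Timestamp) AS ts      -- latest event per player
          FROM Events GROUP BY PlayerId) le
    JOIN Events e
      ON e.PlayerId = le.PlayerId AND e.Timestamp = le.ts
     AND e.Active = 1                                -- keep only if still active
    JOIN (SELECT PlayerId, MAX(Timestamp) AS ts      -- latest position per player
          FROM Positions GROUP BY PlayerId) lp
      ON lp.PlayerId = e.PlayerId
    JOIN Positions p
      ON p.PlayerId = lp.PlayerId AND p.Timestamp = lp.ts
    ORDER BY p.PlayerId
""").fetchall()

print(active_positions)
```

Player 1 has an Active = 1 event in the past, but the latest event is inactive, so only player 2's most recent position is returned.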

Counting duplicate items in different order

Goal:
To know if we have purchased duplicate StockCodes or Stock Descriptions more than once on different purchase orders.
So, if we purchase Part ABC on Purchase Order 1 and Purchase Order 2, it should return the result of
PurchaseOrders, Part#, Qty
Purchase Order1, Purchase Order2, ABC, 2
I just don't know how to pull the whole query together. More to the point, how do I know if it has occurred on more than one purchase order without scrolling through all the results? It may also have to do with multiple HAVING COUNT conditions, as I only seem to be checking StockCode.
SELECT t1.PurchaseOrder,
t1.MStockCode,
Count(t1.MStockCode) AS SCCount,
t1.MStockDes,
Count(t1.MStockDes) AS DescCount
FROM PorMasterDetail t1
INNER JOIN PorMasterHdr t2
ON t1.PurchaseOrder = t2.PurchaseOrder
WHERE Year(t2.OrderEntryDate) = Year(Getdate())
AND Month(t2.OrderEntryDate) = Month(Getdate())
GROUP BY t1.PurchaseOrder,
t1.MStockCode,
t1.MStockDes
HAVING Count(t1.MStockCode) > 1
Using responses I came up with the following
select * from
(
SELECT COUNT(dbo.InvMaster.StockCode) AS Count, dbo.InvMaster.StockCode AS StockCodes,
dbo.PorMasterDetail.PurchaseOrder, dbo.PorMasterHdr.OrderEntryDate
FROM dbo.InvMaster INNER JOIN dbo.PorMasterDetail ON
dbo.InvMaster.StockCode = dbo.PorMasterDetail.MStockCode
INNER JOIN dbo.PorMasterHdr ON dbo.PorMasterDetail.PurchaseOrder = dbo.PorMasterHdr.PurchaseOrder
WHERE YEAR(dbo.PorMasterHdr.OrderEntryDate) = YEAR(GETDATE())
GROUP BY dbo.InvMaster.StockCode, dbo.InvMaster.StockCode,
dbo.PorMasterDetail.PurchaseOrder, dbo.PorMasterHdr.OrderEntryDate
) Count
Where Count.Count > 1
This returns the results below, which is starting to be a bit more helpful.
In result lines 2, 3 and 4 we can see the same stock code (*30044) ordered 3 times on different purchase orders.
I guess the question is: is it possible to check whether something was ordered more than once within, say, a 30-day period?
Is this possible?
Count  StockCodes        PurchaseOrder  OrderEntryDate
2      *12.0301.0021     322959         2014-09-08
2      *30044            320559         2014-01-21
8      *30044            321216         2014-03-26
4      *30044            321648         2014-05-08
5      *32317            321216         2014-03-26
4      *4F-130049/TEST   323353         2014-10-22
5      *650-1157/E       322112         2014-06-24
2      *650-1757         321226         2014-03-27
SELECT *
FROM
(
SELECT h.OrderEntryDate, d.*,
COUNT(*) OVER (PARTITION BY d.MStockCode) DupeCount
FROM
PorMasterHdr h
INNER JOIN PorMasterDetail d ON
d.PurchaseOrder = h.PurchaseOrder
WHERE
-- first day of current month
-- http://blog.sqlauthority.com/2007/05/13/sql-server-query-to-find-first-and-last-day-of-current-month/
h.OrderEntryDate >= CONVERT(VARCHAR(25), DATEADD(dd,-(DAY(GETDATE())-1),GETDATE()),101)
) dupes
WHERE
dupes.DupeCount > 1;
This should work if you're only deduping on stock code. I was a little unclear if you wanted to dedupe on both stock code and stock desc, or either stock code or stock desc.
Also I was unclear on your return columns because it almost looks like you're wanting to pivot the columns so that both purchase order numbers appear on the same line.
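One thing worth checking with a windowed COUNT(*) is that it counts detail rows, not purchase orders, so a stock code that appears twice on a single order is also flagged. The sketch below contrasts that with counting distinct purchase orders per stock code. It is an illustration under assumptions: invented rows, only the two columns needed, and SQLite (3.25 or later for window functions) through Python's sqlite3 module rather than SQL Server.

```python
import sqlite3

# In-memory SQLite database with invented rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PorMasterDetail (PurchaseOrder TEXT, MStockCode TEXT);
    INSERT INTO PorMasterDetail VALUES
        ('PO1', 'ABC'), ('PO2', 'ABC'),   -- ABC on two different orders
        ('PO1', 'XYZ'), ('PO1', 'XYZ'),   -- XYZ twice, but on one order
        ('PO3', 'DEF');
""")

# Window count over detail rows: flags any code with more than one row.
by_rows = [r[0] for r in conn.execute("""
    SELECT DISTINCT MStockCode
    FROM (SELECT MStockCode,
                 COUNT(*) OVER (PARTITION BY MStockCode) AS DupeCount
          FROM PorMasterDetail)
    WHERE DupeCount > 1
    ORDER BY MStockCode
""")]

# Distinct purchase orders per code: flags only codes on several orders.
by_orders = conn.execute("""
    SELECT MStockCode, COUNT(DISTINCT PurchaseOrder) AS OrderCount
    FROM PorMasterDetail
    GROUP BY MStockCode
    HAVING COUNT(DISTINCT PurchaseOrder) > 1
    ORDER BY MStockCode
""").fetchall()

print(by_rows)
print(by_orders)
```

Which of the two is wanted depends on whether repeated lines on the same purchase order should count as a duplicate purchase.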

Is this the only way to filter the right table in a left outer join?

I have customer balances stored in their own table. The customer balances table gets a new set of records every day (reflecting the balance that day) but also contains balances for other days (yyyy-mm-dd). I wanted to get all UK customers from accountinformation and their balances for yesterday from balances. I wanted to include rows from accountinformation even where there is no corresponding record (for yesterday) in balances...
select firstname,lastname,accountnumber,balance from accountinformation i
left outer join balances b
on i.accountnumber = b.account
where country = 'UK' and status = 'OPEN'
and (b.date = '2014-04-10' or b.date is null)
... it did not satisfy the requirement to show rows from accountinformation when there is no corresponding row in balances. I had to write the query like this...
select firstname,lastname,accountnumber,balance from accountinformation i
left outer join (select * from balances where date = '2014-04-10') b
on i.accountnumber = b.account
where country = 'UK' and status = 'OPEN'
.. to get the desired behaviour. In the interests of correctness, I want to know if there is a more correct way to filter the right table in a left outer join?
You might be able to move the date filter into the join condition:
select firstname,lastname,accountnumber,balance from accountinformation i
left outer join balances b
on i.accountnumber = b.account and b.date = '2014-04-10'
where country = 'UK' and status = 'OPEN'
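Why the WHERE version loses rows can be seen with three accounts: one with a balance on the target day, one with a balance only on an earlier day, and one with no balances at all. The sketch below is an illustration with invented rows, run on SQLite through Python's sqlite3 module rather than the asker's database; only the columns needed for the comparison are created.

```python
import sqlite3

# In-memory SQLite database with invented rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accountinformation (accountnumber INTEGER, lastname TEXT,
                                     country TEXT, status TEXT);
    CREATE TABLE balances (account INTEGER, date TEXT, balance REAL);
    INSERT INTO accountinformation VALUES
        (1, 'Adams', 'UK', 'OPEN'),
        (2, 'Baker', 'UK', 'OPEN'),
        (3, 'Clark', 'UK', 'OPEN');
    -- Account 1 has a balance for the target day, account 2 only for an
    -- earlier day, account 3 has no balances at all.
    INSERT INTO balances VALUES
        (1, '2014-04-10', 100.0),
        (2, '2014-04-09', 50.0);
""")

# Date filter in WHERE: account 2 joins to its 2014-04-09 row, which is
# neither the target date nor NULL, so the whole account is filtered out.
where_filter = conn.execute("""
    SELECT i.accountnumber, b.balance
    FROM accountinformation i
    LEFT OUTER JOIN balances b ON i.accountnumber = b.account
    WHERE i.country = 'UK' AND i.status = 'OPEN'
      AND (b.date = '2014-04-10' OR b.date IS NULL)
    ORDER BY i.accountnumber
""").fetchall()

# Date filter in ON: rows for other days never match, so account 2 is
# kept with a NULL balance, just like account 3.
on_filter = conn.execute("""
    SELECT i.accountnumber, b.balance
    FROM accountinformation i
    LEFT OUTER JOIN balances b
      ON i.accountnumber = b.account AND b.date = '2014-04-10'
    WHERE i.country = 'UK' AND i.status = 'OPEN'
    ORDER BY i.accountnumber
""").fetchall()

print(where_filter)
print(on_filter)
```

The conditions on accountinformation stay in the WHERE clause in both queries; only the condition on the right-hand table moves into ON.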
