Sql Server - Get SUM() of values for only Active Users - sql-server

I have a requirement where i need to get Total Active Employees and Total Sales by RegionId
My query result should be like below.
RegionId | TotalEmployees | TotalSales | Average
1 10 100 10
2 3 15 5
My front end application will pass all the RegionIds as a single string separated by a comma, my query parameter is of type VARCHAR() and the Input paramter will look like '1,2,3,4,7,14,26' and there can be upto 20 Region Ids in a single string separated by a comma.
SELECT E.[RegionId] as RegionId
,COUNT(E.[EmployeeId) AS TotalEmployees
,(SELECT SUM([Sale])
FROM dbo.[Sales]
WHERE RegionId = R.[RegionId]
) AS TotalSales
,TotalSales/TotalEmployees AS Average
FROM dbo.[Employee]
JOIN [dbo].[ufn_StringSplit](#RegionIdCollection, ',') RegionId
ON E.RegionId = CAST(RegionId.[Data] AS Varchar(5000))
WHERE E.[Active] = 1
GROUP BY E.[RegionId]
My Employee table structures look alike below
EmployeeId | Name | RegionId | Active
100 Tom 2 1
101 Jim 4 0
103 Ben 2 1
Sales Table
SaleId | EmployeeId| RegionId | Sale
1 100 2 3500
2 101 4 2000
3 100 2 1500
Now my issue is when i am getting TotalSales the below query gets all the sales by RegionId, but i need to get All the sales done by only current Active employees in the Employee table
(SELECT SUM([Sale])
FROM dbo.[Sales]
WHERE RegionId = R.[RegionId]
) AS TotalSales

There is no reason to use a sub-select to find the sum of sales here, that will result in running that query for each and every row. You want to aproach this in a set based way which means you need to join and group appropriately:
with s as
(
select e.RegionId
,e.EmployeeId
,sum(s.Sale) as EmployeeSales
from dbo.ufn_StringSplit(#RegionIdCollection, ',') as r
join dbo.Employee as e
on r.RegionId = CAST(r.[Data] AS varchar(20)) -- Do you really need 5000 characters here?
left join dbo.Sales as s
on r.RegionId = s.RegionId
and e.EmployeeId = s.EmployeeId
where e.Active = 1
group by e.RegionId
,e.EmployeeId
)
select s.RegionId
,count(s.EmployeeId) as TotalEmployees
,sum(s.EmployeeSales) as TotalSales
,sum(s.EmployeeSales)/count(s.EmployeeId) as Average
from s
group by s.RegionId

Related

Joining multiple tables and applying aggegate giving wrong result

I have a business case where when a CountryId is passed to my proc, I need to get all the Regions where the Business is set up in that country, All the Active Sales Employees working in that Region, Total sales done by the current active sales employees in that region.
My Region table look like below.
RegionId | Name | CountryId
100 A 1
101 B 4
103 C 1
SalesEmployee Table
Id | EmployeeId | RegionId
1 250 100
2 255 101
3 289 101
Employee Table
EmployeeId | Active
250 1
255 1
289 0
314 1
Sales table
SaleId | EmployeeId| RegionId | Sale
1 100 2 3500
2 101 4 2000
3 100 2 1500
My below query is giving me the correct TotalSales value but the TotalUsers count doesn't match.
Select R.[RegionId], COUNT(SE.[UserId]) AS TotalUsers, SUM(S.[Sales]) AS TotalSales
FROM dbo.[Region] R
INNER JOIN [SalesEmployee] SE
ON R.[RegionId] = SE.[RegionId]
INNER JOIN dbo.[Employee] E
ON E.[EmployeeId] = SE.[EmployeeId]
LEFT JOIN dbo.[Sales] S
ON S.[EmployeeId] = E.[EmployeeId]
WHERE R.[CountryId] = 12 AND E.[Active] = 1
GROUP BY R.[RegionId]
For Ex RegionId 100 has only 7 Active sales employees currently working but the result gives me 89, in my Employee table there can be many more users but few of them can be inactive and few of them may be working in another department, to make sure that the employee is sales employee the employee needs to be present in SalesEmployee table and to check if the Employee is Active I need to check in Employee table.
The problem is 1 single user can have multiple entries on his name in sales table, so when i am joining with Sales table which has multiple entries on a single user then the TotalEmployees count is going up.
So its actually a easy small fix.
Select R.[RegionId], COUNT(DISTINCT(SE.[UserId])) AS TotalUsers, SUM(S.[Sales]) AS TotalSales
FROM dbo.[Region] R
INNER JOIN [SalesEmployee] SE
ON R.[RegionId] = SE.[RegionId]
INNER JOIN dbo.[Employee] E
ON E.[EmployeeId] = SE.[EmployeeId]
LEFT JOIN dbo.[Sales] S
ON S.[EmployeeId] = E.[EmployeeId]
WHERE R.[CountryId] = 12 AND E.[Active] = 1
GROUP BY R.[RegionId]
This small change will give you what you want.
Changing COUNT(SE.[UserId]) to COUNT(DISTINCT(SE.[UserId])) is all you need.

Joining two tables and need to have MAX aggregate function in ON clause

This is my code! I want to give a part id and purchase order id to my report and it brings all the related information with those specification. The important thing is that, if we have same purchase order id and part id we need the code to return the result with the highest transaction id. The following code is not providing what I expected. Could you please help me?
SELECT MAX(INVENTORY_TRANS.TRANSACTION_ID), INVENTORY_TRANS.PART_ID
, INVENTORY_TRANS.PURC_ORDER_ID, TRACE_INV_TRANS.QTY, TRACE_INV_TRANS.CREATE_DATE, TRACE_INV_TRANS.TRACE_ID
FROM INVENTORY_TRANS
JOIN TRACE_INV_TRANS ON INVENTORY_TRANS.TRANSACTION_ID = TRACE_INV_TRANS.TRANSACTION_ID
WHERE INVENTORY_TRANS.PART_ID = #PartID
AND INVENTORY_TRANS.PURC_ORDER_ID = #PurchaseOrderID
GROUP BY TRACE_INV_TRANS.QTY, TRACE_INV_TRANS.CREATE_DATE, TRACE_INV_TRANS.TRACE_ID, INVENTORY_TRANS.PART_ID
, INVENTORY_TRANS.PURC_ORDER_ID
The sample of trace_inventory_trans table is :
part_id trace_id transaction id qty create_date
x 1 10
x 2 11
x 3 12
the sample of inventory_trans table is :
transaction_id part_id purc_order_id
11 x p20
12 x p20
I wanted to have the result of biggest transaction which is transaction 12 but it shows me transaction 11
I would use a sub-query to find the MAX value, then join that result to the other table.
The ORDER BY + TOP (1) returns the MAX value for transaction_id.
SELECT
inv.transaction_id
,inv.part_id
,inv.purc_order_id
,tr.qty
,tr.create_date
,tr.trace_id
FROM
(
SELECT TOP (1)
transaction_id,
part_id,
purc_order_id
FROM
INVENTORY_TRANS
WHERE
part_id = #PartID
AND
purc_order_id = #PurchaseOrderID
ORDER BY
transaction_id DESC
) AS inv
JOIN
TRACE_INV_TRANS AS tr
ON inv.transaction_id = tr.transaction_id;
Results:
+----------------+---------+---------------+------+-------------+----------+
| transaction_id | part_id | purc_order_id | qty | create_date | trace_id |
+----------------+---------+---------------+------+-------------+----------+
| 12 | x | p20 | NULL | NULL | 3 |
+----------------+---------+---------------+------+-------------+----------+
Rextester Demo

Optimization of SQL Server Query

My problem is that I created a query that takes too long to execute.
City | Department | Employee | Attendance Date | Attendance Status
------------------------------------------------------------------------
C1 | Dept 1 | Emp 1 | 2016-01-01 | ABSENT
C1 | Dept 1 | Emp 2 | 2016-01-01 | LATE
C1 | Dept 2 | Emp 3 | 2016-01-01 | VACANCY
So I want to create a view that contains same data and adds a column that contains the total number of employees (that serves me later in a SSRS project to determine the percentage of each status).
So I created a function that makes simple select filtering by department and date.
and this is the query that uses the function:
SELECT City, Department, Employee, [Attendence Date], [Attendance Status], [Get Department Employees By Date](Department, [Attendence Date]) AS TOTAL
FROM attendenceTable
This is the function:
CREATE FUNCTION [dbo].[Get Department Employees By Date]
(
#deptID int = null,
#date datetime = null
)
RETURNS nvarchar(max)
AS
BEGIN
declare #result int = 0;
select #result = count(*) from attendenceTable where DEPT_ID = #deptID and ATT_DATE_G = #date;
RETURN #result;
END
The problem is that query takes too long (I mean very long time) to execute.
Any Suggestion of optimization?
Your function is a scalar function, which is run once for every row in the result set (~600,000) times, and is a known performance killer. It can be rewritten into an inline table-valued function, or if the logic is not required elsewhere, a simple group, count & join would suffice:
WITH EmployeesPerDeptPerDate
AS ( SELECT DEPT_ID ,
ATT_DATE_G ,
COUNT(DISTINCT Employee) AS EmployeeCount
FROM attendenceTable
GROUP BY DEPT_ID ,
ATT_DATE_G
)
SELECT A.City ,
A.Department ,
A.Employee ,
A.[Attendence Date] ,
A.[Attendance Status] ,
ISNULL(B.EmployeeCount, 0) AS EmployeeCount
FROM attendenceTable AS A
LEFT OUTER JOIN EmployeesPerDeptPerDate AS B ON A.DEPT_ID = B.DEPT_ID
AND A.ATT_DATE_G = B.ATT_DATE_G;

Get value with MAX(date) from two table

I have two tables.
MainTable:
MainID | LastValue | LastReadingDate
1 | 234 | 01.01.2012
2 | 534 | 03.02.2012
Readings:
MainID | ValueRead | ReadingDate
1 | 123 | 03.02.2012
1 | 488 | 04.03.2012
2 | 324 | 03.02.2012
2 | 683 | 05.04.2012
I want to get
SELECT MainTable.MainID, MainTable.LastValue, MainTable.LastReadingDate, (SELECT ValueRead, MAX(ReadingDate)
FROM Readings
WHERE Readings.MainID=MainTable.MainID ORDER BY ValueRead)
In other words, I want to get the current LastValue and LastReadingDate from MainTable along side the ValueRead with the most recent ReadingDate from Readings.
Here is a query you could use. It'll show all MainTable entries, including those that doesn't have a "Reading" entry yet. Change the LEFT JOIN to an INNER JOIN if you don't want it like that.
WITH LastReads AS (
SELECT ROW_NUMBER() OVER (PARTITION BY MainID ORDER BY ReadingDate DESC) AS ReadingNumber,
MainID,
ValueRead,
ReadingDate
FROM Readings
)
SELECT M.MainID, M.LastValue, M.LastReadingDate, R.ValueRead, R.ReadingDate
FROM MainTable M
LEFT OUTER JOIN LastReads R
ON M.MainID = R.MainID
AND R.ReadingNumber = 1 -- Last reading, use 2 or 3 to get the 2nd newest, 3rd newest, etc.
SQLFiddle-link: http://sqlfiddle.com/#!3/16c68/3
Another link with N number of readings per mainid: http://sqlfiddle.com/#!3/16c68/4
Not tried this myself, but here goes. Please try
select max(r.readingdate), max(t.lastvalue), max(t.lastreadingdate)
from readings r inner join
( select MainID, LastValue, LastReadingDate
from MainTable m
where LastReadingDate =
(select max(minner.LastReadingDate)
from MainTable minner
where minner.MainID = m.MainID
)
) t
on (r.mainid = t.mainid)
try this:
select M.LastValue, M.LastReadingDate,
(select top 1 ValueRead from Readings where MainID=M.MainID order by ReadingDate desc)
from MainTable M

SQL Server problems when using ODER BY and DISTINCT together

I got two problems, the first problem is my two COUNTS that I start with. GroupID is a string that keep products together (Name_Year together), same product but different size.
If I have three reviews in tblReview and they all have the same GroupID I want to return 3. My problem is that if I have three Products with different ProductID but same GroupID and I add three Review to that GroupID I got 9 returns (3*3). If I only have one Product With the same GroupID and three Reviews it works (1*3=3 returns)
The Second problem is that if I have the ORDER BY CASE Price I have to add GROUP BY Price as well and then I don't get the DISTINCT effect that I want. And that is to just show products that have unique GroupID.
Here's the query, hope somebody can help me with this.
ALTER PROCEDURE GetFilterdProducts
#CategoryID INT, #ColumnName varchar(100)
AS
SELECT COUNT(tblReview.GroupID) AS ReviewCount,
COUNT(tblComment.GroupID) AS CommentCount,
Product.ProductID,
Product.Name,
Product.Year,
Product.Price,
Product.BrandID,
Product.GroupID,
AVG(tblReview.Grade) AS Grade
FROM Product LEFT JOIN
tblComment ON Product.GroupID = tblComment.GroupID LEFT JOIN
tblReview ON Product.GroupID = tblReview.GroupID
WHERE (Product.CategoryID = #CategoryID)
GROUP BY Product.ProductID, Product.BrandID, Product.GroupID, Product.Name, Product.Year, Product.Price
HAVING COUNT(distinct Product.GroupID) = 1
ORDER BY
CASE
WHEN #ColumnName='Name' THEN Name
WHEN #ColumnName='Year' THEN Year
WHEN #ColumnName='Price' THEN Price
END
My tabels:
Product:
ProductID, Name, Year, Price, BrandID, GroupID
tblReview:
ReviewID, Description, Grade, ProductID, GroupID
tblComment:
CommentID, Description, ProductID, GroupID
I think that my problem is that if I have three GroupID with the same name, ex Nike_2010 in Product and I have three Reviews in tblReview that counts the first row in Products that contain Nike_2010 counts how many reviews in tblReview with the same GroupID, Nike_2010 and then the second row in Product that contains Nike_2010 and then do the same count again and again, that results to 9 rows. How do I avoid that?
For starters, because you're joining on multiple tables, you're going to end up with the cross product of all of them as a result. Your counts will then return the total count of rows containing data in that column. Consider the following example:
- PRODUCTS - -- COMMENTS -- --- REVIEWS ---
Key | Name Key | Comment Key | Review
1 | A 1 | Foo 1 | Great
2 | B 1 | Bar 1 | Wonderful
The query
SELECT PRODUCTS.Key, PRODUCTS.Name, COMMENTS.Comment, REVIEWS.Review
FROM PRODUCTS
LEFT OUTER JOIN COMMENTS ON PRODUCTS.KEY = COMMENTS.KEY
LEFT OUTER JOIN REVIEWS ON PRODUCTS.KEY = REVIEWS.KEY
will result in the following data:
Key | Name | Comment | Review
1 | A | Foo | Great
1 | A | Foo | Wonderful
1 | A | Bar | Great
1 | A | Bar | Wonderful
2 | B | NULL | NULL
Thus, counting in this format
SELECT PRODUCTS.Key, PRODUCTS.Name, COUNT(COMMENTS.Comment), COUNT(REVIEWS.Review)
FROM PRODUCTS
LEFT OUTER JOIN COMMENTS ON PRODUCTS.KEY = COMMENTS.KEY
LEFT OUTER JOIN REVIEWS ON PRODUCTS.KEY = REVIEWS.KEY
GROUP BY PRODUCTS.Key, PRODUCTS.Name
will give you
Key | Name | Count1 | Count2
1 | A | 4 | 4
2 | B | 0 | 0
because it's counting each row in the table produced by the join!
Instead, you want to count each table separately in a subquery before joining it back like the following:
SELECT PRODUCTS.Key, PRODUCTS.Name, ISNULL(CommentCount.NumComments, 0),
ISNULL(ReviewCount.NumReviews, 0)
FROM PRODUCTS
LEFT OUTER JOIN (SELECT Key, COUNT(*) as NumComments
FROM COMMENTS
GROUP BY Key) CommentCount on PRODUCTS.Key = CommentCount.Key
LEFT OUTER JOIN (SELECT Key, COUNT(*) as NumReviews
FROM REVIEWS
GROUP BY Key) ReviewCount on PRODUCTS.Key = ReviewCount.Key
which will produce the following
Key | Name | NumComments | NumReviews
1 | A | 2 | 2
2 | B | 0 | 0
As for the "DISTINCT effect" you refer to, I'm not exactly sure I follow. Could you elaborate a bit?
About second problem - cannot you group by same CASE statement? You shouldn't have Price field in results list then though.

Resources