Joining multiple tables and applying aggregate giving wrong result - sql-server

I have a business case: when a CountryId is passed to my proc, I need to get all the Regions where the Business is set up in that country, all the active sales employees working in each Region, and the total sales made by the currently active sales employees in that Region.
My Region table looks like below.
RegionId | Name | CountryId
100 | A | 1
101 | B | 4
103 | C | 1
SalesEmployee Table
Id | EmployeeId | RegionId
1 | 250 | 100
2 | 255 | 101
3 | 289 | 101
Employee Table
EmployeeId | Active
250 | 1
255 | 1
289 | 0
314 | 1
Sales table
SaleId | EmployeeId | RegionId | Sale
1 | 100 | 2 | 3500
2 | 101 | 4 | 2000
3 | 100 | 2 | 1500
The query below gives me the correct TotalSales value, but the TotalUsers count doesn't match.
Select R.[RegionId], COUNT(SE.[UserId]) AS TotalUsers, SUM(S.[Sales]) AS TotalSales
FROM dbo.[Region] R
INNER JOIN [SalesEmployee] SE
ON R.[RegionId] = SE.[RegionId]
INNER JOIN dbo.[Employee] E
ON E.[EmployeeId] = SE.[EmployeeId]
LEFT JOIN dbo.[Sales] S
ON S.[EmployeeId] = E.[EmployeeId]
WHERE R.[CountryId] = 12 AND E.[Active] = 1
GROUP BY R.[RegionId]
For example, RegionId 100 currently has only 7 active sales employees, but the result gives me 89. My Employee table can contain many more users, but some of them are inactive and some work in other departments. An employee counts as a sales employee only if present in the SalesEmployee table, and whether the employee is active has to be checked in the Employee table.
The problem is that a single user can have multiple entries in the Sales table, so joining with Sales repeats that user's row once per sale and the TotalUsers count goes up.

So it's actually an easy, small fix.
Select R.[RegionId], COUNT(DISTINCT(SE.[UserId])) AS TotalUsers, SUM(S.[Sales]) AS TotalSales
FROM dbo.[Region] R
INNER JOIN [SalesEmployee] SE
ON R.[RegionId] = SE.[RegionId]
INNER JOIN dbo.[Employee] E
ON E.[EmployeeId] = SE.[EmployeeId]
LEFT JOIN dbo.[Sales] S
ON S.[EmployeeId] = E.[EmployeeId]
WHERE R.[CountryId] = 12 AND E.[Active] = 1
GROUP BY R.[RegionId]
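This small change will give you what you want.
Changing COUNT(SE.[UserId]) to COUNT(DISTINCT(SE.[UserId])) is all you need: DISTINCT counts each employee once, no matter how many Sales rows the join repeats that employee across.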
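To see the effect of the join on both counts side by side, here is a minimal sketch. It runs the same shape of query against an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server. The rows are made up for illustration, and the count uses the EmployeeId column from the SalesEmployee table shown in the question rather than UserId.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Region (RegionId INTEGER, Name TEXT, CountryId INTEGER);
CREATE TABLE SalesEmployee (Id INTEGER, EmployeeId INTEGER, RegionId INTEGER);
CREATE TABLE Employee (EmployeeId INTEGER, Active INTEGER);
CREATE TABLE Sales (SaleId INTEGER, EmployeeId INTEGER, RegionId INTEGER, Sale INTEGER);

INSERT INTO Region VALUES (100, 'A', 1);
INSERT INTO SalesEmployee VALUES (1, 250, 100), (2, 251, 100);
INSERT INTO Employee VALUES (250, 1), (251, 1);
-- Employee 250 has two sales, employee 251 has none.
INSERT INTO Sales VALUES (1, 250, 100, 3500), (2, 250, 100, 1500);
""")

query = """
SELECT R.RegionId,
       COUNT(SE.EmployeeId)          AS RowsCounted,  -- one per joined row
       COUNT(DISTINCT SE.EmployeeId) AS TotalUsers,   -- one per employee
       SUM(S.Sale)                   AS TotalSales
FROM Region R
INNER JOIN SalesEmployee SE ON R.RegionId = SE.RegionId
INNER JOIN Employee E ON E.EmployeeId = SE.EmployeeId
LEFT JOIN Sales S ON S.EmployeeId = E.EmployeeId
WHERE R.CountryId = 1 AND E.Active = 1
GROUP BY R.RegionId
"""
result = conn.execute(query).fetchall()
print(result)  # [(100, 3, 2, 5000)]
```

The plain COUNT sees three joined rows (two for employee 250, one for employee 251), while COUNT(DISTINCT ...) reports the two employees.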

Related

SQL Pivot / Case Query based on Row Value

Problem
Using SQL Server, I'm trying to pivot data based on values in a column. I want to move Bob and John's value column over if Salary is in the metric column.
Sample data:
Person table
Person ID
-------------
Bob 1
Bob 1
John 2
John 2
Value table
Metric Value ID
---------------------
Age 52 1
Salary 60000 1
Age 45 2
Salary 55000 2
Expected output
My goal is to pivot the table if salary is present in the Metric column.
Person Metric Value Salary ID
---------------------------------------
Bob Age 52 60000 1
John Age 45 55000 2
Current code:
SELECT *
FROM person_table pt, value_table vb
WHERE pt.id = vb.id
AND vb.metric IN ('Age', 'Salary')
Use the following pivot query:
SELECT
pt.Person,
'Age' AS Metric,
MAX(CASE WHEN vb.Metric = 'Age' THEN vb.Value END) AS Value,
MAX(CASE WHEN vb.Metric = 'Salary' THEN vb.Value END) AS Salary,
pt.ID
FROM person_table pt
INNER JOIN value_table vb
ON pt.id = vb.id
GROUP BY
pt.Person,
pt.ID
ORDER BY
pt.ID;
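As a quick check of the conditional-aggregation pivot above, the following sketch loads the sample rows into an in-memory SQLite database via Python's sqlite3 module (a stand-in for SQL Server, and the column types are assumptions) and runs the same query. Because MAX is used, the duplicated Person rows in the sample do not change the result.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person_table (Person TEXT, ID INTEGER);
CREATE TABLE value_table (Metric TEXT, Value INTEGER, ID INTEGER);
INSERT INTO person_table VALUES ('Bob', 1), ('Bob', 1), ('John', 2), ('John', 2);
INSERT INTO value_table VALUES
    ('Age', 52, 1), ('Salary', 60000, 1), ('Age', 45, 2), ('Salary', 55000, 2);
""")

# Each CASE keeps only the rows of one metric; MAX collapses them to one value.
pivot_rows = conn.execute("""
SELECT pt.Person,
       'Age' AS Metric,
       MAX(CASE WHEN vb.Metric = 'Age' THEN vb.Value END) AS Value,
       MAX(CASE WHEN vb.Metric = 'Salary' THEN vb.Value END) AS Salary,
       pt.ID
FROM person_table pt
INNER JOIN value_table vb ON pt.ID = vb.ID
GROUP BY pt.Person, pt.ID
ORDER BY pt.ID
""").fetchall()
print(pivot_rows)  # [('Bob', 'Age', 52, 60000, 1), ('John', 'Age', 45, 55000, 2)]
```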

Joined tables - how to aggregate problem (sum)?

I have two tables joined like this:
SELECT count(DISTINCT T1.ContractNumber) AS nr_of_contracts,
count(T3.DateofInstallmentPayment) AS nr_of_paid_installents,
count(T3.DateofDueInstallment) AS nr_of_installments,
sum(T1.DisbursementAmount) AS disbursed_amount
FROM q.T1
LEFT JOIN q.T3
ON T1.ContractNumber=T3.ContractNumber
WHERE DateOfDisbursement BETWEEN '2019-12-01' AND '2019-12-31'
AND T3.DateofDueInstallment < GETDATE()
where the T1 table contains data about clients (one row per contract number) and T3 contains their payment schedules (one row per instalment).
What I want is the paid-off amount (disbursement amount) of the contracts from table T1 aggregated per contract number, not per instalment. When I select just sum(T1.DisbursementAmount), the amount is counted once for every instalment, which is incorrect.
T1:
Contract Number | DisbursementDate | Disbursement Amount
1 | 2019-12-01 | 1000
2 | 2019-12-01 | 2000
3 | 2019-12-01 | 3000
T3:
Contract Number | DateofDueInstallment | DateofInstallmentPayment
1 | 2020-01-01 | 2020-01-01
1 | 2020-02-01 | 2020-02-06
1 | 2020-03-01 | 2020-04-01
What I get after joining two tables for Contract Number = 1 is sum(DisbursementAmount) = 3000.
Contract Number | sum(DisbursementAmount)
1 | 3000
What I want after joining two tables for Contract Number = 1 is sum(DisbursementAmount) = 1000.
Contract Number | sum(DisbursementAmount)
1 | 1000
Something like this (not tested): a subquery with a different aggregation column.
SELECT T1.product, T1.NrOfInstallment, count(DISTINCT T1.ContractNumber),
(SELECT paid_amount FROM (SELECT ContractNumber, sum(tt.DisbursementAmount + (tt.ContractNumber*0.01))
- sum(tt.ContractNumber*0.01) as paid_amount
FROM abc.T1 as tt WHERE tt.ContractNumber = T1.ContractNumber GROUP BY ContractNumber) AS t)
FROM abc.T1
LEFT JOIN bde.T3
ON T1.ContractNumber=T3.ContractNumber
GROUP BY T1.product, T1.NrOfInstallment
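A different way to avoid the repeated amounts, shown here only as a sketch and not as the answer's method, is to aggregate T3 to one row per contract before joining, so every T1 row appears exactly once in the sum. The example uses the sample rows from the question in an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server. The derived-table alias i and the column types are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (ContractNumber INTEGER, DisbursementDate TEXT, DisbursementAmount INTEGER);
CREATE TABLE T3 (ContractNumber INTEGER, DateofDueInstallment TEXT, DateofInstallmentPayment TEXT);
INSERT INTO T1 VALUES (1, '2019-12-01', 1000), (2, '2019-12-01', 2000), (3, '2019-12-01', 3000);
INSERT INTO T3 VALUES
    (1, '2020-01-01', '2020-01-01'),
    (1, '2020-02-01', '2020-02-06'),
    (1, '2020-03-01', '2020-04-01');
""")

# Joining first repeats contract 1 once per instalment, inflating the sum.
naive_total = conn.execute("""
SELECT SUM(T1.DisbursementAmount)
FROM T1
LEFT JOIN T3 ON T1.ContractNumber = T3.ContractNumber
""").fetchone()[0]

# Collapsing T3 to one row per contract first keeps each T1 row single.
fixed = conn.execute("""
SELECT COUNT(*)                       AS nr_of_contracts,
       SUM(T1.DisbursementAmount)     AS disbursed_amount,
       SUM(i.nr_of_installments)      AS nr_of_installments,
       SUM(i.nr_of_paid_installments) AS nr_of_paid_installments
FROM T1
LEFT JOIN (
    SELECT ContractNumber,
           COUNT(DateofDueInstallment)     AS nr_of_installments,
           COUNT(DateofInstallmentPayment) AS nr_of_paid_installments
    FROM T3
    GROUP BY ContractNumber
) AS i ON T1.ContractNumber = i.ContractNumber
""").fetchone()

print(naive_total)  # 8000
print(fixed)        # (3, 6000, 3, 3)
```

With the sample rows, the naive join counts contract 1 three times (8000 in total), while the pre-aggregated join returns the 6000 actually disbursed.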

Sql Server - Get SUM() of values for only Active Users

I have a requirement where I need to get the total active employees and total sales by RegionId.
My query result should be like below.
RegionId | TotalEmployees | TotalSales | Average
1 10 100 10
2 3 15 5
My front-end application passes all the RegionIds as a single comma-separated string. My query parameter is of type VARCHAR(), the input parameter looks like '1,2,3,4,7,14,26', and there can be up to 20 RegionIds in a single string.
SELECT E.[RegionId] as RegionId
,COUNT(E.[EmployeeId]) AS TotalEmployees
,(SELECT SUM([Sale])
FROM dbo.[Sales]
WHERE RegionId = R.[RegionId]
) AS TotalSales
,TotalSales/TotalEmployees AS Average
FROM dbo.[Employee]
JOIN [dbo].[ufn_StringSplit](@RegionIdCollection, ',') RegionId
ON E.RegionId = CAST(RegionId.[Data] AS Varchar(5000))
WHERE E.[Active] = 1
GROUP BY E.[RegionId]
My Employee table structure looks like below
EmployeeId | Name | RegionId | Active
100 Tom 2 1
101 Jim 4 0
103 Ben 2 1
Sales Table
SaleId | EmployeeId| RegionId | Sale
1 100 2 3500
2 101 4 2000
3 100 2 1500
Now my issue is that, when I get TotalSales, the subquery below returns all the sales by RegionId, but I need only the sales made by the currently active employees in the Employee table
(SELECT SUM([Sale])
FROM dbo.[Sales]
WHERE RegionId = R.[RegionId]
) AS TotalSales
There is no reason to use a sub-select to find the sum of sales here; that will result in running that query for each and every row. You want to approach this in a set-based way, which means you need to join and group appropriately:
with s as
(
select e.RegionId
,e.EmployeeId
,sum(s.Sale) as EmployeeSales
from dbo.ufn_StringSplit(@RegionIdCollection, ',') as r
join dbo.Employee as e
on e.RegionId = CAST(r.[Data] AS varchar(20)) -- Do you really need 5000 characters here?
left join dbo.Sales as s
on e.RegionId = s.RegionId
and e.EmployeeId = s.EmployeeId
where e.Active = 1
group by e.RegionId
,e.EmployeeId
)
select s.RegionId
,count(s.EmployeeId) as TotalEmployees
,sum(s.EmployeeSales) as TotalSales
,sum(s.EmployeeSales)/count(s.EmployeeId) as Average
from s
group by s.RegionId
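The two-step aggregation above can be tried out with the sample rows from the question. This is a minimal sketch that uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server. The ufn_StringSplit function is replaced by splitting the string in Python and binding the ids to an IN list, and the CTE name per_employee is an assumption made for readability.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; sample rows come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmployeeId INTEGER, Name TEXT, RegionId INTEGER, Active INTEGER);
CREATE TABLE Sales (SaleId INTEGER, EmployeeId INTEGER, RegionId INTEGER, Sale INTEGER);
INSERT INTO Employee VALUES (100, 'Tom', 2, 1), (101, 'Jim', 4, 0), (103, 'Ben', 2, 1);
INSERT INTO Sales VALUES (1, 100, 2, 3500), (2, 101, 4, 2000), (3, 100, 2, 1500);
""")

# Stand-in for ufn_StringSplit: split in Python, bind one parameter per id.
region_ids = [int(part) for part in "2,4".split(",")]
placeholders = ",".join("?" * len(region_ids))

query = f"""
WITH per_employee AS (
    -- Step 1: one row per active employee with that employee's sales total.
    SELECT e.RegionId, e.EmployeeId, SUM(s.Sale) AS EmployeeSales
    FROM Employee AS e
    LEFT JOIN Sales AS s
           ON s.EmployeeId = e.EmployeeId AND s.RegionId = e.RegionId
    WHERE e.Active = 1 AND e.RegionId IN ({placeholders})
    GROUP BY e.RegionId, e.EmployeeId
)
-- Step 2: roll the per-employee rows up to one row per region.
SELECT RegionId,
       COUNT(EmployeeId)                      AS TotalEmployees,
       SUM(EmployeeSales)                     AS TotalSales,
       SUM(EmployeeSales) / COUNT(EmployeeId) AS Average
FROM per_employee
GROUP BY RegionId
"""
region_rows = conn.execute(query, region_ids).fetchall()
print(region_rows)  # [(2, 2, 5000, 2500)]
```

Region 4 is absent from the result because its only employee is inactive, and Ben is counted as an employee of region 2 even though he has no sales.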

Find two rows of a column belongs to same row of another column

I have a table where I need to find the list of subjects that have students from the same department, without using a subquery or JOIN.
I tried HAVING with a count of departments, but it does not produce the expected output.
SELECT A.Subject,
B.StudentID,
B.DEPTID
FROM AUTHOR A , ACADEMIC B
WHERE A.StudentID = B.StudentID
GROUP BY B.DEPT,
A.Subject,
B.StudentID
Gives me the table output
Subject StudentID DEPT
1 100 100
1 101 100
2 102 100
3 103 100
3 104 100
I expect the output to give me the subject that has studentID from same department without using subquery or JOIN.

SQL query to get all the data from different tables with same id

Sorry if this is too elemental but I cannot work it out. Don’t know how to search information on it either:
I have three tables:
Provider
id_provider name
---------- -----------
100 John
101 Sam
102 Peter
Contact
id_contact RowNo Email
---------- ----------- ----------------
100 1 john@work.com
100 2 john@gmail.com
101 1 sam@work.com
101 2 sam@yahoo.com
Product
Id_product RowNo Product
---------- ----------- ------------------------
100 1 John’s 1st product
100 2 John’s 2nd product
101 1 Sam’s 1st product
101 2 Sam’s 2nd product
101 3 Sam’s 3rd product
I need a query to show all the data from the three tables like this:
Id name id_contact RowNo Email Id_Product RowNo Product
100 John 100 1 john@work.com 100 1 John’s 1st product
100 John 100 2 john@gmail.com 100 2 John’s 2nd product
101 Sam 101 1 sam@work.com 101 1 Sam's 1st product
101 Sam 101 2 sam@yahoo.com 101 2 Sam's 2nd product
101 Sam null null null 101 3 Sam's 3rd product
102 Peter null null null null null null
I am trying all the joins I know but I cannot make it work.
Thanks a lot
You can use the following query:
SELECT t1.id_provider AS Id, t1.name,
t2.id_contact, t2.cRowNo, t2.Email,
t2.Id_product, t2.Product
FROM Provider AS t1
LEFT JOIN (
SELECT COALESCE(id_contact, id_product) AS id,
c.id_contact, c.RowNo AS cRowNo, c.Email,
p.Id_product, p.Product, p.RowNo AS pRowNo
FROM Contact AS c
FULL JOIN Product AS p ON p.id_product = c.id_contact AND p.RowNo = c.RowNo
) AS t2 ON t1.id_provider = t2.id
The query does a FULL JOIN between Contact and Product tables and joins the table derived from the FULL JOIN to Provider table.
A FULL JOIN is required because we cannot know beforehand which of the two tables, Contact or Product, contains the most rows for each id.
select *
from Provider P1
left join Contact C2
on C2.id_contact = P1.id_provider
left join Product P2
on P2.id_product = P1.id_provider
SELECT prov.*,
c.*,
prod.*
FROM PROVIDER prov
LEFT JOIN Product prod ON prod.id_product = prov.id_provider
LEFT JOIN Contact c ON prov.id_provider = c.id_contact
AND prod.RowNo = c.RowNo
Use left joins, but join Provider to Product first and then to Contact.
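To compare joining both child tables on the id alone with additionally matching on RowNo, the sketch below loads the sample rows into an in-memory SQLite database through Python's sqlite3 module (a stand-in for SQL Server) and counts the rows each variant returns. The variable names id_only and row_matched are made up for this illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; sample rows come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Provider (id_provider INTEGER, name TEXT);
CREATE TABLE Contact (id_contact INTEGER, RowNo INTEGER, Email TEXT);
CREATE TABLE Product (id_product INTEGER, RowNo INTEGER, Product TEXT);
""")
conn.executemany("INSERT INTO Provider VALUES (?, ?)",
                 [(100, "John"), (101, "Sam"), (102, "Peter")])
conn.executemany("INSERT INTO Contact VALUES (?, ?, ?)",
                 [(100, 1, "john@work.com"), (100, 2, "john@gmail.com"),
                  (101, 1, "sam@work.com"), (101, 2, "sam@yahoo.com")])
conn.executemany("INSERT INTO Product VALUES (?, ?, ?)",
                 [(100, 1, "John's 1st product"), (100, 2, "John's 2nd product"),
                  (101, 1, "Sam's 1st product"), (101, 2, "Sam's 2nd product"),
                  (101, 3, "Sam's 3rd product")])

# Joining on the id only pairs every contact with every product of a provider.
id_only = conn.execute("""
SELECT prov.id_provider, c.Email, prod.Product
FROM Provider prov
LEFT JOIN Contact c ON c.id_contact = prov.id_provider
LEFT JOIN Product prod ON prod.id_product = prov.id_provider
""").fetchall()

# Matching RowNo as well lines each contact up with the product in the same position.
row_matched = conn.execute("""
SELECT prov.id_provider, prov.name, c.Email, prod.Product
FROM Provider prov
LEFT JOIN Product prod ON prod.id_product = prov.id_provider
LEFT JOIN Contact c ON prov.id_provider = c.id_contact
                   AND prod.RowNo = c.RowNo
ORDER BY prov.id_provider, prod.RowNo
""").fetchall()

print(len(id_only))      # 11
print(len(row_matched))  # 6
```

Note that the RowNo-matched left joins keep every product but would drop a contact whose RowNo has no matching product, which is the case the FULL JOIN variant covers.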

Resources