Wrong count result - sql-server

Wrong count result - sql-server

I have 2 tables in sql server:
Employee (Job Title, Hire Date)
Person (FirstName, LastName)
And I need to return the E.JobTitle, E.HireDate,P.FirstName ,P.LastName along with the how many employees have the same job title.
so I used this query:
SELECT E.JobTitle,
E.HireDate,
P.FirstName,
P.LastName,
COUNT(E.JobTitle)
FROM AdventureWorks2019.HumanResources.Employee E
JOIN AdventureWorks2019.Person.Person P ON E.BusinessEntityID = P.BusinessEntityID
GROUP BY E.JobTitle,
E.HireDate,
P.FirstName,
P.LastName;
the problem is that the query returns 1 per the count, while I'd expect to get per each row the count of the num employees with that job title.
my question is how can I get the correct count?

As mentioned by #Larnu in the comments, your issue is that you're grouping by LastName also, so you are going to get a count of 1 per LastName.
You need a windowed count with OVER, not an aggregated one with GROUP BY
No extra joins or grouping needed.
SELECT E.JobTitle,
E.HireDate,
P.FirstName,
P.LastName,
COUNT(*) OVER (PARTITION BY E.JobTitle)
FROM AdventureWorks2019.HumanResources.Employee E
JOIN AdventureWorks2019.Person.Person P ON E.BusinessEntityID = P.BusinessEntityID;

You should count the number of employee for each job title first in a subquery or cte.
SELECT
JobTitle,
COUNT(*) AS Count
FROM HumanResources.Employee
GROUP BY JobTitle
Then you can join the subquery or the cte with your original query.
WITH cte
AS
(
SELECT
JobTitle,
COUNT(*) AS Count
FROM HumanResources.Employee
GROUP BY JobTitle
)
SELECT
E.JobTitle,
E.HireDate,
P.FirstName,
P.LastName,
cte.Count
FROM
HumanResources.Employee E
INNER JOIN Person.Person P ON P.BusinessEntityID = E.BusinessEntityID
INNER JOIN cte on cte.JobTitle = E.JobTitle
ORDER BY
E.JobTitle,
P.LastName,
P.FirstName
;

Related

Select columns from several tables with count

I have 3 tables in SQL Server:
Sales (customerId)
Customer (customerId, personId)
Person (personId, firstName, lastName)
and I need to return the top 10 customers.
I used this query:
SELECT TOP 10
CustomerID, COUNT(CustomerID)
FROM
Sales
GROUP BY
(CustomerID)
ORDER BY
COUNT(CustomerID) DESC
The query currently returns only the customerId and count, but I also need to return the firstName and lastName of these customers from the Person table.
I know I need to reach the firstName and lastName by correlating between Sales.customerId and Customer.customerId, and from Customer.personId to get the Person.personId.
My question is whether I need to use an inner join or union, and how to use either of them to get the firstName and lastName of these customers

Union is mostly used for disjoint sets. To achieve your target, u can go with inner-join.
If you want to use joins, then here is the query which works similarly to your requirement.
SELECT TOP 10 S.CustomerID, P.FirstName,P.LastName, count(*)
FROM Sales S
INNER JOIN Customer C on S.CustomerId=C.CustomerId
INNER JOIN Person P on C.PersonId = P.PersonId
GROUP BY (S.CustomerID, P.FirstName,P.LastName)
ORDER BY count(*) DESC

You need use inner join like this :
SELECT TOP 10 S.CustomerID
, P.FirstName
, P.LastName
, COUNT (1) AS CountOfCustomer -- this is equal count(*)
FROM Sales S
INNER JOIN Customer C ON S.CustomerId = C.CustomerId
INNER JOIN Person P ON C.PersonId = P.PersonId
GROUP BY S.CustomerID, P.FirstName, P.LastName
ORDER BY 4 DESC; -- this is equal order by count(*)

How to filter a subquery?

This is based on AdventureWorks sample database.
And here is my query:
SELECT FirstName, LastName,
(SELECT COUNT(SalesOrderID)
FROM SalesOrderHeader
WHERE SalesOrderHeader.ContactID = Contact.ContactID) AS OrderCount
FROM Contact
ORDER BY OrderCount desc
Question: what should I add to my query so it only shows OrderCount that is more than 20? Nothing else should change about my output. I tried this and it did not work:
WHERE SalesOrderHeader.ContactID = Contact.ContactID AS OrderCount AND COUNT(SalesOrderID) > 20

Rewrite as a join aggregation query and then use HAVING:
SELECT
c.FirstName,
c.LastName,
COUNT(s.SalesOrderID) AD OrderCount
FROM Contacts c
INNER JOIN SalesOrderHeader s
ON c.ContactID = s.ContactID
GROUP BY
c.FirstName,
c.LastName
HAVING
COUNT(s.SalesOrderID) > 20;

T-SQL Grouping Issue

I have 2 tables.
Contacts
ContactID pk
EmailAddress
FirstName
LastName
Address
Orders
OrderID pk
ContactID fk
I want to get the number or orders for each email address in Contacts like below
select
Contacts.EmailAddress,
count(distinct Orders.OrderID) AS NumOrders
from
Contacts inner join Orders on Contacts.ContactID = Orders.ContactID
group by
Contacts.EmailAddress
Problem is, I also want the first name, last name, address. But I can't group by those because each email address in Contacts could have a different first name, lastname or address associated with it.
ie:
myname#email.com, Fred, Jackson, 123 Main St
myname#email.com, Bob, Smith, 456 Spruce St.
How can I change my query so that I can get the first name, last name and address for the most recent entry made in Contacts for that email address?
Thanks in advance!

My first thought would be to use windowed functions.
SELECT EmailAddress,
FirstName,
Lastname,
[Address],
EmailOrderCount
FROM (SELECT c.EmailAddress,
c.FirstName,
c.LastName,
c.[Address],
COUNT(o.OrderID) OVER (PARTITION BY c.EmailAddress) EmailOrderCount,
ROW_NUMBER() OVER (PARTITION BY c.EmailAddress ORDER BY c.ContactID DESC) Rn
FROM Contacts c
JOIN Orders o ON c.ContactID = o.ContactID
) t
WHERE Rn = 1
Demo
another way would be to use CROSS APPLY to append the top 1 contact record to the summary rows.
SELECT c.EmailAddress,
COUNT(o.OrderID) NumOrders,
ca.FirstName,
ca.LastName,
ca.[Address]
FROM Contacts c
INNER JOIN Orders ON c.ContactId = o.ContactID
CROSS APPLY (
SELECT TOP 1
FirstName,
Lastname,
[Address]
FROM Contacts c2
WHERE c2.EmailAddress = c.EmailAddress
ORDER BY c2.ContactID DESC) ca
GROUP BY c.EmailAddress,
ca.FirstName,
ca.LastName,
ca.[Address]

Try this:
select
Contacts.Name,
Contacts.FirstName,
Contacts.LastName
Contacts.EmailAddress,
count(distinct Orders.OrderID) AS NumOrders
from
(
select max(ContactID) as ContactID,
EmailAddress
from Contacts
group by EmailAddress
) MinContactForEachEMailAddress
inner join
Contacts
on MinContactForEachEMailAddress.ContactID = Contacts.ContactID
inner join
Orders
on Contacts.ContactID = Orders.ContactID
group by
Contacts.EmailAddress

Another way to get what you want is using a CTE and taking the "maximum" row by using ROW_NUMBER.
;WITH CTE AS (
SELECT C.ContactId, C.Name, C.FirstName, C.LastName, C.EmailAddress,
ROW_NUMBER() OVER (PARTITION BY EmailAddress ORDER BY ContactId DESC) RowNo
FROM Contact C
)
SELECT CTE.*, COUNT(o.OrderID) OVER (PARTITION BY CTE.EmailAddress) Cnt
FROM CTE
JOIN Orders O on CTE.ContactID = O.ContactID
-- select the "maximum" row
WHERE CTE.RowNo = 1

An easy way to do this is to make your original query a subquery and select from it. I'm making a slight change, because it's a better practice to group by your primary key than your email address. (Is it a safe bet that each contact has just one email address, and that the basic intent is to group by person?) If so, try this:
SELECT DISTINCT c.EmailAddress, c.FirstName, c.LastName, c.Address, sub.NumOrders
FROM
(
select
Contacts.ContactID
count(distinct Orders.OrderID) AS NumOrders
from
Contacts inner join Orders on Contacts.ContactID = Orders.ContactID
group by
Contacts.ContactID
) sub
JOIN Contacts c
ON sub.ContactID = c.ContactID
If you really need to group by email address instead, then change the above subquery to your original query and change c.EmailAddress to sub.EmailAddress. Of course you may order the SELECT fields however best suits you.
Edit follows:
The ContactID must be a sequence number and you can continually put the same person in the table. So if you add the DISTINCT keyword in the outer query I believe that will give you what you need.

select count over partition by

I am learning window functions in sql server. I am using AdventrueWorks2012 database for practice. I want to calculate total number of sales and purchases for each item in the store.
The classic solution can be like
SELECT ProductID,
Quantity,
(SELECT Count(*)
FROM AdventureWorks.Purchasing.PurchaseOrderDetail
WHERE PurchaseOrderDetail.ProductID = p.ProductID) TotalPurchases,
(SELECT Count(*)
FROM AdventureWorks.Sales.SalesOrderDetail
WHERE SalesOrderDetail.ProductID = p.ProductID) TotalSales
FROM (SELECT DISTINCT ProductID,
Quantity
FROM AdventureWorks.Production.ProductInventory) p
Trying to convert to window functions gives me wrong results:
SELECT DISTINCT d.ProductID,
Quantity,
Count(d.ProductID)
OVER(
PARTITION BY d.ProductID) TotalPurchases,
Count(d2.ProductID)
OVER(
PARTITION BY d2.ProductID) TotalSales
FROM (SELECT DISTINCT ProductID,
Quantity
FROM AdventureWorks.Production.ProductInventory) p
INNER JOIN AdventureWorks.Purchasing.PurchaseOrderDetail d
ON p.ProductID = d.ProductID
INNER JOIN AdventureWorks.Sales.SalesOrderDetail d2
ON p.ProductID = d2.ProductID
ORDER BY d.ProductID
Why this is wrong? How can I correct it?

You should change INNER JOIN to LEFT JOIN
Because when you inner join, result will miss productid which from ProductInventory table does not have PurchaseOrderDetail or SalesOrderDetail.

only 1 value returns with sql query

AdventureWorks2012 DB - I am trying to return top 1 or 2 emlpoyees from Finance dept and Engineer dept who have worked longest. I cant get my query to return both, only results from engineering show. Any suggestions?
SELECT TOP 2 EDH.StartDate, E.BusinessEntityID, D.Name, EDH.EndDate, DATEDIFF(hour,EDH.StartDate, GETDATE()) AS HoursWorked
FROM HumanResources.Employee E
INNER JOIN Person.Person PP ON E.BusinessEntityID = PP.BusinessEntityID
INNER JOIN HumanResources.EmployeeDepartmentHistory EDH ON E.BusinessEntityID = EDH.BusinessEntityID
INNER JOIN HumanResources.Department D ON D.DepartmentID = EDH.DepartmentID
WHERE (D.Name LIKE 'Finance' OR D.Name = 'Engineering')
AND EDH.EndDate IS NULL
GROUP BY D.Name, EDH.StartDate,E.BusinessEntityID,EDH.EndDate
ORDER BY EDH.StartDate ASC

Your problem is that the employees from Engineering happen to have started before the employees from Finance. The ORDER BY is affecting all of your records (both departments), and then the TOP 2 value is grabbing the two most recent employees, regardless of departments.
If you are trying to write a query that returns the first employee from each department, you're going to have to get a bit more complex. Here is an example that uses the ROW_NUMBER() function to order employees within each department by their start date, then filters those records to only return employees who are the first individuals in their department.
SELECT
StartDate,
BusinessEntityID,
Name,
EndDate,
HoursWorked
FROM
(
SELECT
EDH.StartDate,
E.BusinessEntityID,
D.Name,
EDH.EndDate,
DATEDIFF(hour,EDH.StartDate, GETDATE()) AS HoursWorked,
ROW_NUMBER() OVER (PARTITION BY D.Name ORDER BY EDH.StartDate) AS RowNumberWithinDepartment
FROM
HumanResources.Employee E
INNER JOIN
Person.Person PP ON E.BusinessEntityID = PP.BusinessEntityID
INNER JOIN
HumanResources.EmployeeDepartmentHistory EDH ON E.BusinessEntityID = EDH.BusinessEntityID
INNER JOIN
HumanResources.Department D ON D.DepartmentID = EDH.DepartmentID
WHERE
(D.Name LIKE 'Finance' OR D.Name = 'Engineering') AND
EDH.EndDate IS NULL
GROUP BY D.Name, EDH.StartDate,E.BusinessEntityID,EDH.EndDate
) x
WHERE RowNumberWithinDepartment = 1
ORDER BY StartDate ASC