I have a PURCHASES table whose unique ID is PURCHASE_ID. For each year, I want to count the distinct customers who made at least 2 purchases in that year. This is the query that I wrote:
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
COUNT(DISTINCT CUSTOMER_ID) TOTAL_CUSTOMERS
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME)
HAVING COUNT(PURCHASE_ID) > 1
However, this query gives me the total distinct customers per purchase year no matter how many purchases they had. That is, I am getting customers that had only 1 purchase for the year as well as those that had more than one. It is as if the HAVING clause is being ignored.
It doesn't change anything if I use HAVING COUNT(DISTINCT PURCHASE_ID) > 1 either, and I shouldn't technically need DISTINCT anyway, since PURCHASE_ID is unique and is the primary key.
This works, though:
SELECT
PURCHASE_YEAR,
COUNT(DISTINCT CUSTOMER_ID) TOTAL_CUSTOMERS
FROM
(
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
CUSTOMER_ID
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME),CUSTOMER_ID
HAVING COUNT(PURCHASE_ID) > 1
) VW
GROUP BY PURCHASE_YEAR
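To see why the first query seems to ignore HAVING while the nested one works, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and SQLite's strftime('%Y', ...) stands in for YEAR(), so treat it as an illustration under those assumptions rather than the exact queries above. The point it shows: HAVING filters whole groups, so when the only grouping column is the year, it can only keep or drop an entire year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PURCHASES ("
    "PURCHASE_ID INTEGER PRIMARY KEY, CUSTOMER_ID INTEGER, PURCHASE_TIME TEXT)"
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO PURCHASES VALUES (?, ?, ?)",
    [
        (1, 100, "2020-01-05"),  # customer 100: two purchases in 2020
        (2, 100, "2020-06-10"),
        (3, 200, "2020-03-01"),  # customer 200: one purchase in 2020
        (4, 200, "2021-02-02"),  # customer 200: one purchase in 2021
    ],
)

# Grouped by year only: HAVING tests the purchase count of the whole year.
per_year = conn.execute("""
    SELECT strftime('%Y', PURCHASE_TIME) AS PURCHASE_YEAR,
           COUNT(DISTINCT CUSTOMER_ID) AS TOTAL_CUSTOMERS
    FROM PURCHASES
    GROUP BY strftime('%Y', PURCHASE_TIME)
    HAVING COUNT(PURCHASE_ID) > 1
    ORDER BY PURCHASE_YEAR
""").fetchall()

# Grouped by year and customer first: HAVING tests each customer's count.
per_customer = conn.execute("""
    SELECT PURCHASE_YEAR, COUNT(*) AS TOTAL_CUSTOMERS
    FROM (
        SELECT strftime('%Y', PURCHASE_TIME) AS PURCHASE_YEAR, CUSTOMER_ID
        FROM PURCHASES
        GROUP BY strftime('%Y', PURCHASE_TIME), CUSTOMER_ID
        HAVING COUNT(PURCHASE_ID) > 1
    ) VW
    GROUP BY PURCHASE_YEAR
    ORDER BY PURCHASE_YEAR
""").fetchall()

print(per_year)      # 2020 still counts both customers; 2021 is dropped as a whole
print(per_customer)  # only customer 100 is counted for 2020
```

In the year-only version, 2020 has three purchases in total, so the group passes the filter and both customers are counted. Only the per-customer grouping removes customer 200.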
Try this:
SELECT PURCHASE_YEAR,
COUNT(1) AS CNT
FROM
(SELECT YEAR(PURCHASE_TIME) PURCHASE_YEAR,
CUSTOMER_ID
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME),
CUSTOMER_ID
HAVING COUNT(1) > 1) AS CNT
GROUP BY PURCHASE_YEAR
ORDER BY PURCHASE_YEAR
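Try the following, which returns one row per customer per year for customers with more than one purchase (the count in each row is that customer's number of purchases, so it still needs an outer query to get a per-year customer count):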
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
COUNT(CUSTOMER_ID) TOTAL_CUSTOMERS
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME), CUSTOMER_ID
HAVING COUNT(PURCHASE_ID) > 1
Try adding DISTINCT to your filter:
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
COUNT(DISTINCT CUSTOMER_ID) TOTAL_CUSTOMERS
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME)
HAVING COUNT(DISTINCT PURCHASE_ID) > 1
Related
I have a table containing the columns customerID, InvoiceID, ProductID, Date, and Income.
I need to count the number of clients by the number of invoices they have.
I need to write a query that returns something like:
Number of invoices | Number of clients with that many invoices
1 | 4
2 | 3
4 | 7
Here's what I've tried:
SELECT COUNT(DISTINCT customerID) AS 'Number of Clients',
COUNT(InvoiceID) AS 'Number of Invoices'
FROM Sheet
GROUP BY COUNT(InvoiceID)
ORDER BY COUNT(InvoiceID)
but I can't use an aggregate in the GROUP BY clause.
You need a nested select to first calculate the number of invoices per customer. The outer select can then calculate the number of customers for each invoice count.
SELECT [Number of Invoices], COUNT(*) AS [Number of Clients]
FROM (
SELECT customerID, COUNT(DISTINCT InvoiceID) AS [Number of Invoices]
FROM Sheet
GROUP BY customerID
) A
GROUP BY [Number of Invoices]
ORDER BY [Number of Invoices]
A common table expression can also be used in place of the nested select:
;WITH CTE AS (
SELECT customerID, COUNT(DISTINCT InvoiceID) AS [Number of Invoices]
FROM Sheet
GROUP BY customerID
)
SELECT [Number of Invoices], COUNT(*) AS [Number of Clients]
FROM CTE
GROUP BY [Number of Invoices]
ORDER BY [Number of Invoices]
See this db<>fiddle for a demo or this one that contains your originally posted data (but a different result).
(The above has been updated to use COUNT(DISTINCT InvoiceID) so that multiple rows with the same customerID and InvoiceID in the originally posted data are counted as one invoice.)
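As a rough illustration of the two-level aggregation and of why DISTINCT matters here, the following sketch runs the same shape of query through Python's built-in sqlite3 module. The rows and the helper name clients_by_invoice_count are hypothetical, and plain aliases replace the bracketed column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sheet (customerID INTEGER, InvoiceID INTEGER, ProductID INTEGER)"
)
# Hypothetical sample data: one row per product line on an invoice.
conn.executemany(
    "INSERT INTO Sheet VALUES (?, ?, ?)",
    [
        (1, 10, 501),  # customer 1: one invoice with two product rows
        (1, 10, 502),
        (2, 20, 501),  # customer 2: two invoices
        (2, 21, 503),
        (3, 30, 502),  # customer 3: one invoice
    ],
)


def clients_by_invoice_count(count_expr):
    """Inner query: invoices per customer. Outer query: customers per count."""
    sql = f"""
        SELECT invoice_count, COUNT(*) AS client_count
        FROM (
            SELECT customerID, {count_expr} AS invoice_count
            FROM Sheet
            GROUP BY customerID
        ) A
        GROUP BY invoice_count
        ORDER BY invoice_count
    """
    return conn.execute(sql).fetchall()


with_distinct = clients_by_invoice_count("COUNT(DISTINCT InvoiceID)")
without_distinct = clients_by_invoice_count("COUNT(InvoiceID)")

print(with_distinct)     # customer 1 is counted as having one invoice
print(without_distinct)  # customer 1 is miscounted as having two invoices
```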
I have two columns of interest in a table. They are 'PersonID' which is a unique ID for each individual and a 'Gender' column.
My table has multiple rows for each individual as each row reflects a policy they have under their name.
However, I have noticed some data quality issues: for some individuals, gender has been recorded as 'Male' on some rows and 'Female' on others.
How do I compile a list of the PersonIDs that have been recorded as both Male and Female?
So an example of how my table looks at the moment is like this:
4439; 1
4439; 1
4439; 1
4439; 1
4439; 1
4439; 0
4439; 0
where 4439 is the PersonID and the second value is the gender (1 = Male, 0 = Female).
I just want a list that gives me all the PersonIDs that have both 1 and 0 in the gender field.
Update:
SELECT DISTINCT personid FROM(
SELECT PersonID
, RANK() OVER (partition by personid order by gender) as rnk
FROM table t
) i
WHERE i.rnk > 1
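instead of the query below, as Dan pointed out (it counts rows per person, so it also returns anyone who simply has more than one policy):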
SELECT PersonID
FROM table t
WHERE t.gender = '0' OR t.gender = '1'
GROUP BY PersonID
HAVING COUNT(*) > 1
If you need additional info:
SELECT p.Name, p.lastName, p.DOB
FROM table p
WHERE p.PersonID IN
(
SELECT PersonID
FROM table t
WHERE t.gender = '0' OR t.gender = '1'
GROUP BY PersonID
HAVING COUNT(DISTINCT t.gender) > 1
)
This should solve it:
SELECT ID FROM POLICYTABLE
WHERE GENDER=0 OR GENDER=1
GROUP BY ID
HAVING COUNT(DISTINCT GENDER)>1
Try this.
SELECT PersonID
FROM
(
SELECT PersonID,Gender
FROM table
GROUP BY PersonID,Gender
) as DATA
GROUP BY PersonID
HAVING COUNT(*) > 1
You simply need to count the number of distinct gender codes on each person:
SELECT PersonID
FROM MyTable
GROUP BY PersonID
HAVING COUNT(DISTINCT Gender) > 1
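The difference between counting rows and counting distinct gender codes is easy to check on a small made-up table. This is only a sketch using Python's built-in sqlite3 module; the table name Policies and the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Policies (PersonID INTEGER, Gender INTEGER)")
# Hypothetical sample data: one row per policy.
conn.executemany(
    "INSERT INTO Policies VALUES (?, ?)",
    [
        (4439, 1), (4439, 1), (4439, 0),  # recorded as both Male and Female
        (5000, 1), (5000, 1),             # two policies, consistently Male
        (6000, 0),                        # one policy, Female
    ],
)

# Counting rows flags everyone with more than one policy.
by_row_count = [row[0] for row in conn.execute("""
    SELECT PersonID
    FROM Policies
    GROUP BY PersonID
    HAVING COUNT(*) > 1
    ORDER BY PersonID
""")]

# Counting distinct gender codes flags only the conflicting records.
by_distinct_gender = [row[0] for row in conn.execute("""
    SELECT PersonID
    FROM Policies
    GROUP BY PersonID
    HAVING COUNT(DISTINCT Gender) > 1
    ORDER BY PersonID
""")]

print(by_row_count)        # includes 5000, who has no conflict
print(by_distinct_gender)  # only 4439
```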
I have a fact database from which I want to build a trendline of the top 10 items per year, ranked by the total quantity of each item in that year.
I've written the following, but it returns 10 rows overall rather than 10 per year, so one year (2007, for example) can take up more than its share of the result:
select TOP 10 sum(Quantity) as Quantity,DIM_Time.Year, DIM_Item.Name as Name
from Fact_Purchase
join DIM_Item on DIM_Item.BKey_ItemId = Fact_Purchase.DIM_Item
join DIM_Time on DIM_Time.ID = Fact_Purchase.DIM_Time_DeliveryDate
where Fact_Purchase.DIM_Company = 2 and DIM_Time.ID = FACT_Purchase.DIM_Time_DeliveryDate
Group by dim_item.Name, DIM_Time.Year
Order by Quantity DESC
How do I select the items with the highest quantity across all my years, with only the top 10 for each year?
As you can guess, the company filter selects a single company, and it is going to be a parameter in my report.
I think this is what you're going for. My apologies if I messed up on translating your tables across.
select *
from (
    select DIM_Time.[Year],
           DIM_Item.Name,
           SUM(Quantity) as Quantity,
           RANK() OVER (PARTITION BY DIM_Time.[Year] ORDER BY SUM(Quantity) DESC) as salesrank
    from Fact_Purchase
    join DIM_Item on DIM_Item.BKey_ItemId = Fact_Purchase.DIM_Item
    join DIM_Time on DIM_Time.ID = Fact_Purchase.DIM_Time_DeliveryDate
    where Fact_Purchase.DIM_Company = 2
    group by DIM_Item.Name, DIM_Time.[Year]
) tbl
where salesrank <= 10
order by [Year], salesrank
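The subquery groups by name and year, and the RANK() OVER part assigns each item a rank within its year, ordered by SUM(Quantity) descending and restarting at 1 for each Year. From there you just have to filter out anything with a salesrank over 10. Note that RANK() gives tied items the same rank, so a year can return more than 10 rows when there is a tie at the cutoff; use ROW_NUMBER() if you need exactly 10.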
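Here is a small sketch of the top-N-per-group pattern, run through Python's built-in sqlite3 module (this assumes the bundled SQLite is 3.25 or newer, which is when window functions were added). The table, the rows, and the helper top_two_per_year are made up, the cutoff is 2 instead of 10 to keep the data small, and the totals are computed in a CTE before ranking rather than in the same SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (year INTEGER, name TEXT, quantity INTEGER)")
# Hypothetical sample data; item A has two rows in 2007 so SUM() matters.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (2007, "A", 6), (2007, "A", 4), (2007, "B", 7), (2007, "C", 7), (2007, "D", 1),
        (2008, "A", 5), (2008, "B", 3), (2008, "C", 9),
    ],
)


def top_two_per_year(ranking_expr):
    """Total quantity per item and year, then keep rows ranked 1 or 2 in each year."""
    sql = f"""
        WITH totals AS (
            SELECT year, name, SUM(quantity) AS qty
            FROM purchases
            GROUP BY year, name
        )
        SELECT year, name, qty
        FROM (
            SELECT year, name, qty, {ranking_expr} AS rnk
            FROM totals
        ) ranked
        WHERE rnk <= 2
        ORDER BY year, rnk, name
    """
    return conn.execute(sql).fetchall()


# RANK() gives tied items the same rank, so B and C both make the cut in 2007.
with_rank = top_two_per_year(
    "RANK() OVER (PARTITION BY year ORDER BY qty DESC)"
)
# ROW_NUMBER() with a tie-breaker returns exactly two rows per year.
with_row_number = top_two_per_year(
    "ROW_NUMBER() OVER (PARTITION BY year ORDER BY qty DESC, name)"
)

print(with_rank)
print(with_row_number)
```

Which function fits depends on whether tied items should all appear or whether the report needs a fixed number of rows per year.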
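Another option is to aggregate first and then rank with DENSE_RANK():
SELECT _year, Name, _SUM, RANK_iD
FROM (
    SELECT _year, Name, _SUM,
           -- rank items within each year by total quantity
           DENSE_RANK() OVER (PARTITION BY _year ORDER BY _SUM DESC) AS RANK_iD
    FROM (
        SELECT DIM_Time AS _year, DIM_Item AS Name, SUM(Quantity) AS _SUM
        FROM #ABC
        -- SQL Server does not allow column aliases in GROUP BY
        GROUP BY DIM_Time, DIM_Item
    ) A
) B
WHERE RANK_iD <= 10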
Let's say the employee table has employee details and the deptId of each employee.
To get the number of employees in each deptId:
select deptId, COUNT(*) from employee group by deptId;
The question is how to get the deptId with the maximum number of employees from the above result set:
select Top 1 deptId, COUNT(*) from employee group by deptId order by 2 desc
(where 2 refers to the second column in the select list) will do, but
is there any way to avoid ordering this set, or a better way of writing this SQL?
Thanks
If you just want the MAX number of employees within a department, you can do this:
SELECT TOP 1 DepartmentID,
COUNT(EmployeeID)
FROM EmployeeTable
GROUP BY DepartmentID
ORDER BY COUNT(EmployeeID) DESC
Avoiding the ordering is harder, but you can compare each department's count with the maximum count:
Select deptId, cnt
From (Select deptId, count(*) cnt
from employee
Group By deptId) Z
Where cnt = (Select Max(cnt)
From (Select deptId, count(*) cnt
From employee
Group By deptId) ZZ)
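The same idea can be written with a common table expression so the per-department count is only spelled out once. This is a sketch using Python's built-in sqlite3 module with made-up rows; the CTE is a substitution for the repeated derived table above, not a different technique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empId INTEGER PRIMARY KEY, deptId INTEGER)")
# Hypothetical sample data: departments 10 and 20 tie with three employees each.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 20), (7, 30)],
)

rows = conn.execute("""
    WITH counts AS (
        SELECT deptId, COUNT(*) AS cnt
        FROM employee
        GROUP BY deptId
    )
    SELECT deptId, cnt
    FROM counts
    WHERE cnt = (SELECT MAX(cnt) FROM counts)
""").fetchall()

# The query has no ORDER BY, so sort in Python only to get stable output.
largest_departments = sorted(rows)
print(largest_departments)
```

Unlike TOP 1 with ORDER BY, this form returns every department that ties for the maximum.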
I have three tables, Customers, Sales and Products.
Sales links a CustomerID with a ProductID and has a SalePrice.
select Products.Category, AVG(SalePrice) from Sales
inner join Products on Products.ProductID = Sales.ProductID
group by Products.Category
This lets me see the average price for all sales by category. However, I only want to include customers that have more than 3 sales records in the DB.
I am not sure of the best way, or any way, to go about this. Ideas?
You haven't mentioned where the customer data lives, so I'll assume CustomerID is in the Sales table.
You need to restrict the Sales table first to the customers with more than 3 sales, then join to get the product category and average per category:
select
Products.Category, AVG(SalePrice)
from
(SELECT ProductID, SalePrice FROM Sales
WHERE CustomerID IN (SELECT CustomerID FROM Sales GROUP BY CustomerID HAVING COUNT(*) > 3)) S
inner join
Products on Products.ProductID = S.ProductID
group by
Products.Category
I'd try the following:
select Products.Category, AVG(SalePrice) from Sales s
inner join Products on Products.ProductID = s.ProductID
where
(Select Count(*) From Sales Where CustomerID = s.CustomerID) > 3
group by Products.Category
I'd create a pseudo-table of "big customer IDs" with a select, and then join it to your query to limit the results:
SELECT Products.Category, AVG(SalePrice) FROM Sales
INNER JOIN Products ON Products.ProductID = Sales.ProductID
INNER JOIN (
SELECT CustomerID FROM Sales GROUP BY CustomerID HAVING COUNT(*) > 3
) BigCustomer ON Sales.CustomerID = BigCustomer.CustomerID
GROUP BY Products.Category
Too lazy to test this out though, so let me know if it works ;o)
Another way, using a windowed count:
;WITH FilteredSales AS
(
SELECT Products.Category, Sales.SalePrice, COUNT(Sales.CustomerId) OVER(PARTITION BY Sales.CustomerId) AS SaleCount
FROM Sales
INNER JOIN Products ON Products.ProductID = Sales.ProductID
)
select Category, AVG(SalePrice)
from FilteredSales
WHERE SaleCount > 3
group by Category
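All of these answers come down to the same step: work out which customers have more than 3 sales, then average only their rows. A minimal sketch with Python's built-in sqlite3 module and made-up data, using the IN (subquery) form, might look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Category TEXT)")
conn.execute("CREATE TABLE Sales (CustomerID INTEGER, ProductID INTEGER, SalePrice REAL)")
# Hypothetical sample data.
conn.executemany("INSERT INTO Products VALUES (?, ?)", [(1, "toys"), (2, "books")])
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        (1, 1, 10.0), (1, 1, 20.0), (1, 2, 30.0), (1, 2, 50.0),  # customer 1: 4 sales
        (2, 1, 100.0),                                           # customer 2: 1 sale
    ],
)

avg_by_category = conn.execute("""
    SELECT p.Category, AVG(s.SalePrice) AS AvgPrice
    FROM Sales s
    INNER JOIN Products p ON p.ProductID = s.ProductID
    -- keep only rows belonging to customers with more than 3 sales
    WHERE s.CustomerID IN (
        SELECT CustomerID
        FROM Sales
        GROUP BY CustomerID
        HAVING COUNT(*) > 3
    )
    GROUP BY p.Category
    ORDER BY p.Category
""").fetchall()

print(avg_by_category)  # customer 2's 100.0 sale is excluded from the toys average
```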