number of customers per number of invoices

number of customers per number of invoices - sql-server

I have a table with the columns
customerID, InvoiceID, ProductID, Date, Income.
I need to count the number of clients by the number of invoices they have.
I need to write a query that returns something like
Number of invoices ------ number of clients that have that many invoices
1 ------------------------ 4
2 ------------------------ 3
4 ------------------------ 7
Here's what I've tried
SELECT COUNT(DISTINCT customerID) AS 'Number of Clients',
COUNT(InvoiceID) AS 'Number of Invoices'
FROM Sheet
GROUP BY COUNT(InvoiceID)
ORDER BY COUNT(InvoiceID)
but I can't use an aggregate in GROUP BY.

You need a nested select to first calculate the number of invoices per customer. The outer select can then calculate the number of customers for each invoice count.
SELECT [Number of Invoices], COUNT(*) AS [Number of Clients]
FROM (
SELECT customerID, COUNT(DISTINCT InvoiceID) AS [Number of Invoices]
FROM Sheet
GROUP BY customerID
) A
GROUP BY [Number of Invoices]
ORDER BY [Number of Invoices]
A common table expression can also be used in place of the nested select:
;WITH CTE AS (
SELECT customerID, COUNT(DISTINCT InvoiceID) AS [Number of Invoices]
FROM Sheet
GROUP BY customerID
)
SELECT [Number of Invoices], COUNT(*) AS [Number of Clients]
FROM CTE
GROUP BY [Number of Invoices]
ORDER BY [Number of Invoices]
See this db<>fiddle for a demo or this one that contains your originally posted data (but a different result).
(The above has been updated to use COUNT(DISTINCT InvoiceID) so that multiple rows with the same customerID and InvoiceID in the originally posted data are counted as one invoice.)
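To see why DISTINCT matters here, consider a minimal sketch with made-up rows. The table variable @Sheet and its values are illustrative assumptions, not the asker's data; customer 1 has a single invoice that spans two product rows.

```sql
-- Hypothetical sample data: one invoice (100) appears on two product rows.
DECLARE @Sheet TABLE (customerID int, InvoiceID int, ProductID int);
INSERT INTO @Sheet VALUES (1, 100, 1), (1, 100, 2), (2, 200, 1), (2, 201, 1);

SELECT customerID,
       COUNT(InvoiceID)          AS RowsCounted,          -- counts every row
       COUNT(DISTINCT InvoiceID) AS [Number of Invoices]  -- counts each invoice once
FROM @Sheet
GROUP BY customerID;
-- customer 1: RowsCounted = 2, [Number of Invoices] = 1
-- customer 2: RowsCounted = 2, [Number of Invoices] = 2
```

Without DISTINCT, customer 1 would be reported under "2 invoices" even though only one invoice exists.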

Related

SQL - Return a value sum only once when grouped

I want to count unique names grouped by date, but a name that has already appeared under an earlier date shouldn't be counted again.
I've tried using distinct, which does give the unique count per date, but a name is counted again in every month in which it appears.
Actual and minified SQL query:
select
date,
count(distinct d.name) as count
from ...
group by date
Sample and desired output: [image not available]

Grab unique names and tag them with the earliest date. At that point it's just a matter of regrouping the resulting rows by date. Each name will uniquely correspond to only one date as desired:
with data as (select name, min("date") as dt from T group by name)
select dt, count(name) as cnt from data group by dt;
If you still need to see the original dates even when no names are counted, then flag each row according to whether it should be counted and then count the flags per date:
with data as (
select *,
case when "date" = min("date") over (partition by name)
then 1 end as flag
from T
)
select "date", count(flag) as cnt
from data
group by "date";

So you want each name to be counted only once:
SELECT COUNT(u.name) as name_count, u.[date]
FROM (
SELECT d.name,MIN(d.date) AS [date]
FROM yourTable d
GROUP BY d.name) u
GROUP BY u.[date];

You can add a ROW_NUMBER() that is Partitioned by name and ordered by date and add a WHERE clause that only returns the rows with Row_Number = 1.
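A minimal sketch of that ROW_NUMBER() idea, under assumed names: the table variable @T, its columns, and the sample rows are all hypothetical and only serve to make the query runnable.

```sql
-- Hypothetical sample data: name 'A' appears on two dates.
DECLARE @T TABLE (name varchar(10), [date] date);
INSERT INTO @T VALUES
    ('A', '20190101'), ('B', '20190101'),
    ('A', '20190201'), ('C', '20190301');

SELECT r.[date], COUNT(*) AS name_count
FROM (
    SELECT t.name, t.[date],
           -- number each name's rows from its earliest date onward
           ROW_NUMBER() OVER (PARTITION BY t.name ORDER BY t.[date]) AS rn
    FROM @T t
) r
WHERE r.rn = 1            -- keep only the first appearance of each name
GROUP BY r.[date];
-- 2019-01-01: 2 (A and B), 2019-03-01: 1 (C)
```

Dates on which no name appears for the first time (2019-02-01 here) drop out of the result; listing them with a count of 0 would need a join back to the distinct dates.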

You can try the following option:
SELECT A.Date,COUNT(B.[Name]) Count
FROM
(
SELECT DISTINCT Date FROM your_table
)A
LEFT JOIN
(
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Date) RN
FROM your_table
)A WHERE RN = 1
)B ON A.Date = B.Date
GROUP BY A.Date
But I think the best option is a slightly modified version of Shawnt00's idea:
SELECT A.Date,COUNT(B.[Name]) Count
FROM
(
SELECT DISTINCT Date FROM your_table
)A
LEFT JOIN
(
SELECT [Name],MIN(Date) Date FROM your_table GROUP BY [Name]
)B ON A.Date = B.Date
GROUP BY A.Date
In both cases the output will be:
Date Count
20190101 2
20190201 0
20190301 1

Sum up invoice amounts based upon IDs that exist in multiple columns

Suppose I have a table which contains invoice lines by row and 6 columns where an employee ID can be tagged to that invoice. Employee IDs cannot be duplicated in a row, however the same employee ID can exist in different columns for different invoice lines. In the table below, REP 1 should have a total amount of 500.
I want to be able to sum up the total amounts by employee ID(REP 1, REP 2, etc..). I can do this with a large union query, but the issue is that I have a list of about 450 employee IDs that I need to sum up. Is there a way I can have one query spit out a list of employee IDs and their total amounts?

I would suggest using cross apply:
select v.e, sum(t.amount)
from t cross apply
(values (slot1), (slot2), (slot3), (slot4), (slot5), (slot6)) v(e)
group by v.e;
Note: This assumes that you are using SQL Server.
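As a rough illustration of how the unpivot behaves, here is a sketch on made-up data. The table variable @t, the three slot columns (instead of six, for brevity), and the values are assumptions; the NULL filter is an addition for rows where not every slot is filled.

```sql
-- Hypothetical sample data: two invoice lines, REP 1 appears in different slots.
DECLARE @t TABLE (amount int, slot1 varchar(10), slot2 varchar(10), slot3 varchar(10));
INSERT INTO @t VALUES
    (200, 'REP 1', 'REP 2', NULL),
    (300, 'REP 3', 'REP 1', NULL);

SELECT v.e AS employee_id, SUM(t.amount) AS total_amount
FROM @t t
CROSS APPLY (VALUES (t.slot1), (t.slot2), (t.slot3)) v(e)  -- one row per slot
WHERE v.e IS NOT NULL                                      -- skip empty slots
GROUP BY v.e;
-- REP 1 = 500, REP 2 = 200, REP 3 = 300
```

Each invoice line is expanded into one row per slot, so an employee is credited with the line's amount regardless of which column the ID sits in.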

Joining the tables is one way...
select e.employee_id,
sum(i.amount)
from invoice i
join employee e on (e.employee_id in (i.slot1, i.slot2, i.slot3, i.slot4, i.slot5, i.slot6))
group by e.employee_id;
An alternative way...
select employee_id,
sum(amount)
from (
select slot1 employee_id, sum(amount) amount from invoice group by slot1 union all
select slot2, sum(amount) from invoice group by slot2 union all
select slot3, sum(amount) from invoice group by slot3 union all
select slot4, sum(amount) from invoice group by slot4 union all
select slot5, sum(amount) from invoice group by slot5 union all
select slot6, sum(amount) from invoice group by slot6
) as q1
group by employee_id;

SQL select top 10 for each year

I have a fact database from which I want to build a trendline of the top 10 items per year, ranked by the summed quantity of each item.
I've done the following, but it selects, for example, more than 10 entities for my year 2007:
select TOP 10 sum(Quantity) as Quantity,DIM_Time.Year, DIM_Item.Name as Name
from Fact_Purchase
join DIM_Item on DIM_Item.BKey_ItemId = Fact_Purchase.DIM_Item
join DIM_Time on DIM_Time.ID = Fact_Purchase.DIM_Time_DeliveryDate
where Fact_Purchase.DIM_Company = 2 and DIM_Time.ID = FACT_Purchase.DIM_Time_DeliveryDate
Group by dim_item.Name, DIM_Time.Year
Order by Quantity DESC
How do I select top 10 items with the highest quantity through all my years, with only 10 top entities for each year?
As you can guess, the company is fixed to a single value here and is going to be a parameter in my report.

I think this is what you're going for. My apologies if I messed up on translating your tables across.
select *
from (
select DIM_Time.[Year], dim_item.Name, SUM(Quantity) Quantity, RANK() OVER (PARTITION BY DIM_Time.[Year] ORDER BY SUM(Quantity) DESC) salesrank
from Fact_Purchase
join DIM_Item on DIM_Item.BKey_ItemId = Fact_Purchase.DIM_Item
join DIM_Time on DIM_Time.ID = Fact_Purchase.DIM_Time_DeliveryDate
where Fact_Purchase.DIM_Company = 2 and DIM_Time.ID = FACT_Purchase.DIM_Time_DeliveryDate
group by dim_item.Name, DIM_Time.[Year]
) tbl
where salesrank <= 10
order by [Year], salesrank
The subquery groups by name/year, and the RANK() OVER part assigns each item a rank within its year, ordered by SUM(Quantity) descending and restarting at 1 for each Year. From there you just have to filter out anything with a salesrank over 10.
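Note that RANK() gives tied items the same rank, so a year can still return more than 10 rows when quantities tie at the cutoff. If exactly N rows per year are required, ROW_NUMBER() with a tiebreaker is one option. A small sketch with made-up data, using a top-2 cutoff to keep it short (the table variable @Sales and its rows are assumptions):

```sql
-- Hypothetical sample data: B and C tie on quantity in 2007.
DECLARE @Sales TABLE ([Year] int, Name varchar(10), Quantity int);
INSERT INTO @Sales VALUES
    (2007, 'A', 50), (2007, 'B', 40), (2007, 'C', 40), (2007, 'D', 10);

SELECT [Year], Name, Quantity, rnk, rn
FROM (
    SELECT [Year], Name, SUM(Quantity) AS Quantity,
           RANK()       OVER (PARTITION BY [Year] ORDER BY SUM(Quantity) DESC)       AS rnk,
           ROW_NUMBER() OVER (PARTITION BY [Year] ORDER BY SUM(Quantity) DESC, Name) AS rn
    FROM @Sales
    GROUP BY [Year], Name
) t
WHERE rnk <= 2        -- returns A, B and C, because B and C share rank 2
ORDER BY [Year], rn;
-- Filtering on rn <= 2 instead would return exactly two rows: A and B.
```

Which behaviour is right depends on whether tied items should all be shown or the list must be capped.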

SELECT
_year,
Name,
_SUM,
RANK_iD
FROM
(
SELECT
_year,
Name,
_SUM,
-- rank within each year (the inner query exposes no _Month column)
DENSE_RANK() OVER (PARTITION BY _year ORDER BY _SUM DESC) AS RANK_iD
FROM (
SELECT
DIM_Time AS _year,
DIM_Item AS Name,
SUM(Quantity) AS _SUM
FROM
#ABC
-- SQL Server does not allow column aliases in GROUP BY
GROUP BY
DIM_Time,
DIM_Item
) A
) B
WHERE RANK_iD <= 10

Running total query in select statement without views

I have to query a set of running total data by month.
e.g.
Month Amount Total
2014-01-01 100 100
2014-01-02 100 200
2014-01-03 100 300
The application does not allow creating a view or stored procedure. It can only select data from a table directly.
e.g.
select Month,
Amount,
Total -- This is my problem.
from Table -- This is a table only.
Any ideas are welcome, thank you.

You can use OUTER APPLY:
SELECT T.Month,T.Amount,T2.Total
FROM Table1 T
OUTER APPLY
( SELECT Total = SUM(Amount)
FROM Table1 T2
WHERE T2.Month <= T.Month
) T2;
Or a correlated subquery:
SELECT T.Month, T.Amount,
( SELECT SUM(T2.Amount)
FROM Table1 T2
WHERE T2.Month <= T.Month
) AS Total
FROM Table1 T;

The easiest way is to use SQL Server 2012 or later, which supports cumulative sums through window functions:
select Month, Amount,
sum(Amount) over (order by Month) as Total
from Table;
The correlated subquery method follows a similar structure:
select Month, Amount,
(select sum(Amount) from table t2 where t2.Month <= t.Month) as Total
from Table t;
These are usually the two methods that I would consider, because both are standard SQL. As Vignesh points out you can do it with cross apply as well (although as I write this, his query is not correct).

Here is a second way to create a running total:
SELECT t.month, t.amount,
-- no PARTITION BY here: partitioning by month would restart the total for every month value
SUM(t.amount) OVER(ORDER BY t.month
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as [Total]
FROM [yourTable] AS t
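On the choice of frame: when a windowed SUM has an ORDER BY but no explicit frame, SQL Server defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats rows with equal ordering values as peers and gives them the same total. An explicit ROWS frame advances one row at a time. A sketch with made-up data (the table variable @T and its rows are assumptions) showing the difference when Month values repeat:

```sql
-- Hypothetical sample data: two rows share the same Month value.
DECLARE @T TABLE ([Month] date, Amount int);
INSERT INTO @T VALUES ('20140101', 100), ('20140102', 100), ('20140102', 50);

SELECT [Month], Amount,
       SUM(Amount) OVER (ORDER BY [Month]) AS TotalRange,   -- default RANGE frame
       SUM(Amount) OVER (ORDER BY [Month]
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS TotalRows
FROM @T;
-- TotalRange: 100 for 2014-01-01, then 250 for both 2014-01-02 rows.
-- TotalRows:  100, then a distinct value per row ending at 250; the order
--             of the two tied rows is not guaranteed without a tiebreaker.
```

When the ordering column is unique, as in the question's sample, both forms give the same running total.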

HAVING Clause in SQL Server is Ignored

I have a PURCHASES table whose unique ID is PURCHASE_ID. I want to count, per year, the distinct customers who made at least 2 purchases in that year. This is the query that I wrote:
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
COUNT(DISTINCT CUSTOMER_ID) TOTAL_CUSTOMERS
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME)
HAVING COUNT(PURCHASE_ID) > 1
However, this query gives me the total distinct customers per year of purchase no matter how many purchases they had. That is, I am getting customers that had only 1 purchase for the year as well as those that had more than one. It is as if the HAVING clause is being ignored.
It doesn’t change anything if I use a HAVING COUNT(DISTINCT PURCHASE_ID) > 1 either. Even though I shouldn’t technically need that since the PURCHASE_ID is already unique and is a primary key.
This works though.
SELECT
PURCHASE_YEAR,
COUNT(DISTINCT CUSTOMER_ID) TOTAL_CUSTOMERS
FROM
(
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
CUSTOMER_ID
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME),CUSTOMER_ID
HAVING COUNT(PURCHASE_ID) > 1
) VW
GROUP BY PURCHASE_YEAR

Try this, which filters per customer before counting per year:
SELECT PURCHASE_YEAR,
COUNT(1) AS CNT
FROM
(SELECT YEAR(PURCHASE_TIME) PURCHASE_YEAR,
CUSTOMER_ID
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME),
CUSTOMER_ID
HAVING COUNT(1) > 1) AS CNT
GROUP BY PURCHASE_YEAR
ORDER BY PURCHASE_YEAR
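The reason the original query appears to ignore HAVING: with GROUP BY YEAR(PURCHASE_TIME) alone, each group is a whole year, so COUNT(PURCHASE_ID) is the number of purchases in that year, and the filter only drops years that have a single purchase in total. A sketch with made-up rows that contrasts the two groupings (the table variable @PURCHASES and its data are assumptions):

```sql
-- Hypothetical sample data for a single year.
DECLARE @PURCHASES TABLE (PURCHASE_ID int PRIMARY KEY, CUSTOMER_ID int, PURCHASE_TIME datetime);
INSERT INTO @PURCHASES VALUES
    (1, 10, '20200105'), (2, 10, '20200210'),   -- customer 10: two purchases in 2020
    (3, 20, '20200301');                        -- customer 20: one purchase in 2020

-- Year-level HAVING: the 2020 group holds 3 purchases, so it passes
-- and both customers are counted.
SELECT YEAR(PURCHASE_TIME) AS PURCHASE_YEAR,
       COUNT(DISTINCT CUSTOMER_ID) AS TOTAL_CUSTOMERS
FROM @PURCHASES
GROUP BY YEAR(PURCHASE_TIME)
HAVING COUNT(PURCHASE_ID) > 1;          -- 2020, 2

-- Customer-level HAVING inside a derived table: only customer 10 survives.
SELECT PURCHASE_YEAR, COUNT(*) AS TOTAL_CUSTOMERS
FROM (
    SELECT YEAR(PURCHASE_TIME) AS PURCHASE_YEAR, CUSTOMER_ID
    FROM @PURCHASES
    GROUP BY YEAR(PURCHASE_TIME), CUSTOMER_ID
    HAVING COUNT(PURCHASE_ID) > 1
) per_customer
GROUP BY PURCHASE_YEAR;                 -- 2020, 1
```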

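Try the following (note that it returns one row per year and customer, so it still needs an outer query to count customers per year):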
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
COUNT(CUSTOMER_ID) TOTAL_CUSTOMERS
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME), CUSTOMER_ID
HAVING COUNT(PURCHASE_ID) > 1

Try adding DISTINCT to your filter:
SELECT
YEAR(PURCHASE_TIME) PURCHASE_YEAR,
COUNT(DISTINCT CUSTOMER_ID) TOTAL_CUSTOMERS
FROM PURCHASES
GROUP BY YEAR(PURCHASE_TIME)
HAVING COUNT(DISTINCT PURCHASE_ID) > 1