Get top X percentage based on cumulative sum - sql-server

My table looks like this:
ID | ItemID | ItemQualityID | Amount | UnitPrice
My goal is to find the top x% rows for each ItemID + ItemQualityID pair based on Amount cumulative sum and ordered by UnitPrice.
For example:
ID | ItemID | ItemQualityID | Amount | UnitPrice
1 1 1 18 2
2 1 1 1 1
3 1 1 1 1
4 2 1 18 2
5 2 1 1 1
6 2 1 1 1
7 1 1 1 3
If I want the top 10%, the resulting table should contain rows #2, 3, 5 and 6: the total amounts for ItemID 1 and 2 are 21 and 20 respectively, so 10% would be 2 items each. If I want the top 20%, the resulting table should still be the same, because including rows #1 and #4 would make it 100%. Row #7 has a higher unit price than row #1, so if row #1 is not included then row #7 shouldn't be included either.
Ideally I want the table with all the filtered rows for some other calculations but I will be happy even if I can only get the sum of Amount * UnitPrice of the filtered table. Something like
ItemID | ItemQualityID | Sum
1 1 2
2 1 2
for the above example.

You can use SUM() OVER():
DECLARE @percent DECIMAL(5, 2) = .1
;WITH CteSum AS(
    SELECT *,
        TotalSum = SUM(Amount) OVER(PARTITION BY ItemID, ItemQualityID),
        CumSum = SUM(Amount) OVER(PARTITION BY ItemID, ItemQualityID ORDER BY UnitPrice, ID)
    FROM tbl
)
SELECT
    ItemID,
    ItemQualityID,
    [Sum] = SUM(Amount * UnitPrice)
FROM CteSum
WHERE CumSum <= @percent * TotalSum
GROUP BY ItemID, ItemQualityID
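If you need the filtered rows themselves rather than only the sum, the same CTE can be selected from directly. Below is a minimal sketch that runs the idea against the question's sample data, using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption, since the window-function syntax is close but not identical). The function name top_percent_ids is made up for illustration, and the percentage is passed as a whole number so the comparison stays in integer arithmetic.

```python
import sqlite3

def top_percent_ids(rows, percent):
    """Return the IDs of the rows that fall inside the cheapest `percent` %
    of the total Amount of each ItemID + ItemQualityID pair."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tbl (ID INTEGER, ItemID INTEGER, ItemQualityID INTEGER, "
        "Amount INTEGER, UnitPrice INTEGER)"
    )
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        WITH CteSum AS (
            SELECT *,
                   SUM(Amount) OVER (PARTITION BY ItemID, ItemQualityID) AS TotalSum,
                   SUM(Amount) OVER (PARTITION BY ItemID, ItemQualityID
                                     ORDER BY UnitPrice, ID) AS CumSum
            FROM tbl
        )
        SELECT ID
        FROM CteSum
        WHERE CumSum * 100 <= ? * TotalSum   -- integer form of CumSum <= percent * TotalSum
        ORDER BY ID
    """
    ids = [row[0] for row in con.execute(query, (percent,))]
    con.close()
    return ids

# Sample rows from the question: (ID, ItemID, ItemQualityID, Amount, UnitPrice)
sample_rows = [
    (1, 1, 1, 18, 2),
    (2, 1, 1, 1, 1),
    (3, 1, 1, 1, 1),
    (4, 2, 1, 18, 2),
    (5, 2, 1, 1, 1),
    (6, 2, 1, 1, 1),
    (7, 1, 1, 1, 3),
]

print(top_percent_ids(sample_rows, 10))
```

For the sample data both 10 and 20 percent select rows 2, 3, 5 and 6, which matches what the question expects.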

Related

Count number of rows for each id to create a new column

ID | value
1 4
1 5
3 4
2 10
I want to add another column called count, that has for each id the number of observations.
Transformed table
id | value | count
1 4 2
1 5 2
3 4 1
2 10 1
You can use the OVER() clause to aggregate.
SELECT
ID,
value,
[count] = COUNT(*) OVER (PARTITION BY ID)
FROM dbo.TableName;
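As a quick check, here is a small sketch that runs the same windowed COUNT against the question's four rows. It uses Python's built-in sqlite3 module instead of SQL Server (an assumption made only so the example is runnable anywhere), so the column alias is written with AS rather than the [count] = ... form.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE TableName (ID INTEGER, value INTEGER)")
con.executemany(
    "INSERT INTO TableName VALUES (?, ?)",
    [(1, 4), (1, 5), (3, 4), (2, 10)],  # sample rows from the question
)

# COUNT(*) OVER (PARTITION BY ID) repeats the per-ID row count on every row,
# so no GROUP BY is needed and the detail rows are kept.
counted = con.execute(
    """
    SELECT ID, value, COUNT(*) OVER (PARTITION BY ID) AS "count"
    FROM TableName
    ORDER BY ID, value
    """
).fetchall()
con.close()

for row in counted:
    print(row)
```

Each output row keeps its own value while carrying the number of rows that share its ID.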

Calculate Total per pax excluding the row that has 0 in the column

I have 2 tables, Tbl1 and Tbl2 :
Tbl1:
ID Col1 Col2 Sold Total
1 AA 0 100
1 BB CC 2 200
1 DD EE 3 300
2 FF GG 1 100
Tbl2:
ID Sold Total TotalPerPax
I need to calculate the TotalPerPax in Tbl2 for each ID, and the calculation of TotalPerPax works like this. Example:
ID = 1
Sold: 0 + 2 + 3 = 5
Total = 100 + 200 + 300 = 600
TotalPerPax = (Total minus the Total of the rows that have 0 sold) / Sold
(600 - 100) / 5 = 500 / 5 = 100
The output should look like this
Tbl2:
ID Sold Total TotalPerPax
1 5 600 100 -- (500 Total / 5 Sold)
2 1 100 100
So far I have the query below. When executed it throws the error "Divide by zero error encountered", so I can't compute TotalPerPax correctly. Can anyone help me with this? Thanks
SELECT ID,
Col1,
Col2,
Sold,
Total,
SUM(COALESCE(Total, 0))/SUM(COALESCE(Sold, 0)) As TotalPerPax
FROM Tbl1 t1
Where ID = 1
GROUP BY ID, Col1, Col2,Sold, Total
Sample sql fiddle: http://sqlfiddle.com/#!18/09971/2
I would phrase this as:
SELECT
ID,
SUM(Sold) AS Sold,
SUM(Total) AS Total,
CASE WHEN SUM(Sold) > 0
THEN SUM(CASE WHEN Sold > 0 THEN Total ELSE 0 END) /
SUM(CASE WHEN Sold > 0 THEN Sold ELSE 0 END)
ELSE 0 END AS TotalPerPax
FROM Tbl1
GROUP BY ID;
The CASE expression for TotalPerPax uses logic which excludes both the total and the sold amount of any row whose sold amount is zero. Note that for any ID that has only zero sold amounts, TotalPerPax is reported as zero.
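To see the conditional aggregation at work, here is a minimal sketch that runs the query above on the question's rows, using Python's built-in sqlite3 module in place of SQL Server (an assumption for the sake of a runnable example). The row with ID 3 is hypothetical and was added only to exercise the branch where nothing was sold.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Tbl1 (ID INTEGER, Col1 TEXT, Col2 TEXT, Sold INTEGER, Total INTEGER)"
)
con.executemany(
    "INSERT INTO Tbl1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, "AA", None, 0, 100),
        (1, "BB", "CC", 2, 200),
        (1, "DD", "EE", 3, 300),
        (2, "FF", "GG", 1, 100),
        (3, "HH", "II", 0, 50),  # hypothetical: an ID where every Sold value is 0
    ],
)

per_pax = con.execute(
    """
    SELECT
        ID,
        SUM(Sold) AS Sold,
        SUM(Total) AS Total,
        CASE WHEN SUM(Sold) > 0
             THEN SUM(CASE WHEN Sold > 0 THEN Total ELSE 0 END) /
                  SUM(CASE WHEN Sold > 0 THEN Sold ELSE 0 END)
             ELSE 0 END AS TotalPerPax
    FROM Tbl1
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()
con.close()

for row in per_pax:
    print(row)
```

ID 1 comes out as 5 sold, 600 total and 100 per pax, because the 100 belonging to the zero-sold row is left out of the numerator, and the outer CASE keeps the division from ever seeing a zero denominator.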

How to fix Aggregation in Group By, missing aggregation values

I have a table of sales info, and am interested in Grouping by customer, and returning the sum, count, max of a few columns. Any ideas please.
I checked that all the SELECT columns are included in the GROUP BY clause, but detail rows are returned instead of the groupings and aggregate values.
I tried some explicit naming but that didn't help.
SELECT
customerID AS CUST,
COUNT([InvoiceID]) AS Count_Invoice,
SUM([Income]) AS Total_Income,
SUM([inc2015]) AS Tot_2015_Income,
SUM([inc2016]) AS Tot_2016_Income,
MAX([prodA]) AS prod_A
FROM [table_a]
GROUP BY
customerID, InvoiceID,Income,inc2015, inc2016, prodA
There are multiple rows per CUST, but there should be one row each for CUST 1, 2, etc. It should say this:
---------------------------------------------
CUST Count_Invoice Total_Income Tot_2015_Income Tot_2016_Income prod_A
1 2 600 300 300 2
BUT IT IS RETURNING THIS
======================================
CUST Count_Invoice Total_Income Tot_2015_Income Tot_2016_Income prod_A
1 1 300 300 0 1
1 1 300 0 300 1
2 1 300 0 300 1
2 1 500 0 500 0
3 2 800 0 800 0
3 1 300 0 300 1
You don't need to group by the other columns, since they are already being aggregated with COUNT, MAX or SUM.
So you may try this:
SELECT customerID as CUST
,count([InvoiceID]) as Count_Invoice
,sum([Income]) as Total_Income
,sum([inc2015]) as Tot_2015_Income
,sum([inc2016]) as Tot_2016_Income
,max([prodA]) as prod_A --- here you are taking Max but in output it seems like sum
FROM [table_a]
Group By customerID
Note: For column prod_A you are using MAX, which gives 1, but your expected result shows 2, which is actually the SUM or COUNT. Please check.
For more information, see the GROUP BY documentation.
From the description of your expected output, you should be aggregating by customer alone:
SELECT
customerID AS CUST,
COUNT([InvoiceID]) AS Count_Invoice,
SUM([Income]) AS Total_Income,
SUM([inc2015]) AS Tot_2015_Income,
SUM([inc2016]) AS Tot_2016_Income,
MAX([prodA]) AS prod_A
FROM [table_a]
GROUP BY
customerID;
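The effect of the extra GROUP BY columns can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server (an assumption for runnability), and the two invoice rows for customer 1 are reconstructed from the output in the question. The InvoiceID values 101 and 102 are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE table_a (InvoiceID INTEGER, customerID INTEGER, Income INTEGER, "
    "inc2015 INTEGER, inc2016 INTEGER, prodA INTEGER)"
)
con.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?, ?, ?)",
    [
        (101, 1, 300, 300, 0, 1),  # hypothetical invoice ids
        (102, 1, 300, 0, 300, 1),
    ],
)

# Grouping by the customer alone collapses both invoices into one row.
per_customer = con.execute(
    """
    SELECT customerID, COUNT(InvoiceID), SUM(Income),
           SUM(inc2015), SUM(inc2016), MAX(prodA), SUM(prodA)
    FROM table_a
    GROUP BY customerID
    """
).fetchall()

# Grouping by every column makes each distinct row its own group.
per_detail = con.execute(
    """
    SELECT customerID, COUNT(InvoiceID), SUM(Income)
    FROM table_a
    GROUP BY customerID, InvoiceID, Income, inc2015, inc2016, prodA
    """
).fetchall()
con.close()

print(per_customer)
print(per_detail)
```

The first query returns a single row for customer 1, with MAX(prodA) equal to 1 and SUM(prodA) equal to 2, which suggests the expected prod_A value of 2 is a sum. The second query returns one row per invoice, as in the question.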

Select All combinations of rows for a*b=total <=X

I have a situation about writing a query to find and insert into table B all combinations of rows from table A, where the condition is:
a x b=total from row1
c x d=total from row2 ...etc where count(total)<=X
"a" price of item
"b" quantity of item
Idea is to have all combinations like example
For 100$ dollars i can buy:
2 tshirt, 1 jacket, 1 pants
or
1 tshirt, 2 jacket, 1 pants
...etc
Creating a cursor will help me run the query for each row, but how do I split the number in col.quantity at the same time?
First, here is what I understood:
- We have a table of items, and each item has a price.
- We have an amount of money and want to buy as many items as possible.
- We want the items to have the same weight. The two examples provided ("2 tshirt, 1 jacket, 1 pants" or "1 tshirt, 2 jacket, 1 pants") never show a solution with a single item; they try to use all the items.
So the question is how to determine the Qty for each item so that we use most of the money that we have.
I think this can be described more clearly in a different way. For example: a person goes into a shop and would like to buy each of the items available, but if he has some money left he wants to know what other items he can buy with it. If there are few items and little money, this is easy; if there are many items and a lot of money, I can see that this may be a problem. So let's find a solution.
Declare @Items Table (
    Item varchar(250),Price decimal
)
insert into @Items values
 ('tshirt',30)
,('jacket',30)
,('pants' ,10)
--,('shoe' ,15) ---extra items for testing
--,('socks',5) ---extra items for testing
Declare @total int=100 -- your X
Declare @ItemsCount int
Declare @flag int
Declare @ItemsSum decimal
Declare @AllItmsQty int
select @ItemsCount=count(*),@ItemsSum=sum(price),@flag=POWER(2,count(*)) From @Items
select @AllItmsQty=@total/cast(@ItemsSum as int)
;with Numbers(n) as (
    --generate numbers from 1,2,3,... @flag
    select 1 union all
    select (n+1) n from Numbers where n<@flag
),ItemsWithQty as (
    select *,Price*n [LineTotal] from @Items,Numbers
),Combination as (
    select items.*,Numbers.n-1 [CombinationId] from @Items items,Numbers
),CombinationWithSeq as (
    select *
        ,ROW_NUMBER() over (Partition by [CombinationId] order by [CombinationId]) [seq]
    from Combination
),CombinationWithSeqQty as (
    --each bit of CombinationId decides whether that item gets one extra unit
    select *,case when (CombinationId & power(2,seq-1))>0 then 1 else 0 end +@AllItmsQty [qty]
    from CombinationWithSeq
),CombinationWithSeqQtySubTotal as (
    select *,Price*qty [SubTotal] from CombinationWithSeqQty
)
select
    --CombinationId,
    sum(subtotal) [Total],
    replace(
        replace(
            STRING_AGG(
                case when (Qty=0) then 'NA' else (cast(Qty as varchar(5))+' '+Item)
                end
            ,'+')
        ,'+NA','')
    ,'NA+','') [Items]
from CombinationWithSeqQtySubTotal
group by CombinationId
having sum(subtotal)<=@total
The result would be as follows:
Total Items
===== ===========================
100 2 tshirt+1 jacket+1 pants
100 1 tshirt+2 jacket+1 pants
80 1 tshirt+1 jacket+2 pants
70 1 tshirt+1 jacket+1 pants
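The idea behind the query can be sketched outside SQL as well. The following is a minimal Python sketch, not a translation of every CTE: every item gets the same base quantity (the total divided by the sum of all prices, like the AllItmsQty variable), and each bit of a combination id decides whether one item gets a single extra unit. The function name combinations_within_budget is made up for illustration.

```python
def combinations_within_budget(prices, total):
    """Enumerate 'base quantity plus optional extra unit' combinations
    whose cost does not exceed `total`, most expensive first."""
    items = list(prices)
    base_qty = total // sum(prices.values())  # same quantity for every item
    found = []
    for combo_id in range(2 ** len(items)):   # one id per subset of items
        # bit k of combo_id set -> item k gets one extra unit
        qty = {
            item: base_qty + ((combo_id >> bit) & 1)
            for bit, item in enumerate(items)
        }
        cost = sum(prices[item] * q for item, q in qty.items())
        if cost <= total:
            found.append((cost, qty))
    found.sort(key=lambda pair: pair[0], reverse=True)
    return found

three_items = {"tshirt": 30, "jacket": 30, "pants": 10}
for cost, qty in combinations_within_budget(three_items, 100):
    print(cost, "+".join(f"{q} {item}" for item, q in qty.items()))
```

With the three items and a total of 100 this yields costs of 100, 100, 80 and 70, the same four combinations as the result above. Note that the approach only explores quantities of base or base + 1 per item, so it does not enumerate every possible basket.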
If I add the other two items, we get:
Total Items
===== ===========================
100 1 tshirt+1 jacket+2 pants+1 shoe+1 socks
95 1 tshirt+1 jacket+1 pants+1 shoe+2 socks
90 1 tshirt+1 jacket+1 pants+1 shoe+1 socks
OK, so the query gives the final result, not the table B that you described, which holds a x b (item price multiplied by qty) and the subtotal. We can display that one very easily by filtering on the combination we select. If we select the first one, the one nearest to the amount, we can change the last part of the query to show the table B you need.
),CombinationWithSeqQtySubTotal as (
    select *,Price*qty [SubTotal] from CombinationWithSeqQty
),Results as (
    select
        CombinationId,
        sum(subtotal) [Total],
        replace(
            replace(
                STRING_AGG(
                    case when (Qty=0) then 'NA' else (cast(Qty as varchar(5))+' '+Item)
                    end
                ,'+')
            ,'+NA','')
        ,'NA+','') [Items]
    from CombinationWithSeqQtySubTotal
    group by CombinationId
    having sum(subtotal)<=@total
    --order by [Total] desc
)
select item, price, qty, SubTotal from CombinationWithSeqQtySubTotal t where t.CombinationId in
    (select top(1) CombinationId from Results order by [Total] desc)
The result would be as below:-
item price qty SubTotal
===== ===== === =======
tshirt 30 1 30
jacket 30 1 30
pants 10 2 20
shoe 15 1 15
socks 5 1 5
or if we run it with only the items you provided the result would be as below:-
item price qty SubTotal
====== === === =======
tshirt 30 2 60
jacket 30 1 30
pants 10 1 10
If we don't want to use STRING_AGG, or don't have it, we can get the same behaviour by adding some CTEs that do the same job, since STRING_AGG was only combining the results into (qty + item + separator) strings. The solution below may help.
Declare @Items Table (Item varchar(250),Price decimal)
insert into @Items values
 ('tshirt',30)
,('jacket',30)
,('pants' ,10)
--,('shoes' ,15) ---extra items for testing
--,('socks',5) ---extra items for testing
Declare @total int=100 -- your X
Declare @ItemsCount int
Declare @flag int
Declare @ItemsSum decimal
Declare @AllItmsQty int
select @ItemsCount=count(*),@ItemsSum=sum(price),@flag=POWER(2,count(*)) From @Items
select @AllItmsQty=@total/cast(@ItemsSum as int)
;with Numbers(n) as (
    --generate numbers from 1,2,3,... @flag
    select 1 union all
    select (n+1) n from Numbers where n<@flag
),ItemsWithQty as (
    select *,Price*n [LineTotal] from @Items,Numbers
),Combination as (
    select items.*,Numbers.n-1 [CombinationId] from @Items items,Numbers
),CombinationWithSeq as (
    select *,ROW_NUMBER() over (Partition by [CombinationId] order by [CombinationId]) [seq] from Combination
),CombinationWithSeqQty as (
    select *,case when (CombinationId & power(2,seq-1))>0 then 1 else 0 end +@AllItmsQty [qty] from CombinationWithSeq
),CombinationWithSeqQtySubTotal as (
    select *,Price*qty [SubTotal] from CombinationWithSeqQty
),CombinationWithTotal as (
    --to find only the combinations that are less than or equal to the Total
    select
        CombinationId,
        sum(subtotal) [Total]
    from CombinationWithSeqQtySubTotal
    group by CombinationId
    having sum(subtotal)<=@total
),DetailAnswer as (
    select s.*,t.Total,cast(s.qty as varchar(20))+' ' +s.Item QtyItem from CombinationWithTotal t
    inner join CombinationWithSeqQtySubTotal s on s.CombinationId=t.CombinationId
),DetailAnswerFirst as (
    --recursively append each item's "qty item" text, one seq at a time
    select *,cast(QtyItem as varchar(max)) ItemList from DetailAnswer t where t.seq=1
    union all
    select t.*,cast((t.QtyItem+'+'+x.ItemList) as varchar(max)) ItemList from DetailAnswer t
    inner join DetailAnswerFirst x on x.CombinationId=t.CombinationId and x.seq+1=t.seq
)
select CombinationId,Total,ItemList from DetailAnswerFirst where seq=@ItemsCount order by Total desc
--select * from DetailAnswer --comment out the line above and uncomment this one for the details that you want to go in Table B
If any of the assumptions are wrong, or if you need more explanation, I would be happy to help.
Maybe the easiest way to get the possible combinations is via self-joins and joins to numbers.
If you want combinations of 3, then use 3 self-joins.
And 3 joins to a number table or CTE for each joined "Items" table.
The ON criteria are written so as to minimize the impact of all that joining.
You could also take the SQL from the COMBOS CTE, and use it to first insert it into a temporary table.
For example:
declare @PriceLimit decimal(10,2) = 100;
WITH COMBOS AS
(
    SELECT
        i1.id as id1, i2.id as id2, i3.id as id3,
        n1.n as n1, n2.n as n2, n3.n as n3,
        (n1.n + n2.n + n3.n) AS TotalItems,
        (i1.Price * n1.n + i2.Price * n2.n + i3.Price * n3.n) as TotalCost
    FROM Items i1
    JOIN Items i2 ON i2.id > i1.id AND i2.Price < @PriceLimit
    JOIN Items i3 ON i3.id > i2.id AND i3.Price < @PriceLimit
    JOIN Nums n1
        ON n1.n between 1 and FLOOR(@PriceLimit/i1.Price)
        AND (i1.Price * n1.n) < @PriceLimit
    JOIN Nums n2
        ON n2.n between 1 and FLOOR(@PriceLimit/i2.Price)
        AND (i1.Price * n1.n + i2.Price * n2.n) < @PriceLimit
    JOIN Nums n3
        ON n3.n between 1 and FLOOR(@PriceLimit/i3.Price)
        AND (i1.Price * n1.n + i2.Price * n2.n + i3.Price * n3.n) <= @PriceLimit
        AND (i1.Price * n1.n + i2.Price * n2.n + i3.Price * (n3.n+1)) > @PriceLimit
    WHERE i1.Price < @PriceLimit
)
SELECT
    c.TotalItems, c.TotalCost,
    CONCAT (c.n1,' ',item1.Name,', ',c.n2,' ',item2.Name,', ',c.n3,' ',item3.Name) AS ItemList
FROM COMBOS c
LEFT JOIN Items item1 ON item1.id = c.id1
LEFT JOIN Items item2 ON item2.id = c.id2
LEFT JOIN Items item3 ON item3.id = c.id3
ORDER BY c.TotalCost desc, c.TotalItems desc, c.id1, c.id2, c.id3;
Test result:
TotalItems | TotalCost | ItemList
---------- | --------- | ---------------------------
7 | 100.00 | 1 pants, 1 tshirt, 5 socks
6 | 100.00 | 1 jacket, 1 tshirt, 4 socks
6 | 100.00 | 1 pants, 2 tshirt, 3 socks
5 | 100.00 | 1 jacket, 1 pants, 3 socks
5 | 100.00 | 1 jacket, 2 tshirt, 2 socks
5 | 100.00 | 1 pants, 3 tshirt, 1 socks
5 | 100.00 | 2 pants, 1 tshirt, 2 socks
3 | 90.00 | 1 jacket, 1 pants, 1 tshirt

Find and replace rows with similar value in one column in Oracle SQL

I want to find the rows which are similar to each other, and replace them with a new row. My table looks like this:
OrderID | Price | Minimum Number | Maximum Number | Volume
1 45 2 10 250
2 46 2 10 250
3 60 2 10 250
"Similar" in this context means that the rows that have same Maximum Number, Minimum Number, and Volume. Prices can be different, but the difference can be at most 2.
In this example, orders with OrderID of 1 and 2 are similar, but 3 is not (since even if it has same Minimum Number, Maximum Number, and Volume, its price is not within 2 units from orders 1 and 2).
Then, I want orders 1 and 2 to be replaced by a new order, let's say OrderID 4, which has the same Minimum Number and Maximum Number. Its Volume has to be the sum of the volumes of the orders it is replacing. Its Price can be the Price of any of the replaced orders, which are deleted from the output table (45 or 46 in this example). So, the output for the example above would be:
OrderID | Price | Minimum Number | Maximum Number | Volume
4 45 2 10 500
3 60 2 10 250
Here is a way to do this in SQL Server 2012 or Oracle. The idea is to use lag() to find where groups should begin and end and then aggregate.
select min(id) as id, min(price) as price, MinimumNumber, MaximumNumber, sum(Volume) as Volume
from (select t.*,
             sum(case when prev_price < price - 2 then 1 else 0 end) over
                 (partition by MinimumNumber, MaximumNumber, Volume order by price) as grp
      from (select t.*,
                   lag(price) over (partition by MinimumNumber, MaximumNumber, Volume
                                    order by price
                                   ) as prev_price
            from orders t -- "orders" is a placeholder table name
           ) t
     ) t
-- group by grp and the partition columns, not by price, so that similar prices merge into one row
group by grp, MinimumNumber, MaximumNumber, Volume;
The only issue is the setting of the id. I'm not sure what the exact rule is for that.
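Here is a minimal sketch of the lag-and-running-sum idea applied to the question's three orders. It runs on Python's built-in sqlite3 module rather than Oracle or SQL Server (an assumption for runnability), and the table name orders and the column name OrderID are chosen for illustration. It groups by grp plus the partition columns, since also grouping by price would keep orders 1 and 2 apart.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE orders (OrderID INTEGER, Price INTEGER, MinimumNumber INTEGER, "
    "MaximumNumber INTEGER, Volume INTEGER)"
)
con.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [(1, 45, 2, 10, 250), (2, 46, 2, 10, 250), (3, 60, 2, 10, 250)],
)

merged = con.execute(
    """
    SELECT MIN(OrderID), MIN(Price), MinimumNumber, MaximumNumber, SUM(Volume)
    FROM (
        SELECT t.*,
               -- running count of "gap larger than 2" events = group number
               SUM(CASE WHEN prev_price < Price - 2 THEN 1 ELSE 0 END) OVER (
                   PARTITION BY MinimumNumber, MaximumNumber, Volume
                   ORDER BY Price
               ) AS grp
        FROM (
            SELECT o.*,
                   LAG(Price) OVER (
                       PARTITION BY MinimumNumber, MaximumNumber, Volume
                       ORDER BY Price
                   ) AS prev_price
            FROM orders o
        ) t
    ) g
    GROUP BY grp, MinimumNumber, MaximumNumber, Volume
    ORDER BY MIN(Price)
    """
).fetchall()
con.close()

for row in merged:
    print(row)
```

Orders 1 and 2 collapse into one row with price 45 and volume 500, while order 3 stays on its own. The merged row reuses the smallest OrderID (1) rather than a new id such as 4. Also note that a chain of prices each within 2 of its neighbour ends up in one group, even if its first and last prices differ by more than 2.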
