The highest value from list-distinct - sql-server

Can anyone help me with a query? I have this table:
vendorid, agreementid, sales
12001 1004 700
5291 1004 20576
7596 1004 1908
45 103 345
41 103 9087
What is the goal?
When agreementid > 1, show me the row with the highest sales for each agreementid:
vendorid agreementid sales
5291 1004 20576
41 103 9087
Any ideas?
Thx

Well you could try using a CTE and ROW_NUMBER something like
;WITH Vals AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY AgreementID ORDER BY Sales DESC) RowID
FROM MyTable
WHERE AgreementID > 1
)
SELECT *
FROM Vals
WHERE RowID = 1
This avoids returning multiple records when several rows share the same top sale.
If returning all of the tied records is OK, you could try something like
SELECT *
FROM MyTable mt INNER JOIN
(
SELECT AgreementID, MAX(Sales) MaxSales
FROM MyTable
WHERE AgreementID > 1
GROUP BY AgreementID -- required: MAX(Sales) is computed per agreement
) MaxVals ON mt.AgreementID = MaxVals.AgreementID AND mt.Sales = MaxVals.MaxSales
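The difference between the two approaches only shows up when two vendors tie for the top sale. A minimal sketch, assuming a hypothetical temp table #MyTable in which vendors 41 and 45 tie on agreement 103 (that tie is not in the original data):

```sql
-- Hypothetical sample: vendors 41 and 45 tie on the top sale for agreement 103.
CREATE TABLE #MyTable (vendorid INT, agreementid INT, sales INT);
INSERT INTO #MyTable (vendorid, agreementid, sales) VALUES
    (5291, 1004, 20576),
    (7596, 1004, 1908),
    (45,   103,  9087),
    (41,   103,  9087);

-- ROW_NUMBER() numbers the tied rows 1 and 2 in an arbitrary order.
-- RANK() gives both tied rows the value 1.
WITH Vals AS (
    SELECT vendorid, agreementid, sales,
           ROW_NUMBER() OVER (PARTITION BY agreementid ORDER BY sales DESC) AS RowID,
           RANK()       OVER (PARTITION BY agreementid ORDER BY sales DESC) AS RankID
    FROM #MyTable
    WHERE agreementid > 1
)
SELECT vendorid, agreementid, sales, RowID, RankID
FROM Vals
WHERE RankID = 1;   -- keeps every tied row; use RowID = 1 to keep exactly one row per agreement

DROP TABLE #MyTable;
```

Filtering on RankID = 1 behaves like the MAX(Sales) join, while filtering on RowID = 1 behaves like the ROW_NUMBER query.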

SELECT TOP 1 WITH TIES *
FROM MyTable
ORDER BY DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC)
Explanation
We break table MyTable into partitions by agreementid.
For each partition we construct a ranking of its rows.
If agreementid is greater than 1 ranking will be equal to ORDER BY sales DESC.
Otherwise ranking for every single row in partition will be the same: ORDER BY 0 DESC.
Here is how it looks:
SELECT *
, SIGN (SIGN (agreementid - 2) + 1) * sales AS x
, DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC) AS rnk
FROM MyTable
+----------+-------------+-------+-------+-----+
| vendorid | agreementid | sales | x | rnk |
+----------+-------------+-------+-------+-----+
| 0 | 0 | 3 | 0 | 1 |
| -1 | 0 | 7 | 0 | 1 |
| 0 | 1 | 3 | 0 | 1 |
| -1 | 1 | 7 | 0 | 1 |
| 41 | 103 | 9087 | 9087 | 1 |
| 45 | 103 | 345 | 345 | 2 |
| 5291 | 1004 | 20576 | 20576 | 1 |
| 7596 | 1004 | 1908 | 1908 | 2 |
| 12001 | 1004 | 700 | 700 | 3 |
+----------+-------------+-------+-------+-----+
Then, using the TOP 1 WITH TIES construction, we keep only the rows where rnk equals 1.

You can try it like this.
SELECT TOP 1 sales FROM MyTable WHERE agreementid > 1 ORDER BY sales DESC

I really do not know the business logic behind agreement_id > 1. It looks to me you want the max sales (with ties) by agreement id regardless of vendor_id.
First, let's create a simple sample table.
-- Sample table
create table #sales
(
vendor_id int,
agreement_id int,
sales_amt money
);
-- Sample data
insert into #sales values
(12001, 1004, 700),
(5291, 1004, 20576),
(7596, 1004, 1908),
(45, 103, 345),
(41, 103, 9087);
Second, let's solve this problem using a common table expression to get a result set that has each row paired with the max sales by agreement id.
The select statement just applies the business logic to filter the data to get your answer.
-- CTE = max sales for each agreement id
;
with cte_sales as
(
select
vendor_id,
agreement_id,
sales_amt,
max(sales_amt) OVER(PARTITION BY agreement_id) AS max_sales
from
#sales
)
-- Filter by your business logic
select * from cte_sales where sales_amt = max_sales and agreement_id > 1;
The query returns exactly the result you wanted.

Related

Count number of sale by order amount

I'm using SQL Server 2008 R2 and doing an analysis on a table that contains CustomerID, OrderAmount, RegionID. I need to count the number of orders in different categories according to the OrderAmount in each region. If there are no sales in a category, the query should return 0.
Sample of data:
CustomerID | OrderAmount | RegionID
10001 | 50 | 801
10002 | 25 | 801
10003 | 200 | 802
10001 | 100 | 802
10002 | 20 | 802
...
And my expected result is:
RegionID | CategoryID | Num_of_Sales
801 | 1 | 2 -----Below 100
801 | 2 | 0 -----100-200
802 | 1 | 2 -----Below 100
802 | 2 | 1 -----100-200
...
My question is:
1. How to return 0 on the category that is empty?
2. Is there a better way to write the code? (Not using UNION)
WITH Category1 AS(
SELECT * FROM Sales_Table
WHERE NewAmount <= 100
)
, Category2 AS(
SELECT * FROM Sales_Table
WHERE NewAmount BETWEEN 101 AND 200
)
, [...]
SELECT Region_ID, CategoryID, Num_of_Sales
FROM (
SELECT Region_ID, COUNT(*) AS [Num_of_Sales], 1 AS CategoryID
FROM Category1
GROUP BY Region_ID
UNION
SELECT Region_ID, COUNT(*) AS [Num_of_Sales], 2 AS CategoryID
FROM Category2
GROUP BY Region_ID
UNION
[...]
)z
ORDER BY Region_ID, CategoryID
So, I use this code and get my result, but the count does not return 0 for the 100-200 category in region 801.
A table holding RegionID and CategoryID is needed for what you are trying to achieve. Then we can use that table to do a join as shown below.
With RegCatSales as
(
select RegionID,C,COUNT(*) AS [Num_of_Sales]
from
(
select RegionID,OrderAmount,
CASE
WHEN OrderAmount <= 100 THEN 1
WHEN OrderAmount BETWEEN 101 AND 200 THEN 2
END as C
from Sales_Table x
) xx
group by RegionID, C
),
Regions as
(
select distinct RegionID from RegCatSales
),
Categories as
(
select distinct C from RegCatSales
),
RegCat AS(
select distinct RegionID, C as CategoryID from Regions,Categories
)
select rc.RegionID,rc.CategoryID, ISNULL([Num_of_Sales],0) NUM_Of_Sales from
RegCatSales rcs
right join RegCat rc
on rc.RegionID= rcs.RegionID and rc.CategoryID = rcs.C
order by rc.RegionID, rc.CategoryID
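The same idea can be sketched more compactly by listing the categories explicitly instead of deriving them from the sales rows, so a category with no sales anywhere still appears. This is only a sketch: the temp table #Sales_Table and the MinAmount/MaxAmount bounds are assumptions chosen to mirror the question, and CustomerID is assumed to be non-NULL.

```sql
-- Sample rows copied from the question into a hypothetical temp table.
CREATE TABLE #Sales_Table (CustomerID INT NOT NULL, OrderAmount INT, RegionID INT);
INSERT INTO #Sales_Table (CustomerID, OrderAmount, RegionID) VALUES
    (10001, 50, 801), (10002, 25, 801),
    (10003, 200, 802), (10001, 100, 802), (10002, 20, 802);

WITH Categories AS (
    -- Assumed bounds: category 1 = 0 to 100, category 2 = 101 to 200.
    SELECT CategoryID, MinAmount, MaxAmount
    FROM (VALUES (1, 0, 100), (2, 101, 200)) AS c (CategoryID, MinAmount, MaxAmount)
),
Regions AS (
    SELECT DISTINCT RegionID FROM #Sales_Table
)
SELECT r.RegionID,
       c.CategoryID,
       COUNT(s.CustomerID) AS Num_of_Sales  -- counts matched rows only, so empty pairs give 0
FROM Regions r
CROSS JOIN Categories c                     -- every region paired with every category
LEFT JOIN #Sales_Table s
       ON s.RegionID = r.RegionID
      AND s.OrderAmount BETWEEN c.MinAmount AND c.MaxAmount
GROUP BY r.RegionID, c.CategoryID
ORDER BY r.RegionID, c.CategoryID;

DROP TABLE #Sales_Table;
```

Counting a column from the left-joined table rather than COUNT(*) is what turns an unmatched region/category pair into 0 instead of 1.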

Rank by top customers within each separate month

I am having trouble ranking top customers by month. I created a new Rank column - but how do I break it up by month? Any help plz. Code and tables below:
The logic for ranking is selecting the top two customers per month from the tables. Also wrapped into the code (attempted at least) is renaming the date field and setting it to reflect end of month date only.
SELECT * FROM table1;
UPDATE table1
SET DATE=EOMONTH(DATE) AS MO_END;
ALTER TABLE table1
ADD COLUMN RANK INT AFTER SALES;
UPDATE table1
SET RANK=
RANK() OVER(PARTITION BY cust ORDER BY sales DESC);
LIMIT 2
Starting with
+------+----------+-------+--+
| CUST | DATE | SALES | |
+------+----------+-------+--+
| 36 | 3-5-2018 | 50 | |
| 37 | 3-15-18 | 100 | |
| 38 | 3-25-18 | 65 | |
| 37 | 4-5-18 | 95 | |
| 39 | 4-21-18 | 500 | |
| 40 | 4-45-18 | 199 | |
+------+----------+-------+--+
desired end result
+------+---------+-------+------+--+
| CUST | MO_END | SALES | RANK | |
+------+---------+-------+------+--+
| 37 | 3-31-18 | 100 | 1 | |
| 38 | 3-25-18 | 65 | 2 | |
| 39 | 4-30-18 | 500 | 1 | |
| 40 | 4-45-18 | 199 | 2 | |
+------+---------+-------+------+--+
As a simple selection:
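select *
from (
select
table1.*
-- partition by month only, so customers are ranked against each other within a month
, DENSE_RANK() OVER(PARTITION BY EOMONTH(DATE) ORDER BY sales DESC) as ranking
from table1
) ranked -- SQL Server requires an alias on a derived table
where ranking < 3
;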
If storing is important: I would not use [rank] as a column name as I avoid any words that are used in SQL, maybe [sales_rank] or similar.
with cte as (
select
cust
, sales_rank -- the stored column must be exposed by the CTE to be updatable
, DENSE_RANK() OVER(PARTITION BY EOMONTH(DATE) ORDER BY sales DESC) as ranking
from table1
)
update cte
set sales_rank = ranking
where ranking < 3
;
There is really no reason to store the end of month, just use that function within the partition of the over() clause.
LIMIT 2 is not something that can be used in SQL Server by the way, and it sure can't be used "per grouping". When you use a "window function" such as rank() or dense_rank() you can use the output of those in the where clause of the next "layer". i.e. use those functions in a subquery (or cte) and then use a where clause to filter rows by the calculated values.
Also note I used dense_rank() to guarantee that no rank numbers are skipped, so that the subsequent where clause will be effective.

Window function to count occurrences in last 10 minutes

I can use a traditional subquery approach to count the occurrences in the last ten minutes. For example, this:
drop table if exists [dbo].[readings]
go
create table [dbo].[readings](
[server] [int] NOT NULL,
[sampled] [datetime] NOT NULL
)
go
insert into readings
values
(1,'20170101 08:00'),
(1,'20170101 08:02'),
(1,'20170101 08:05'),
(1,'20170101 08:30'),
(1,'20170101 08:31'),
(1,'20170101 08:37'),
(1,'20170101 08:40'),
(1,'20170101 08:41'),
(1,'20170101 09:07'),
(1,'20170101 09:08'),
(1,'20170101 09:09'),
(1,'20170101 09:11')
go
-- Count in the last 10 minutes - example periods 08:31 to 08:40, 09:12 to 09:21
select server,sampled,(select count(*) from readings r2 where r2.server=r1.server and r2.sampled <= r1.sampled and r2.sampled > dateadd(minute,-10,r1.sampled)) as countinlast10minutes
from readings r1
order by server,sampled
go
How can I use a window function to obtain the same result? I've tried this:
select server,sampled,
count(case when sampled <= r1.sampled and sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
-- count(case when currentrow.sampled <= r1.sampled and currentrow.sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
from readings r1
order by server,sampled
But the result is just the running count. Is there any system variable that refers to the current row, something like currentrow.sampled?
This isn't a very pleasing answer but one possibility is to first create a helper table with all the minutes
CREATE TABLE #DateTimes(datetime datetime primary key);
WITH E1(N) AS
(
SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1)) V(N)
) -- 1*10^1 or 10 rows
, E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows
, E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows
, E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
,R(StartRange, EndRange)
AS (SELECT MIN(sampled),
MAX(sampled)
FROM readings)
,N(N)
AS (SELECT ROW_NUMBER()
OVER (
ORDER BY (SELECT NULL)) AS N
FROM E8)
INSERT INTO #DateTimes
SELECT TOP (SELECT 1 + DATEDIFF(MINUTE, StartRange, EndRange) FROM R) DATEADD(MINUTE, N.N - 1, StartRange)
FROM N,
R;
And then with that in place you could use ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
WITH T1 AS
( SELECT Server,
MIN(sampled) AS StartRange,
MAX(sampled) AS EndRange
FROM readings
GROUP BY Server )
SELECT Server,
sampled,
Cnt
FROM T1
CROSS APPLY
( SELECT r.sampled,
COUNT(r.sampled) OVER (ORDER BY N.datetime ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS Cnt
FROM #DateTimes N
LEFT JOIN readings r
ON r.sampled = N.datetime
AND r.server = T1.server
WHERE N.datetime BETWEEN StartRange AND EndRange ) CA
WHERE CA.sampled IS NOT NULL
ORDER BY sampled
The above assumes that there is at most one sample per minute and that all the times are exact minutes. If this isn't true it would need another table expression pre-aggregating by datetimes rounded to the minute.
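A minimal sketch of what that pre-aggregation could look like, assuming the readings table from the question. The column aliases sampled_minute and readings_in_minute are made up for illustration:

```sql
-- Round each sample down to the whole minute and count the samples per server and minute.
SELECT server,
       DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0) AS sampled_minute,
       COUNT(*) AS readings_in_minute
FROM dbo.readings
GROUP BY server,
         DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0);
```

With this in place, the windowed expression would sum readings_in_minute over the ten-row frame rather than count rows, and the result would be reported per minute rather than per individual sample.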
As far as I know, there is not a simple exact replacement for your subquery using window functions.
Window functions operate on a set of rows and allow you to work with them based on partitions and order.
What you are trying to do isn't the type of partitioning that we can work with in window functions.
Generating the partitions we would need in order to use window functions in this instance would just result in overly complicated code.
I would suggest cross apply() as an alternative to your subquery.
I am not sure if you meant to restrict your results to within 9 minutes, but with sampled > dateadd(...) that is what is happening in your original subquery.
Here is what a window function could look like based on partitioning your samples into 10 minute windows, along with a cross apply() version.
select
r.server
, r.sampled
, CrossApply = x.CountRecent
, OriginalSubquery = (
select count(*)
from readings s
where s.server=r.server
and s.sampled <= r.sampled
/* doesn't include 10 minutes ago */
and s.sampled > dateadd(minute,-10,r.sampled)
)
, Slices = count(*) over(
/* partition by server, 10 minute slices, not the same thing*/
partition by server, dateadd(minute,datediff(minute,0,sampled)/10*10,0)
order by sampled
)
from readings r
cross apply (
select CountRecent=count(*)
from readings i
where i.server=r.server
/* changed to >= */
and i.sampled >= dateadd(minute,-10,r.sampled)
and i.sampled <= r.sampled
) as x
order by server,sampled
results: http://rextester.com/BMMF46402
+--------+---------------------+------------+------------------+--------+
| server | sampled | CrossApply | OriginalSubquery | Slices |
+--------+---------------------+------------+------------------+--------+
| 1 | 01.01.2017 08:00:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:02:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:05:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:30:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:31:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:37:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:40:00 | 4 | 3 | 1 |
| 1 | 01.01.2017 08:41:00 | 4 | 3 | 2 |
| 1 | 01.01.2017 09:07:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 09:08:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 09:09:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 09:11:00 | 4 | 4 | 1 |
+--------+---------------------+------------+------------------+--------+
Thanks, Martin and SqlZim, for your answers. I'm going to raise a Connect enhancement request for something like %%currentrow that can be used in window aggregates. I think this would lead to much simpler and more natural SQL:
select count(case when sampled <= %%currentrow.sampled and sampled > dateadd(minute,-10,%%currentrow.sampled) then 1 else null end) over (...whatever the window is...)
We can already use expressions like this:
select count(case when sampled <= getdate() and sampled > dateadd(minute,-10,getdate()) then 1 else null end) over (...whatever the window is...)
so I think it would be great if we could reference a column from the current row.

Where clause if there are multiple of the same ID

I have following table:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SAP | Mike | 21 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I want to have one record for each ID. If the ID appears only once (no matter the source), that record is taken. But if there are 2 records for one ID, the one with SQL as its source is the record to use.
So, In this case, the result will be:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I did this with a partition over (ordered by Source desc), but that wouldn't work well if a third source is added one day.
Any other options/ideas?
The easiest approach (in my opinion) is using a CTE with a ranking function:
with cte as
(
select ID, source, Name, Age, ... ,
rn = row_number() over (partition by ID order by case when source = 'sql'
then 0 else 1 end asc)
from dbo.tablename
)
select ID, source, Name, Age, ...
from cte
where rn = 1
You can use ROW_NUMBER:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER( PARTITION BY ID
ORDER BY CASE WHEN [Source] = 'SQL' THEN 1 ELSE 2 END)
FROM dbo.YourTable
)
SELECT *
FROM CTE
WHERE RN = 1;
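The CASE expression in the ORDER BY is what keeps this approach workable when more sources appear: each source just gets its own priority number. A minimal sketch, assuming a hypothetical temp table #people holding the question's sample rows and an assumed priority order of SQL first, then SAP, then anything else:

```sql
-- Hypothetical temp table with the sample rows from the question.
CREATE TABLE #people (ID INT, source VARCHAR(10), Name VARCHAR(50), Age INT);
INSERT INTO #people (ID, source, Name, Age) VALUES
    (1, 'SQL', 'John', 18),
    (2, 'SAP', 'Mike', 21),
    (2, 'SQL', 'Mike', 20),
    (3, 'SAP', 'Jill', 25);

WITH cte AS (
    SELECT ID, source, Name, Age,
           ROW_NUMBER() OVER (
               PARTITION BY ID
               -- Assumed priorities: lower number wins; unknown sources sort last.
               ORDER BY CASE source WHEN 'SQL' THEN 1
                                    WHEN 'SAP' THEN 2
                                    ELSE 3 END
           ) AS rn
    FROM #people
)
SELECT ID, source, Name, Age
FROM cte
WHERE rn = 1
ORDER BY ID;

DROP TABLE #people;
```

If the list of sources keeps growing, the priorities could instead live in a small lookup table that is joined in, which avoids editing the query each time.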
You can use the WITH TIES clause and the window function Row_Number()
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By ID Order By Case When Source = 'SQL' Then 0 Else 1 End)
How about
SELECT *
FROM test
WHERE ID in (
SELECT ID FROM test
group by ID
having count(ID) = 1)
OR source = 'SQL'

SQL Query combine multiple results or Join tables

I have the following tables:
Disposition Table
Dis_ID | OfferID | RequestID
------------------------------------
34564 | 123 | 9
77456 | 123 | 8
25252 | 124 | 7
46464 | 125 | 10
36464 | 125 | 6
35353 | 125 | 5
Request Table
RequestID | AccountNum |
---------------------------
5 | 548543 |
6 | 548543 |
7 | 684567 |
8 | 684567 |
9 | 684567 |
10 | 548543 |
11 | 684567 |
Rank Table
RankID | OfferId | RequestID | Score
-------------------------------------------
34564 | 123 | 11 | 1
77456 | 124 | 11 | 2
25252 | 125 | 11 | 3
Using the data above, I need a query that behaves as follows. Given a request number, look at every record in the Rank table for that request; in this example there are three (OfferIds 123, 124 and 125). Return the OfferId that appears the fewest times in the Disposition table for the account number joined to that request. In this example, OfferId 123 appears twice for this account number, OfferId 124 appears once, and OfferId 125 does not appear at all, so OfferId 125 should be returned. The OfferId from the Rank table with the fewest appearances in the Disposition table should always be returned, unless they all appear the same number of times; in that case, return the OfferId with the lowest value in the Score field. For example, if none of the OfferIds appeared in the Disposition table, OfferId 123 would be returned, since its Score value is 1.
Resulting table would look something like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 0 <--Return this record
This is what I have so far.
SELECT oRank.OfferId, oRank.Rank_Number, count(oRank.OfferId) AS NumDispositions
From Rank oRank
join Request req
on oRank.RequestId = req.RequestId
join Disposition dis
on oRank.OfferId = dis.OfferId
where req.Customer_Account_Number = 684567 and req.RequestId = 11 and oRank.OfferId = dis.OfferId
group by oRank.Rank_Number, oRank.OfferId
order by NumDispositions, oRank.Rank_Number
My incorrect Resulting table looks like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 3
It is counting the total number of times the OfferId appears in the Disposition table, regardless of account number.
EDIT - based on author's comments, here's another version:
Example in SQLFiddle: http://sqlfiddle.com/#!6/d3f99/1/0
with RankReqMap as (
select rnk.OfferId, rnk.Score, reqAcct.AccountNum, reqReq.RequestID
from [Rank] rnk
left join Request reqAcct on reqAcct.RequestID = rnk.RequestID
left join Request reqReq on reqReq.AccountNum = reqAcct.AccountNum
where rnk.RequestID = 11 -- Put your RequestId filter here
)
select oRank.OfferId
,oRank.Score
,count(dis.RequestID) as NumDispositions
from RankReqMap oRank
left join Disposition dis on dis.OfferID = oRank.OfferId
and dis.RequestID = oRank.RequestID
group by oRank.OfferId , oRank.Score
order by NumDispositions, oRank.Score;
ORIGINAL POST
Example in SQLFiddle: http://sqlfiddle.com/#!6/770a8/1/0
This query makes the assumption that you're joining Disposition to Rank based on OfferID, since the RequestIDs for those tables in your example data don't match up. You may have to tweak depending on your needs, but something like the query below should get you the record you're looking for:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Rank count of Dis_Occurrences, taking lowest score into account as a tie breaker
, DispRanking as (
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
)
-- Return only the value with the highest ranking
select * from DispRanking where Dis_Rank = 1
Note also that if you convert the second CTE into a naked SELECT and remove the SELECT statement at the end, you can see all of the records and how they get ranked by the row_number() function:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Output all values, with rankings
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
Good luck!
I think you can use a window function for this:
;with disp as(select offerid, count(*) as ocount
from Disposition group by offerid),
rnk as(select r.offerid,
row_number() over(partition by r.requestid
order by isnull(d.ocount, 0), r.score) rn
from [Rank] r
left join disp d on r.offerid = d.offerid)
select * from rnk where rn = 1

Resources