SQL Query combine multiple results or Join tables - sql-server

I have the Following tables
Disposition Table
Dis_ID | OfferID | RequestID
------------------------------------
34564 | 123 | 9
77456 | 123 | 8
25252 | 124 | 7
46464 | 125 | 10
36464 | 125 | 6
35353 | 125 | 5
Request Table
RequestID | AccountNum |
---------------------------
5 | 548543 |
6 | 548543 |
7 | 684567 |
8 | 684567 |
9 | 684567 |
10 | 548543 |
11 | 684567 |
Rank Table
RankID | OfferId | RequestID | Score
-------------------------------------------
34564 | 123 | 11 | 1
77456 | 124 | 11 | 2
25252 | 125 | 11 | 3
Using the data above I need a query which would behave as follows given a request number look at every record in the Rank Table in this example we have 3 (123, 124, & 125). return the OfferId that appears the fewest times in the Disposition table for this joined account number. in this example offerId 123 appears twice for this account number, offerId 124 appears once and offerId 125 doesn't appears at all for this account number. So offerId 125 should be returned. The offerId which exist in the Rank Table with the fewest appearances in the Disposition table should always be returned unless they are all the same then return the offerId with the lowest value in the Score field. for example if none of the offerIDs appeared in the Dispostion table offerId 123 would return since its Score value is 1.
Resulting table would look something like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 0 <--Return this record
This is what I have so far.
SELECT oRank.OfferId, oRank.Rank_Number, count(oRank.OfferId) AS NumDispositions
From Rank oRank
join Request req
on oRank.RequestId = req.RequestId
join Disposition dis
on oRank.OfferId = dis.OfferId
where req.Customer_Account_Number = 684567 and req.RequestId = 11 and oRank.OfferId = dis.OfferId
group by oRank.Rank_Number, oRank.OfferId
order by NumDispositions, oRank.Rank_Number
My incorrect Resulting table looks like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 3
It is counting the total number of times the offerId appears in the Disposition Table

EDIT - based on author's comments, here's another version:
Example in SQLFiddle: http://sqlfiddle.com/#!6/d3f99/1/0
with RankReqMap as (
select rnk.OfferId, rnk.Score, reqAcct.AccountNum, reqReq.RequestID
from [Rank] rnk
left join Request reqAcct on reqAcct.RequestID = rnk.RequestID
left join Request reqReq on reqReq.AccountNum = reqAcct.AccountNum
where rnk.RequestID = 11 -- Put your RequestId filter here
)
select oRank.OfferId
,oRank.Score
,count(dis.RequestID) as NumDispositions
from RankReqMap oRank
left join Disposition dis on dis.OfferID = oRank.OfferId
and dis.RequestID = oRank.RequestID
group by oRank.OfferId , oRank.Score
order by NumDispositions, oRank.Score;
ORIGINAL POST
Example in SQLFiddle: http://sqlfiddle.com/#!6/770a8/1/0
This query makes the assumption that you're joining Disposition to Rank based on OfferID, since the RequestIDs for those tables in your example data don't match up. You may have to tweak depending on your needs, but something like the query below should get you the record you're looking for:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Rank count of Dis_Occurrences, taking lowest score into account as a tie breaker
, DispRanking as (
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
)
-- Return only the value with the highest ranking
select * from DispRanking where Dis_Rank = 1
Note also that if you convert the second CTE into a naked SELECT and remove the SELECT statement at the end, you can see all of the records and how they get ranked by the row_number() function:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Output all values, with rankings
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
Good luck!

I think you can use window function for this:
;with disp as(select offerid, count(*) as ocount
from dispositions group by offerid),
rnk as(select r.offerid,
row_number() over(partition by r.requestid
order by isnull(d.ocount, 0), r.score) rn
from ranks r
left join disp d on r.offerid = d.offerid)
select * from rnk where rn = 1

Related

How to get value conditionally from another row in sub table

Select * from LoanAccount main INNER JOIN LoanSubAccount sub
WHERE main.LoanAccountID = sub.LoanAccountID
AND sub.LoanStatus = 4
My objective is to retrieve rows with LoanStatus = 4 but replace the amount with records with LoanStatus = 2.
End result expected to be
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY LoanAccountID, LoanStatus
ORDER BY LoanSubAccountID) rn
FROM LoanSubAccount
)
SELECT t1.LoanSubAccountID,
t1.LoanAccountID,
t1.LoanStatus,
t1.CommodityType,
t2.Amount
FROM cte t1
INNER JOIN cte t2
ON t1.rn = t2.rn AND
t1.LoanStatus > t2.LoanStatus
Rather than giving a verbose explanation, I would rather show a table representing what the above CTE would look like:
rn | LoanSubAccountID | LoanAccountID | LoanStatus | CommodityType | Amount
1 | 1 | 1 | 2 | 1 | 100
2 | 2 | 1 | 2 | 2 | 200
1 | 3 | 1 | 4 | 3 | 150
2 | 4 | 1 | 4 | 4 | 150
If I read your requirement correctly, you want to connect rows having the same row number from the two different loan statuses. The join query I gave above does this.

Where clause if there are multiple of the same ID

I have following table:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SAP | Mike | 21 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I want to have one record for each ID. The idea behind this is that if the ID comes only once (no matter the Source), that record will be taken. But, If there are 2 records for one ID, the one containing SQL as source will be the used record here.
So, In this case, the result will be:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I did this with a partition over (ordered by Source desc), but that wouldn't work well if a third source will be added one day.
Any other options/ideas?
The easiest approach(in my opinion) is using a CTE with a ranking function:
with cte as
(
select ID, source, Name, Age, ... ,
rn = row_number() over (partition by ID order by case when source = 'sql'
then 0 else 1 end asc)
from dbo.tablename
)
select ID, source, Name, Age, ...
from cte
where rn = 1
You can use ROW_NUMBER:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER( PARTITION BY ID
ORDER BY CASE WHEN [Source] = 'SQL' THEN 1 ELSE 2 END)
FROM dbo.YourTable
)
SELECT *
FROM CTE
WHERE RN = 1;
You can use the WITH TIES clause and the window function Row_Number()
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By ID Order By Case When Source = 'SQL' Then 0 Else 1 End)
How about
SELECT *
FROM table
WHERE ID in (
SELECT ID FROM test
group by ID
having count(ID) = 1)
OR source = 'SQL'

Removing lines from a query

I have the following table:
OrderID | OldOrderID | Action | EntryDate | Source
1 | NULL | Insert | 2016-01-12| A
1 | NULL | Remove | 2016-01-13| A
2 | NULL | Insert | 2016-01-12| B
3 | NULL | Insert | 2016-01-12| C
4 | 3 | Insert | 2016-01-13| C
4 | NULL | Remove | 2016-01-14| C
I want to query all orders that are currently active orders - they dont have the action remove. Currently I do it with this query :
WITH Active AS
(
SELECT *, rn = ROW_NUMBER()
OVER (PARTITION BY OrderID,Source ORDER BY EntryDate DESC)
FROM Orders
)
SELECT *
FROM Active WHERE [Action] <> 'Remove' AND rn = 1;
The problem is that some orders get child orders (OrderID 3 gets a child OrderID 4) and if a child ever gets the Action Remove the query should also ignore the parent, but with the current query it dosent.
In short the current query gets me this result:
OrderID | OldOrderID | Action | EntryDate | Source
2 | NULL | Insert | 2016-01-12| B
3 | NULL | Insert | 2016-01-12| C
But I need this result:
OrderID | OldOrderID | Action | EntryDate | Source
2 | NULL | Insert | 2016-01-12| B
Is it possible to fix the query to get a result like this?
Try this:
;WITH CTE AS (
SELECT OrderID, OldOrderID, Action, EntryDate, Source,
COUNT(CASE WHEN Action = 'Remove' THEN 1 END)
OVER (PARTITION BY OrderID) AS IsRemoved,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY EntryDate) AS rn
FROM Orders
)
SELECT c1.*
FROM CTE AS c1
LEFT JOIN CTE AS c2 ON c1.OrderID = c2.OldOrderID AND c2.IsRemoved >= 1
WHERE c1.rn = 1 AND c1.IsRemoved = 0 AND c2.IsRemoved IS NULL
The above query uses COUNT() OVER() in order to count the number of occurrences of Action = 'Remove' within each OrderID partition. Hence, a value of IsRemoved that is equal to or greater than 1 identifies a 'removed' order.
I also asked the question on dba stackexchange and got the following answer, which works well.

The highest value from list-distinct

Can anyone help me with query, I have table
vendorid, agreementid, sales
12001 1004 700
5291 1004 20576
7596 1004 1908
45 103 345
41 103 9087
what is the goal ?
when agreemtneid >1 then show me data when sales is the highest
vendorid agreementid sales
5291 1004 20576
41 103 9087
Any ideas ?
Thx
Well you could try using a CTE and ROW_NUMBER something like
;WITH Vals AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY AgreementID ORDER BY Sales DESC) RowID
FROM MyTable
WHERE AgreementID > 1
)
SELECT *
FROM Vals
WHERE RowID = 1
This will avoid you returning multiple records with the same sale.
If that was OK you could try something like
SELECT *
FROM MyTable mt INNER JOIN
(
SELECT AgreementID, MAX(Sales) MaxSales
FROM MyTable
WHERE AgreementID > 1
) MaxVals ON mt.AgreementID = MaxVals.AgreementID AND mt.Sales = MaxVals.MaxSales
SELECT TOP 1 WITH TIES *
FROM MyTable
ORDER BY DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC)
Explanation
We break table MyTable into partitions by agreementid.
For each partition we construct a ranking or its rows.
If agreementid is greater than 1 ranking will be equal to ORDER BY sales DESC.
Otherwise ranking for every single row in partition will be the same: ORDER BY 0 DESC.
See how it looks like:
SELECT *
, SIGN (SIGN (agreementid - 2) + 1) * sales AS x
, DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC) AS rnk
FROM MyTable
+----------+-------------+-------+-------+-----+
| vendorid | agreementid | sales | x | rnk |
+----------|-------------|-------+-------+-----+
| 0 | 0 | 3 | 0 | 1 |
| -1 | 0 | 7 | 0 | 1 |
| 0 | 1 | 3 | 0 | 1 |
| -1 | 1 | 7 | 0 | 1 |
| 41 | 103 | 9087 | 9087 | 1 |
| 45 | 103 | 345 | 345 | 2 |
| 5291 | 1004 | 20576 | 20576 | 1 |
| 7596 | 1004 | 1908 | 1908 | 2 |
| 12001 | 1004 | 700 | 700 | 3 |
+----------+-------------+-------+-------+-----+
Then using TOP 1 WITH TIES construction we leave only rows where rnk equals 1.
you can try like this.
SELECT TOP 1 sales FROM MyTable WHERE agreemtneid > 1 ORDER BY sales DESC
I really do not know the business logic behind agreement_id > 1. It looks to me you want the max sales (with ties) by agreement id regardless of vendor_id.
First, lets create a simple sample database.
-- Sample table
create table #sales
(
vendor_id int,
agreement_id int,
sales_amt money
);
-- Sample data
insert into #sales values
(12001, 1004, 700),
(5291, 1004, 20576),
(7596, 1004, 1908),
(45, 103, 345),
(41, 103, 9087);
Second, let's solve this problem using a common table expression to get a result set that has each row paired with the max sales by agreement id.
The select statement just applies the business logic to filter the data to get your answer.
-- CTE = max sales for each agreement id
;
with cte_sales as
(
select
vendor_id,
agreement_id,
sales_amt,
max(sales_amt) OVER(PARTITION BY agreement_id) AS max_sales
from
#sales
)
-- Filter by your business logic
select * from cte_sales where sales_amt = max_sales and agreement_id > 1;
The screen shot below shows the exact result you wanted.

How do I utilize Row_Number() (partitioning) for my datapool correctly

we have following table (output is already ordered and separated for understanding):
| PK | FK1 | FK2 | ActionCode | CreationTS | SomeAttributeValue |
+----+-----+-----+--------------+---------------------+--------------------+
| 6 | 100 | 500 | Create | 2011-01-02 00:00:00 | H |
----------------------------------------------------------------------------
| 3 | 100 | 500 | Change | 2011-01-01 02:00:00 | Z |
| 2 | 100 | 500 | Change | 2011-01-01 01:00:00 | X |
| 1 | 100 | 500 | Create | 2011-01-01 00:00:00 | Y |
----------------------------------------------------------------------------
| 4 | 100 | 510 | Create | 2011-01-01 00:30:00 | T |
----------------------------------------------------------------------------
| 5 | 100 | 520 | CreateSystem | 2011-01-01 00:30:00 | A |
----------------------------------------------------------------------------
what is ActionCode? we use this in c# and there it represents an enum-value
what do i want to achieve?
well, i need the following output:
| FK1 | FK2 | ActionCode | SomeAttributeValue |
+-----+-----+--------------+--------------------+
| 100 | 500 | Create | H |
| 100 | 500 | Create | Z |
| 100 | 510 | Create | T |
| 100 | 520 | CreateSystem | A |
-------------------------------------------------
well, what is the actual logic?
we have some logical groups for composite-key (FK1 + FK2). each of these groups can be broken into partitions, which begin with Create or CreateSystem. each partition ends with Create, CreateSystem or Change. The actual value of SomeAttributeValue for each partition should be the value from the last line of the partition.
it is not possible to have following datapool:
| PK | FK1 | FK2 | ActionCode | CreationTS | SomeAttributeValue |
+----+-----+-----+--------------+---------------------+--------------------+
| 7 | 100 | 500 | Change | 2011-01-02 02:00:00 | Z |
| 6 | 100 | 500 | Create | 2011-01-02 00:00:00 | H |
| 2 | 100 | 500 | Change | 2011-01-01 01:00:00 | X |
| 1 | 100 | 500 | Create | 2011-01-01 00:00:00 | Y |
----------------------------------------------------------------------------
and then expect PK 7 to affect PK 2 or PK 6 to affect PK 1.
i don't even know how/where to start ... how can i achieve this?
we are running on mssql 2005+
EDIT:
there's a dump available:
instanceId: my PK
tenantId: FK 1
campaignId: FK 2
callId: FK 3
refillCounter: FK 4
ticketType: ActionCode (1 & 4 & 6 are Create, 5 is Change, 3 must be ignored)
ticketType, profileId, contactPersonId, ownerId, handlingStartTime, handlingEndTime, memo, callWasPreselected, creatorId, creationTS, changerId, changeTS should be taken from the Create (first line in partition in groups)
callingState, reasonId, followUpDate, callingAttempts and callingAttemptsConsecutivelyNotReached should be taken from the last Create (which then would be a "one-line-partition-in-group" / the same as the upper one) or Change (last line in partition in groups)
I'm assuming that each partition can only contain a single Create or CreateSystem, otherwise your requirements are ill-defined. The following is untested, since I don't have a sample table, nor sample data in an easily consumed format:
;With Partitions as (
Select
t1.FK1,
t1.FK2,
t1.CreationTS as StartTS,
t2.CreationTS as EndTS
From
Table t1
left join
Table t2
on
t1.FK1 = t2.FK1 and
t1.FK2 = t2.FK2 and
t1.CreationTS < t2.CreationTS and
t2.ActionCode in ('Create','CreateSystem')
left join
Table t3
on
t1.FK1 = t3.FK1 and
t1.FK2 = t3.FK2 and
t1.CreationTS < t3.CreationTS and
t3.CreationTS < t2.CreationTS and
t3.ActionCode in ('Create','CreateSystem')
where
t1.ActionCode in ('Create','CreateSystem') and
t3.FK1 is null
), PartitionRows as (
SELECT
t1.FK1,
t1.FK2,
t1.ActionCode,
t2.SomeAttributeValue,
ROW_NUMBER() OVER (PARTITION_FRAGMENT_ID BY t1.FK1,T1.FK2,t1.StartTS ORDER BY t2.CreationTS desc) as rn
from
Partitions t1
inner join
Table t2
on
t1.FK1 = t2.FK1 and
t1.FK2 = t2.FK2 and
t1.StartTS <= t2.CreationTS and
(t2.CreationTS < t1.EndTS or t1.EndTS is null)
)
select * from PartitionRows where rn = 1
(Please note than I'm using all kinds of reserved names here)
The basic logic is: The Partitions CTE is used to define each partition in terms of the FK1, FK2, an inclusive start timestamp, and exclusive end timestamp. It does this by a triple join to the base table. the rows from t2 are selected to occur after the rows from t1, then the rows from t3 are selected to occur between the matching rows from t1 and t2. Then, in the WHERE clause, we exclude any rows from the result set where a match occurred from t3 - the result being that the row from t1 and the row from t2 represent the start of two adjacent partitions.
The second CTE then retrieves all rows from Table for each partition, but assigning a ROW_NUMBER() score within each partition, based on the CreationTS, sorted descending, with the result that ROW_NUMBER() 1 within each partition is the last row to occur.
Finally, within the select, we choose those rows that occur last within their respective partitions.
This does all assume that CreationTS values are distinct within each partition. I may be able to re-work it using PK also, if that assumption doesn't hold up.
It is solvable with a recursive CTE. Here (assuming rows within partitions are ordered by CreationTS):
WITH partitioned AS (
SELECT
*,
rn = ROW_NUMBER() OVER (PARTITION BY FK1, FK2 ORDER BY CreationTS)
FROM data
),
subgroups AS (
SELECT
PK, FK1, FK2, ActionCode, CreationTS, SomeAttributeValue, rn,
Subgroup = 1,
Subrank = 1
FROM partitioned
WHERE rn = 1
UNION ALL
SELECT
p.PK, p.FK1, p.FK2, p.ActionCode, p.CreationTS, p.SomeAttributeValue, p.rn,
Subgroup = s.Subgroup + CASE p.ActionCode WHEN 'Change' THEN 0 ELSE 1 END,
Subrank = CASE p.ActionCode WHEN 'Change' THEN s.Subrank ELSE 0 END + 1
FROM partitioned p
INNER JOIN subgroups s ON p.FK1 = s.FK1 AND p.FK2 = s.FK2
AND p.rn = s.rn + 1
),
finalranks AS (
SELECT
PK, FK1, FK2, ActionCode, CreationTS, SomeAttributeValue, rn,
Subgroup, Subrank,
rank = ROW_NUMBER() OVER (PARTITION BY FK1, FK2, Subgroup ORDER BY Subrank DESC)
/* or: rank = MAX(Subrank) OVER (PARTITION BY FK1, FK2, Subgroup) - Subrank + 1 */
FROM subgroups
)
SELECT PK, FK1, FK2, ActionCode, CreationTS, SomeAttributeValue
FROM finalranks
WHERE rank = 1

Resources