SQL recursion with full hierarchy - sql-server

This is what my query looks like right now:
with allmembers (objectid, parentid, name, parentname, recursion) as
(
-- anchor elements: where parentid = 25
select objectid, parentid, name, name as parentname, 0 as recursion
from orgs as orgs1
where parentid = 25
-- recursion begins here
union all
select orgs2.objectid, orgs2.parentid, orgs2.name, orgs3.name as parentname, recursion + 1
from orgs as orgs2
join allmembers as orgs3 on orgs2.parentid = orgs3.objectid
)
-- we select all the results
select *
from allmembers
It selects the orgs (organizations) from a list, where the parentid is 25 (these are the "root organizations") and joins them with all their child organizations, recursively, until there are none left. So we get a list of organizations and their parents.
My problem is that I get only the direct child/parent relationships:
name | parentname
Sales | All_Employees
Direct Sales | Sales
What is lost in this process is that "Direct Sales" is also a member of "All_Employees", indirectly, through "Sales".
So I would rather have the following result added:
name | parentname
Sales | All_Employees
Direct Sales | Sales
Direct Sales | All_Employees
How to achieve this?

Without going too far and getting functions involved, would using a materialized path suffice for your needs?
create table orgs (objectid int, name varchar(128), parentid int);
insert into orgs values
(26,'All Employees', 25)
,(27,'Sales', 26)
,(28,'Direct Sales',27);
with allmembers as (
-- anchor elements: where parentid = 25
select
objectid
, parentid
, name
, parentname = convert(varchar(128),'')
, rootname = name
, recursion = convert(int,0)
, name_path = convert(varchar(256),name)
from orgs
where parentid = 25
-- recursion begins here
union all
select
c.objectid
, c.parentid
, c.name
, parentname = p.name
, rootname = p.rootname
, recursion = p.recursion + 1
, name_path = convert(varchar(256),p.name_path + ' > ' + c.name)
from orgs as c
join allmembers as p on c.parentid = p.objectid
)
-- we select all the results
select *
from allmembers
returns:
+----------+----------+---------------+---------------+---------------+-----------+--------------------------------------+
| objectid | parentid | name | parentname | rootname | recursion | name_path |
+----------+----------+---------------+---------------+---------------+-----------+--------------------------------------+
| 26 | 25 | All Employees | | All Employees | 0 | All Employees |
| 27 | 26 | Sales | All Employees | All Employees | 1 | All Employees > Sales |
| 28 | 27 | Direct Sales | Sales | All Employees | 2 | All Employees > Sales > Direct Sales |
+----------+----------+---------------+---------------+---------------+-----------+--------------------------------------+
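If the flat ancestor/descendant pairs from the question are what is needed, rather than a path string, one option is to carry the ancestor through the recursion instead of the direct parent. The following is only a sketch under assumptions: it loads the same three sample rows into SQLite through Python's sqlite3 module so it can be run anywhere, and the CTE name closure and its column names are made up for the example. On SQL Server the same CTE shape should work without the recursive keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orgs (objectid int, name varchar(128), parentid int);
    insert into orgs values
        (26, 'All Employees', 25),
        (27, 'Sales', 26),
        (28, 'Direct Sales', 27);
""")

pairs = conn.execute("""
    with recursive closure (objectid, name, ancestorid, ancestorname) as (
        -- anchor: every direct child/parent pair
        select c.objectid, c.name, p.objectid, p.name
        from orgs as c
        join orgs as p on c.parentid = p.objectid
        union all
        -- recursion: step from the current ancestor up to its own parent
        select cl.objectid, cl.name, gp.objectid, gp.name
        from closure as cl
        join orgs as a on a.objectid = cl.ancestorid
        join orgs as gp on a.parentid = gp.objectid
    )
    select name, ancestorname
    from closure
    order by objectid, ancestorid desc
""").fetchall()

for name, ancestor in pairs:
    print(name, "|", ancestor)
```

With the sample rows this lists Sales under All Employees, and Direct Sales under both Sales and All Employees. The parent with id 25 is not a row in orgs, so it never shows up as an ancestor.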

Related

Concatenate Values from Different T-SQL Queries

I have a table in SQL Server with the following data:
+-----------------+-------------------+-------------------+--------+
|Product Family | Product Class | Product | Sales |
|Food | Vegetables | Cauliflower | 24 |
|Food | Prepared Meals | Steak & Patatoes | 54 |
|Food | Fruit | Apples | 76 |
|Food | Fruit | Oranges | 14 |
|Food | Fruit | Pears | 32 |
|Electronics | MP3 Players | Cool Player Z | 57 |
|Electronics | MP3 Players | iStuff 16GB | 45 |
|Electronics | TV's | HD | 96 |
|Electronics | TV's | Ultra HD | 76 |
+-----------------+-------------------+-------------------+--------+
There is a hierarchy in this data:
Product Family
Product Class
Product
I'd like to create a query that will return the sum for each hierarchy level. This union does that:
SELECT 1 as Level, [Product Family] as Item, SUM(SALES) as Sales
FROM [dbo].[HK_Termp_01] GROUP BY [Product Family]
UNION ALL
SELECT 2 as Level, [Product Class] as Item, SUM(SALES) as Sales
FROM [dbo].[HK_Termp_01] GROUP BY [Product Class]
UNION ALL
SELECT 3 as Level, Product as Item, SUM(SALES) as Sales
FROM [dbo].[HK_Termp_01] GROUP BY Product
However, I also require an additional column that will be a concatenation of the 3 string columns, in the order of the hierarchy. The desired output being:
+-------+------------------+-------------------------------------------+-------+
| Level | Item             | Hierarchy                                 | Sales |
| 1     | Electronics      | Electronics                               | 274   |
| 1     | Food             | Food                                      | 200   |
| 2     | Fruit            | Food > Fruit                              | 122   |
| 2     | MP3 Players      | Electronics > MP3 Players                 | 102   |
| 2     | Prepared Meals   | Food > Prepared Meals                     | 54    |
| 2     | TV's             | Electronics > TV's                        | 172   |
| 2     | Vegetables       | Food > Vegetables                         | 24    |
| 3     | Apples           | Food > Fruit > Apples                     | 76    |
| 3     | Cauliflower      | Food > Vegetables > Cauliflower           | 24    |
| 3     | Cool Player Z    | Electronics > MP3 Players > Cool Player Z | 57    |
| 3     | HD               | Electronics > TV's > HD                   | 96    |
| 3     | iStuff 16GB      | Electronics > MP3 Players > iStuff 16GB   | 45    |
| 3     | Oranges          | Food > Fruit > Oranges                    | 14    |
| 3     | Pears            | Food > Fruit > Pears                      | 32    |
| 3     | Steak & Patatoes | Food > Prepared Meals > Steak & Patatoes  | 54    |
| 3     | Ultra HD         | Electronics > TV's > Ultra HD             | 76    |
+-------+------------------+-------------------------------------------+-------+
This is where I get stuck. I can't add all 3 fields to each query in the Union, because then I don't get the right totals by level. But I'm not sure what other avenue to try.
Thanks & Let me know what other info I can supply to clarify the case.
I think you just want a tweak on your query:
SELECT 1 as Level, [Product Family] as Item,
SUM(SALES) as Sales
FROM [dbo].[HK_Termp_01]
GROUP BY [Product Family]
UNION ALL
SELECT 2 as Level, [Product Family] + '>' + [Product Class] as Item,
SUM(SALES) as Sales
FROM [dbo].[HK_Termp_01]
GROUP BY [Product Family] + '>' + [Product Class]
UNION ALL
SELECT 3 as Level, [Product Family] + '>' + [Product Class] + '>' + Product as Item,
SUM(SALES) as Sales
FROM [dbo].[HK_Termp_01]
GROUP BY [Product Family] + '>' + [Product Class] + '>' + Product;
That said, you could do this using GROUPING SETS:
SELECT [Product Family], [Product Class], Product, SUM(SALES) as Sales
FROM [dbo].[HK_Termp_01]
GROUP BY GROUPING SETS ( ([Product Family], [Product Class], Product),
([Product Family], [Product Class]),
([Product Family])
);
You would then need to fiddle with the names to get the exact output you want.
Just for fun,
Declare @YourTable table ([Product Family] varchar(50),[Product Class] varchar(50),Product varchar(50),Sales int)
Insert Into @YourTable values
('Food','Vegetables','Cauliflower',24),
('Food','Prepared Meals','Steak & Patatoes',54),
('Food','Fruit','Apples',76),
('Food','Fruit','Oranges',14),
('Food','Fruit','Pears',32),
('Electronics','MP3 Players','Cool Player Z',57),
('Electronics','MP3 Players','iStuff 16GB',45),
('Electronics','TV''s','HD',96),
('Electronics','TV''s','Ultra HD',76)
Declare @Top varchar(25) = NULL --<< Sets top of hierarchy; try 'MP3 Players'
Declare @Nest varchar(25) = '|-----' --<< Optional: Added for readability
;with cte0 as (
Select Distinct ID=Product,Parent=[Product Class],Sales from @YourTable
Union All
Select Distinct ID=[Product Class],Parent=[Product Family],0 from @YourTable
Union All
Select Distinct ID=[Product Family],Parent='Total',0 from @YourTable
Union All
Select Distinct ID='Total',Parent=NULL,0 )
,cteP as (
Select Seq = cast(100000+Row_Number() over (Order by ID) as varchar(500))
,ID
,Parent
,Lvl=1
,Sales = Sales
From cte0
Where IsNull(@Top,'X') = case when @Top is null then isnull(Parent,'X') else ID end
Union All
Select Seq = cast(concat(p.Seq,'.',100000+Row_Number() over (Order by r.ID)) as varchar(500))
,r.ID
,r.Parent
,p.Lvl+1
,r.Sales
From cte0 r
Join cteP p on r.Parent = p.ID)
,cteR1 as (Select *,R1=Row_Number() over (Order By Seq) From cteP)
,cteR2 as (Select A.Seq,A.ID,R2=Max(B.R1) From cteR1 A Join cteR1 B on (B.Seq like A.Seq+'%') Group By A.Seq,A.ID )
Select A.R1
,B.R2
,A.ID
,A.Parent
,A.Lvl
,Title = Replicate(@Nest,A.Lvl-1) + A.ID
,Sales = (Select sum(Sales) from cteR1 S where S.R1 between A.R1 and B.R2)
From cteR1 A
Join cteR2 B on A.ID=B.ID
Group By A.R1,B.R2,A.ID,A.Parent,A.Lvl
Order By A.R1
Returns
Now, if you set @Top = 'MP3 Players' rather than NULL, you'll get:
Just a little narrative:
cte0: we normalize your hierarchy into a Parent/Child relationship
cteP: we build your hierarchy via a recursive cte
cteR1: we generate the sequence/R1 keys
cteR2: we generate the R2 keys
Now, if you have slow-moving hierarchies, I tend to store them with the range keys to facilitate navigation and aggregation.
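To make the range-key idea concrete, here is a small sketch outside the database, written in Python purely for illustration. It assumes every name is unique across the three levels (as in the sample data) and uses made-up helper names assign_ranges and subtree_sales. Each node gets R1 in depth-first order, R2 is the largest R1 inside its subtree, and a subtree total is then just a range filter on R1.

```python
from collections import defaultdict

rows = [  # (family, class, product, sales) from the question
    ("Food", "Vegetables", "Cauliflower", 24),
    ("Food", "Prepared Meals", "Steak & Patatoes", 54),
    ("Food", "Fruit", "Apples", 76),
    ("Food", "Fruit", "Oranges", 14),
    ("Food", "Fruit", "Pears", 32),
    ("Electronics", "MP3 Players", "Cool Player Z", 57),
    ("Electronics", "MP3 Players", "iStuff 16GB", 45),
    ("Electronics", "TV's", "HD", 96),
    ("Electronics", "TV's", "Ultra HD", 76),
]

# Normalize the three columns into parent -> children edges under a 'Total' root.
children = defaultdict(set)
sales = defaultdict(int)
for family, cls, product, amount in rows:
    children["Total"].add(family)
    children[family].add(cls)
    children[cls].add(product)
    sales[product] += amount


def assign_ranges(root):
    """Return {node: (R1, R2)} using a depth-first numbering."""
    keys = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        r1 = counter
        for child in sorted(children.get(node, ())):
            visit(child)
        keys[node] = (r1, counter)  # counter is now the last R1 in the subtree

    visit(root)
    return keys


keys = assign_ranges("Total")


def subtree_sales(node):
    """Sum sales of every node whose R1 falls inside node's (R1, R2) range."""
    r1, r2 = keys[node]
    return sum(sales.get(n, 0) for n, (k1, _) in keys.items() if r1 <= k1 <= r2)


print(subtree_sales("Food"), subtree_sales("Electronics"))
```

Once R1 and R2 are stored with each row, the same aggregation in SQL is a single BETWEEN on R1, with no recursion at query time.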

Where clause if there are multiple of the same ID

I have following table:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SAP | Mike | 21 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I want to have one record for each ID. The idea behind this is that if the ID comes only once (no matter the Source), that record will be taken. But, If there are 2 records for one ID, the one containing SQL as source will be the used record here.
So, In this case, the result will be:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I did this with a partition ordered by Source descending, but that would stop working if a third source is added one day.
Any other options/ideas?
The easiest approach (in my opinion) is to use a CTE with a ranking function:
with cte as
(
select ID, source, Name, Age, ... ,
rn = row_number() over (partition by ID order by case when source = 'sql'
then 0 else 1 end asc)
from dbo.tablename
)
select ID, source, Name, Age, ...
from cte
where rn = 1
You can use ROW_NUMBER:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER( PARTITION BY ID
ORDER BY CASE WHEN [Source] = 'SQL' THEN 1 ELSE 2 END)
FROM dbo.YourTable
)
SELECT *
FROM CTE
WHERE RN = 1;
You can use the WITH TIES clause and the window function Row_Number()
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By ID Order By Case When Source = 'SQL' Then 0 Else 1 End)
How about
SELECT *
FROM YourTable
WHERE ID IN (
SELECT ID FROM YourTable
GROUP BY ID
HAVING COUNT(ID) = 1)
OR source = 'SQL'
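On the worry about a third source appearing later: one way to keep the ranking maintainable is to move the preference order out of the CASE expression and into a small lookup table. The sketch below is an assumption, not something from the answers above. The tables people and source_priority are hypothetical names, and it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 so it can be tried without a server. The ROW_NUMBER pattern is the same one the answers use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table people (ID int, source text, Name text, Age int);
    insert into people values
        (1, 'SQL', 'John', 18),
        (2, 'SAP', 'Mike', 21),
        (2, 'SQL', 'Mike', 20),
        (3, 'SAP', 'Jill', 25);

    -- hypothetical lookup table: lower prio wins; add one row per new source
    create table source_priority (source text primary key, prio int);
    insert into source_priority values ('SQL', 1), ('SAP', 2);
""")

rows = conn.execute("""
    with ranked as (
        select p.ID, p.source, p.Name, p.Age,
               row_number() over (
                   partition by p.ID
                   -- sources missing from the lookup table sort last
                   order by coalesce(sp.prio, 999), p.source
               ) as rn
        from people as p
        left join source_priority as sp on sp.source = p.source
    )
    select ID, source, Name, Age
    from ranked
    where rn = 1
    order by ID
""").fetchall()

for row in rows:
    print(row)
```

Adding a third source then means inserting one row into source_priority rather than editing every query that ranks the records.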

SQL Server - How to check if two items have the same relations to another set of items?

I have table with Employees (tblEmployee):
| ID | Name |
| 1 | Smith |
| 2 | Black |
| 3 | Thompson |
And a table with Roles (tblRoles):
| ID | Name |
| 1 | Submitter |
| 2 | Receiver |
| 3 | Analyzer |
I also have a table relating Employees to their Roles in a many-to-many relationship (tblEmployeeRoleRel):
| EmployeeID | RoleID |
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 3 |
I need to select ID, Name from tblEmployee for the employees that have exactly the same set of roles in tblEmployeeRoleRel as the Employee with ID = 1. How can I do it?
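Join each employee's roles to the roles of employeeID 1, then use a having clause to make sure that the employee's role count matches that of employee 1 and that every one of the employee's roles is also held by employee 1. Filtering to employee 1's roles first and then comparing counts is not enough, because an employee with extra roles (such as employee 2 in the sample data) would still match.
SELECT A.EmployeeID
FROM tblEmployeeRoleRel A
LEFT JOIN tblEmployeeRoleRel B
ON B.EmployeeID = 1
AND B.RoleID = A.RoleID
GROUP BY A.EmployeeID
-- same number of roles as employee 1 ...
HAVING count(A.RoleID) = (SELECT count(C.RoleID)
FROM tblEmployeeRoleRel C
WHERE C.EmployeeID = 1)
-- ... and none of them falls outside employee 1's roles
AND count(B.RoleID) = count(A.RoleID)
This assumes that each (EmployeeID, RoleID) pair is unique in tblEmployeeRoleRel; otherwise the counts above would need to be distinct counts. Note that employee 1 itself is included in the result.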
Declare @EmployeeID int = 1 -- change this to whatever employee ID you like, or perhaps you'd pass an Employee ID to it in a stored procedure.
Select Distinct e.EmployeeID -- normally distinct would incur extra overhead, but in this case you only want the employee IDs. not using Distinct when an employee has multiple roles will give you multiple employee IDs.
from tblEmployeeRoleRel as E
where E.EmployeeID not in
(Select EmployeeID from tblEmployeeRoleRel where RoleID not in (Select RoleID from tblEmployeeRoleRel where EmployeeID = @EmployeeID))
and exists (Select EmployeeID from tblEmployeeRoleRel where EmployeeID = e.EmployeeID) -- removes any "null" matches.
and E.EmployeeID <> @EmployeeID -- this keeps the employee ID itself from matching.
-- note: as written this also returns employees whose roles are only a subset of the reference employee's roles.
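Exact set equality needs two checks: the candidate has the same number of roles as the reference employee, and none of the candidate's roles falls outside the reference set. The sketch below tries that against the sample rows, run on SQLite through Python's sqlite3 for convenience. The helper name same_roles_as and employee 4 are hypothetical additions for the example, and (EmployeeID, RoleID) pairs are assumed to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tblEmployeeRoleRel (EmployeeID int, RoleID int);
    insert into tblEmployeeRoleRel values
        (1, 1), (1, 2),
        (2, 1), (2, 2), (2, 3),
        (3, 3),
        (4, 2), (4, 1);  -- hypothetical extra employee with the same roles as 1
""")


def same_roles_as(conn, employee_id):
    """Return the other employees whose role set equals employee_id's."""
    sql = """
        select a.EmployeeID
        from tblEmployeeRoleRel as a
        left join tblEmployeeRoleRel as b
               on b.EmployeeID = :emp and b.RoleID = a.RoleID
        where a.EmployeeID <> :emp
        group by a.EmployeeID
        -- same size as the reference set ...
        having count(a.RoleID) = (select count(*) from tblEmployeeRoleRel
                                  where EmployeeID = :emp)
        -- ... and every role of the candidate is in the reference set
           and count(b.RoleID) = count(a.RoleID)
        order by a.EmployeeID
    """
    return [row[0] for row in conn.execute(sql, {"emp": employee_id})]


print(same_roles_as(conn, 1))
```

Employee 2 holds both of employee 1's roles plus a third, so the size check rejects it. Employee 3 fails both checks. Only the added employee 4 comes back.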

Removing lines from a query

I have the following table:
OrderID | OldOrderID | Action | EntryDate | Source
1 | NULL | Insert | 2016-01-12| A
1 | NULL | Remove | 2016-01-13| A
2 | NULL | Insert | 2016-01-12| B
3 | NULL | Insert | 2016-01-12| C
4 | 3 | Insert | 2016-01-13| C
4 | NULL | Remove | 2016-01-14| C
I want to query all orders that are currently active, meaning they don't have the action Remove. Currently I do it with this query:
WITH Active AS
(
SELECT *, rn = ROW_NUMBER()
OVER (PARTITION BY OrderID,Source ORDER BY EntryDate DESC)
FROM Orders
)
SELECT *
FROM Active WHERE [Action] <> 'Remove' AND rn = 1;
The problem is that some orders get child orders (OrderID 3 gets a child, OrderID 4), and if a child ever gets the Action Remove the query should also ignore the parent, but the current query doesn't.
In short the current query gets me this result:
OrderID | OldOrderID | Action | EntryDate | Source
2 | NULL | Insert | 2016-01-12| B
3 | NULL | Insert | 2016-01-12| C
But I need this result:
OrderID | OldOrderID | Action | EntryDate | Source
2 | NULL | Insert | 2016-01-12| B
Is it possible to fix the query to get a result like this?
Try this:
;WITH CTE AS (
SELECT OrderID, OldOrderID, Action, EntryDate, Source,
COUNT(CASE WHEN Action = 'Remove' THEN 1 END)
OVER (PARTITION BY OrderID) AS IsRemoved,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY EntryDate) AS rn
FROM Orders
)
SELECT c1.*
FROM CTE AS c1
LEFT JOIN CTE AS c2 ON c1.OrderID = c2.OldOrderID AND c2.IsRemoved >= 1
WHERE c1.rn = 1 AND c1.IsRemoved = 0 AND c2.IsRemoved IS NULL
The above query uses COUNT() OVER() in order to count the number of occurrences of Action = 'Remove' within each OrderID partition. Hence, a value of IsRemoved that is equal to or greater than 1 identifies a 'removed' order.
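As a quick check of that logic, the query can be run against the question's six rows. The sketch below does so on SQLite (3.25 or later, for window functions) through Python's sqlite3, which is an assumption made only so the example is self-contained. The SQL itself is the query from above with the column list written out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Orders (OrderID int, OldOrderID int, Action text,
                         EntryDate text, Source text);
    insert into Orders values
        (1, NULL, 'Insert', '2016-01-12', 'A'),
        (1, NULL, 'Remove', '2016-01-13', 'A'),
        (2, NULL, 'Insert', '2016-01-12', 'B'),
        (3, NULL, 'Insert', '2016-01-12', 'C'),
        (4, 3,    'Insert', '2016-01-13', 'C'),
        (4, NULL, 'Remove', '2016-01-14', 'C');
""")

active = conn.execute("""
    with cte as (
        select OrderID, OldOrderID, Action, EntryDate, Source,
               -- number of 'Remove' rows within the same OrderID
               count(case when Action = 'Remove' then 1 end)
                   over (partition by OrderID) as IsRemoved,
               row_number() over (partition by OrderID
                                  order by EntryDate) as rn
        from Orders
    )
    select c1.OrderID, c1.OldOrderID, c1.Action, c1.EntryDate, c1.Source
    from cte as c1
    -- c2 finds removed orders that name c1 as their old (parent) order
    left join cte as c2
           on c1.OrderID = c2.OldOrderID and c2.IsRemoved >= 1
    where c1.rn = 1 and c1.IsRemoved = 0 and c2.IsRemoved is null
""").fetchall()

print(active)
```

Only order 2 survives: order 1 is removed directly, and order 3 is dropped because its child, order 4, was removed. The join only looks one level down, so a longer chain of child orders would need a recursive CTE instead.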
I also asked the question on dba stackexchange and got the following answer, which works well.

SQL Query combine multiple results or Join tables

I have the Following tables
Disposition Table
Dis_ID | OfferID | RequestID
------------------------------------
34564 | 123 | 9
77456 | 123 | 8
25252 | 124 | 7
46464 | 125 | 10
36464 | 125 | 6
35353 | 125 | 5
Request Table
RequestID | AccountNum |
---------------------------
5 | 548543 |
6 | 548543 |
7 | 684567 |
8 | 684567 |
9 | 684567 |
10 | 548543 |
11 | 684567 |
Rank Table
RankID | OfferId | RequestID | Score
-------------------------------------------
34564 | 123 | 11 | 1
77456 | 124 | 11 | 2
25252 | 125 | 11 | 3
Using the data above, I need a query that behaves as follows. Given a request number, look at every record in the Rank table for that request; in this example there are 3 (OfferIds 123, 124 and 125). Return the OfferId that appears the fewest times in the Disposition table for the account number joined to that request. In this example, OfferId 123 appears twice for this account number, OfferId 124 appears once, and OfferId 125 doesn't appear at all, so OfferId 125 should be returned. The OfferId from the Rank table with the fewest appearances in the Disposition table should always be returned, unless the counts are all the same, in which case the OfferId with the lowest value in the Score field is returned. For example, if none of the OfferIds appeared in the Disposition table, OfferId 123 would be returned, since its Score value is 1.
Resulting table would look something like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 0 <--Return this record
This is what I have so far.
SELECT oRank.OfferId, oRank.Rank_Number, count(oRank.OfferId) AS NumDispositions
From Rank oRank
join Request req
on oRank.RequestId = req.RequestId
join Disposition dis
on oRank.OfferId = dis.OfferId
where req.Customer_Account_Number = 684567 and req.RequestId = 11 and oRank.OfferId = dis.OfferId
group by oRank.Rank_Number, oRank.OfferId
order by NumDispositions, oRank.Rank_Number
My incorrect Resulting table looks like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 3
It is counting the total number of times the OfferId appears in the Disposition table, rather than only the appearances for this account number.
EDIT - based on author's comments, here's another version:
Example in SQLFiddle: http://sqlfiddle.com/#!6/d3f99/1/0
with RankReqMap as (
select rnk.OfferId, rnk.Score, reqAcct.AccountNum, reqReq.RequestID
from [Rank] rnk
left join Request reqAcct on reqAcct.RequestID = rnk.RequestID
left join Request reqReq on reqReq.AccountNum = reqAcct.AccountNum
where rnk.RequestID = 11 -- Put your RequestId filter here
)
select oRank.OfferId
,oRank.Score
,count(dis.RequestID) as NumDispositions
from RankReqMap oRank
left join Disposition dis on dis.OfferID = oRank.OfferId
and dis.RequestID = oRank.RequestID
group by oRank.OfferId , oRank.Score
order by NumDispositions, oRank.Score;
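For anyone who cannot reach the fiddle, here is a sketch that loads the question's sample rows and runs the same query shape. It uses SQLite through Python's sqlite3 as a stand-in for SQL Server, and the helper name offers_for_request is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Disposition (Dis_ID int, OfferID int, RequestID int);
    insert into Disposition values
        (34564, 123, 9), (77456, 123, 8), (25252, 124, 7),
        (46464, 125, 10), (36464, 125, 6), (35353, 125, 5);

    create table Request (RequestID int, AccountNum int);
    insert into Request values
        (5, 548543), (6, 548543), (7, 684567), (8, 684567),
        (9, 684567), (10, 548543), (11, 684567);

    create table [Rank] (RankID int, OfferId int, RequestID int, Score int);
    insert into [Rank] values
        (34564, 123, 11, 1), (77456, 124, 11, 2), (25252, 125, 11, 3);
""")


def offers_for_request(conn, request_id):
    """Offers ranked for request_id, fewest dispositions first, then Score."""
    sql = """
        with RankReqMap as (
            -- pair each ranked offer with every request of the same account
            select rnk.OfferId, rnk.Score, reqReq.RequestID
            from [Rank] as rnk
            left join Request as reqAcct on reqAcct.RequestID = rnk.RequestID
            left join Request as reqReq on reqReq.AccountNum = reqAcct.AccountNum
            where rnk.RequestID = ?
        )
        select m.OfferId, m.Score, count(dis.RequestID) as NumDispositions
        from RankReqMap as m
        left join Disposition as dis
               on dis.OfferID = m.OfferId and dis.RequestID = m.RequestID
        group by m.OfferId, m.Score
        order by NumDispositions, m.Score
    """
    return conn.execute(sql, (request_id,)).fetchall()


ranked = offers_for_request(conn, 11)
best = ranked[0]
print(ranked)
```

The first row is the offer to return. Here that is OfferId 125, whose three dispositions all belong to the other account and so count as zero for this one.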
ORIGINAL POST
Example in SQLFiddle: http://sqlfiddle.com/#!6/770a8/1/0
This query makes the assumption that you're joining Disposition to Rank based on OfferID, since the RequestIDs for those tables in your example data don't match up. You may have to tweak depending on your needs, but something like the query below should get you the record you're looking for:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Rank count of Dis_Occurrences, taking lowest score into account as a tie breaker
, DispRanking as (
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
)
-- Return only the value with the highest ranking
select * from DispRanking where Dis_Rank = 1
Note also that if you convert the second CTE into a naked SELECT and remove the SELECT statement at the end, you can see all of the records and how they get ranked by the row_number() function:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Output all values, with rankings
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
Good luck!
I think you can use window function for this:
;with disp as(select offerid, count(*) as ocount
from dispositions group by offerid),
rnk as(select r.offerid,
row_number() over(partition by r.requestid
order by isnull(d.ocount, 0), r.score) rn
from ranks r
left join disp d on r.offerid = d.offerid)
select * from rnk where rn = 1

Resources