Recursive Query using CTE in SQL Server 2005 - sql-server

OK, here's what I'm trying to do. I'm using a CTE query in MSSQL2005. The objective of the query is to recurse through the parent-child relationships of product categories and return the number of products under each category (including any products contained in child categories).
My current version only returns the product count for the category being displayed. It doesn't account for products that may be contained within any of its children.
The database dump to reproduce the problem, along with the query I used and an explanation, follows below:
CREATE TABLE [Categories] (
[CategoryID] INT,
[Name] NCHAR(150)
)
GO
/* Data for the `Categories` table (Records 1 - 5) */
INSERT INTO [Categories] ([CategoryID], [Name])
VALUES (942, N'Diagnostic Equipment')
GO
INSERT INTO [Categories] ([CategoryID], [Name])
VALUES (943, N'Cardiology')
GO
INSERT INTO [Categories] ([CategoryID], [Name])
VALUES (959, N'Electrodes')
GO
INSERT INTO [Categories] ([CategoryID], [Name])
VALUES (960, N'Stress Systems')
GO
INSERT INTO [Categories] ([CategoryID], [Name])
VALUES (961, N'EKG Machines')
GO
CREATE TABLE [Categories_XREF] (
[CatXRefID] INT,
[CategoryID] INT,
[ParentID] INT
)
GO
/* Data for the `Categories_XREF` table (Records 1 - 5) */
INSERT INTO [Categories_XREF] ([CatXRefID], [CategoryID], [ParentID])
VALUES (827, 942, 0)
GO
INSERT INTO [Categories_XREF] ([CatXRefID], [CategoryID], [ParentID])
VALUES (828, 943, 942)
GO
INSERT INTO [Categories_XREF] ([CatXRefID], [CategoryID], [ParentID])
VALUES (928, 959, 943)
GO
INSERT INTO [Categories_XREF] ([CatXRefID], [CategoryID], [ParentID])
VALUES (929, 960, 943)
GO
INSERT INTO [Categories_XREF] ([CatXRefID], [CategoryID], [ParentID])
VALUES (930, 961, 943)
GO
CREATE TABLE [Products_Categories_XREF] (
[ID] INT,
[ProductID] INT,
[CategoryID] INT
)
GO
/* Data for the `Products_Categories_XREF` table (Records 1 - 13) */
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252065, 12684, 961)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252066, 12685, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252067, 12686, 960)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252068, 12687, 961)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252128, 12738, 961)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252129, 12739, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252130, 12740, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252131, 12741, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252132, 12742, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252133, 12743, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252134, 12744, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252135, 12745, 959)
GO
INSERT INTO [Products_Categories_XREF] ([ID], [ProductID], [CategoryID])
VALUES (252136, 12746, 959)
GO
CREATE TABLE [Products] (
[ProductID] INT
)
GO
/* Data for the `Products` table (Records 1 - 13) */
INSERT INTO [Products] ([ProductID])
VALUES (12684)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12685)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12686)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12687)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12738)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12739)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12740)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12741)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12742)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12743)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12744)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12745)
GO
INSERT INTO [Products] ([ProductID])
VALUES (12746)
GO
Here's the CTE query I was using:
WITH ProductCategories (CategoryID, ParentID, [Name], Level)
AS
(
-- Anchor member definition
SELECT
C.CategoryID,
CXR.ParentID,
C.Name,
0 AS Level
FROM
Categories C,
Categories_XRef CXR
WHERE
C.CategoryID = CXR.CategoryID
AND CXR.ParentID = 0
UNION ALL
-- Recursive member definition
SELECT
C.CategoryID,
CXR.ParentID,
C.Name,
Level + 1
FROM
Categories C,
Categories_XRef CXR,
ProductCategories AS PC
WHERE
C.CategoryID = CXR.CategoryID
AND CXR.ParentID = PC.CategoryID
)
SELECT
PC.ParentID,
PC.CategoryID,
PC.Name,
PC.Level,
(SELECT
Count(P.ProductID)
FROM
Products P,
Products_Categories_XREF PCXR
WHERE
P.ProductID = PCXR.ProductID
AND PCXR.CategoryID = PC.CategoryID
) as ProductCount
FROM
Categories C,
ProductCategories PC
WHERE
PC.CategoryID = C.CategoryID
AND PC.ParentID = 943
ORDER BY
Level, PC.Name
First, run it with "PC.ParentID" set to 943. You'll see three records returned, each showing the product count for the category being displayed.
Now change the ParentID from 943 to 942 and re-run it. You'll see one result returned, "Cardiology", but it shows 0 products.
This category has children (the ones you saw previously) that contain products. My big question is: at this level (parent 942), how can I make it count the products contained in the children below, so that it shows 13 as the "ProductCount"? I suspect I need one more level of recursion. I tried that, but had no success.
I'm open to a stored procedure that would do what I'm looking for. I'm not set on one particular way. So any other suggestions would be appreciated.

Edit: OK, having actually read the requirements and thought about it a bit, this is actually quite easy (I think!).
The point is that we want two things: the category hierarchy, and a count of products. The hierarchy is done by a recursive CTE, and counting is done outside that:
-- The CTE returns the cat hierarchy:
-- one row for each ancestor-descendant relationship
-- (including the self-relationship for each category)
WITH CategoryHierarchy AS (
-- Anchor member: self relationship for each category
SELECT CategoryID AS Ancestor, CategoryID AS Descendant
FROM Categories
UNION ALL
-- Recursive member: for each row, select the children
SELECT ParentCategory.Ancestor, Children.CategoryID
FROM
CategoryHierarchy AS ParentCategory
INNER JOIN Categories_XREF AS Children
ON ParentCategory.Descendant = Children.ParentID
)
SELECT CH.Ancestor, COUNT(ProductID) AS ProductsInTree
-- outer join to product-categories to include
-- all categories, even those with no products directly associated
FROM CategoryHierarchy CH
LEFT JOIN Products_Categories_XREF PC
ON CH.Descendant = PC.CategoryID
GROUP BY CH.Ancestor
The results are:
Ancestor ProductsInTree
----------- --------------
942 13
943 13
959 9
960 1
961 3
I am indebted to this article by the inestimable Itzik Ben-Gan for getting my thinking kick-started. His book 'Inside MS SQL Server 2005: T-SQL Querying' is highly recommended.
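For anyone who wants to check the ancestor-descendant idea without a SQL Server instance, here is a minimal sketch that replays it in SQLite through Python's sqlite3 module. That is an assumption on my part, not the setup from the question: SQLite needs the RECURSIVE keyword and the CTE columns are named up front, but the shape of the query is the same. The data is the sample data from the question, with the Products table left out because the count only needs the cross-reference table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (CategoryID INTEGER, Name TEXT);
    CREATE TABLE Categories_XREF (CategoryID INTEGER, ParentID INTEGER);
    CREATE TABLE Products_Categories_XREF (ProductID INTEGER, CategoryID INTEGER);
""")
conn.executemany("INSERT INTO Categories VALUES (?, ?)", [
    (942, "Diagnostic Equipment"), (943, "Cardiology"),
    (959, "Electrodes"), (960, "Stress Systems"), (961, "EKG Machines"),
])
conn.executemany("INSERT INTO Categories_XREF VALUES (?, ?)", [
    (942, 0), (943, 942), (959, 943), (960, 943), (961, 943),
])
# Product IDs per category, copied from the question's sample data.
products_by_category = {
    959: [12685, 12739, 12740, 12741, 12742, 12743, 12744, 12745, 12746],
    960: [12686],
    961: [12684, 12687, 12738],
}
conn.executemany(
    "INSERT INTO Products_Categories_XREF VALUES (?, ?)",
    [(p, c) for c, ps in products_by_category.items() for p in ps],
)

rows = conn.execute("""
    WITH RECURSIVE CategoryHierarchy(Ancestor, Descendant) AS (
        -- Anchor: every category is its own descendant
        SELECT CategoryID, CategoryID FROM Categories
        UNION ALL
        -- Recursive step: extend each pair by the descendant's children
        SELECT CH.Ancestor, X.CategoryID
        FROM CategoryHierarchy AS CH
        JOIN Categories_XREF AS X ON CH.Descendant = X.ParentID
    )
    SELECT CH.Ancestor, COUNT(PC.ProductID) AS ProductsInTree
    FROM CategoryHierarchy AS CH
    LEFT JOIN Products_Categories_XREF AS PC ON CH.Descendant = PC.CategoryID
    GROUP BY CH.Ancestor
    ORDER BY CH.Ancestor
""").fetchall()

for ancestor, count in rows:
    print(ancestor, count)
# 942 13
# 943 13
# 959 9
# 960 1
# 961 3
```

The self-relationship in the anchor is what lets a category's own products be counted together with those of its descendants.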

Your WHERE statement limits the result to one parent. If you'd like to see all children below 942, specify 942 as the root in the CTE. For example:
WITH CTE (CategoryID, ParentID, [Name], [Level])
AS
(
SELECT C.CategoryID, CXR.ParentID, C.Name, 0 AS Level
FROM Categories C
INNER JOIN Categories_XRef CXR ON C.CategoryID = CXR.CategoryID
WHERE CXR.CategoryID = 942
UNION ALL
SELECT C.CategoryID, CXR.ParentID, C.Name, Level + 1
FROM Categories C
INNER JOIN Categories_XRef CXR ON C.CategoryID = CXR.CategoryID
INNER JOIN CTE PC ON PC.CategoryID = CXR.ParentID
)
SELECT * FROM CTE
By the way, can categories have multiple parents? If not, consider eliminating the Categories_XREF table and storing ParentID in the Categories table.

Related

Query for relationships between rows

I need to find relationships between multiple people in a single table. For example, I have the table below:
Guests Table
I need a SQL script that tells me, for example, that guests 123 and 456 checked in together at the same hotel at the same time 80% of the time, and so on.
Kindly support.
It's a little complicated so I've broken it down into multiple subqueries for you using a CTE with a matched key.
This will produce a series of matched pairs - for the primary guest and secondary guest with ratios of how often they stay together rather than just check in.
Setup:
create table temp(
hotelID integer,
checkInDate date,
guestID integer
)
insert into temp values (101, '2020/06/01', 123)
insert into temp values (101, '2020/06/01', 456)
insert into temp values (102, '2020/06/15', 123)
insert into temp values (102, '2020/06/15', 456)
insert into temp values (103, '2020/06/30', 123)
insert into temp values (103, '2020/06/30', 456)
insert into temp values (104, '2020/07/15', 123)
insert into temp values (104, '2020/07/15', 789)
insert into temp values (105, '2020/07/01', 456)
insert into temp values (105, '2020/07/01', 789)
Query:
with keyCte as (
select
distinct cast(hotelID as varchar(3)) + cast(checkInDate as varchar(10)) as myKey,
guestID
from temp
)
select
guestPrime
, guestTwo
, instances as guestPrimeStays
, matches as guestTwoMatches
, cast(matches as float) / cast(instances as float) as hitRate
from (
select
guestID
, count(*) as instances
from keyCte
group by guestID
) sq3
join (
select
guestPrime
, guestTwo
, count(*) as matches
from (
select
keyCte.guestID as guestPrime
, kcte.guestID as guestTwo
from keyCte
join keyCte kcte on kcte.myKey = keyCte.myKey and kcte.guestID != keyCte.guestID
) sq
group by guestPrime, guestTwo
) sq2 on sq2.guestPrime = guestID
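As a cross-check of the numbers, here is a rough sketch of the same pairing logic run in SQLite via Python's sqlite3 module (my choice for a self-contained test, not part of the original answer). It differs in two ways that are worth stating: the table is called stays instead of temp, and the self-join matches on hotelID and checkInDate directly instead of building a concatenated string key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stays (hotelID INTEGER, checkInDate TEXT, guestID INTEGER)")
conn.executemany("INSERT INTO stays VALUES (?, ?, ?)", [
    (101, "2020/06/01", 123), (101, "2020/06/01", 456),
    (102, "2020/06/15", 123), (102, "2020/06/15", 456),
    (103, "2020/06/30", 123), (103, "2020/06/30", 456),
    (104, "2020/07/15", 123), (104, "2020/07/15", 789),
    (105, "2020/07/01", 456), (105, "2020/07/01", 789),
])

rows = conn.execute("""
    WITH visits AS (
        -- one row per guest per stay
        SELECT DISTINCT hotelID, checkInDate, guestID FROM stays
    ),
    totals AS (
        -- how many stays each guest has in total
        SELECT guestID, COUNT(*) AS instances FROM visits GROUP BY guestID
    ),
    pairs AS (
        -- how many stays each ordered pair of guests shares
        SELECT a.guestID AS guestPrime, b.guestID AS guestTwo, COUNT(*) AS matches
        FROM visits AS a
        JOIN visits AS b
          ON b.hotelID = a.hotelID
         AND b.checkInDate = a.checkInDate
         AND b.guestID <> a.guestID
        GROUP BY a.guestID, b.guestID
    )
    SELECT p.guestPrime, p.guestTwo, t.instances, p.matches,
           CAST(p.matches AS REAL) / t.instances AS hitRate
    FROM pairs AS p
    JOIN totals AS t ON t.guestID = p.guestPrime
    ORDER BY p.guestPrime, p.guestTwo
""").fetchall()

for row in rows:
    print(row)
# (123, 456, 4, 3, 0.75)
# (123, 789, 4, 1, 0.25)
# (456, 123, 4, 3, 0.75)
# (456, 789, 4, 1, 0.25)
# (789, 123, 2, 1, 0.5)
# (789, 456, 2, 1, 0.5)
```

Note that the ratio is relative to the first guest of each pair, so (123, 789) and (789, 123) give different hit rates.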

How to fix this error: 'The column 'DistrictID' was specified multiple times for 'piv''

I want to show the inspector name, inspector post, district and project, with the inspections done by each inspector broken down by month. I'm using PIVOT for this, but I'm getting this error:
"The column 'DistrictID' was specified multiple times for 'piv'."
Please help me get past this error.
Declare @SQLQuery nvarchar(MAX)
If(OBJECT_ID('tempdb..#TBL1') Is Not Null)
Begin
Drop Table #TBL1
End
CREATE TABLE #TBL1
(
ID int,
InspPost nvarchar (MAX),
InspPostHin nvarchar(MAX)
)
SET @SQLQuery ='INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (1, N''Child Development Project Officer'', N''??? ????? ???????? ?????????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (2, N''Lady Superviser'', N''????? ????????????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (3, N''Other'', N''???? ?????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (4, N''District Program Officer'', N''???? ????????? ?????????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (5, N''J.P.C/State Level Officer'',N''??.??.??../???? ???????? ????? ?????? ???????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (6, N''S.P.M.U/Technical Consultant'', N''??.??..??.??. - ?????? ???????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (7, N''District Coordinator'', N''???? ???????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (8, N''Project Coordinator'', N''?????? ???????'')
INSERT into #TBL1 ([ID], [InspPost], [InspPostHin]) VALUES (9, N''Swasth Bharat Prerak'', N''?????? ???? ??????'')'
exec (@SQLQuery)
select * from
(
select Districtmaster.DistrictID,ProjectMaster.ProjectID,Districtmaster.DistrictNameHn,ProjectMaster.ProjectNameHn from Districtmaster Districtmaster
inner join ProjectMaster ProjectMaster on Districtmaster.DistrictID=ProjectMaster.DistID
) a1
inner join
(
select Supervision_Checklist.ID,Supervision_Checklist.Inspector_Name,
Supervision_Checklist.DistrictID,Supervision_Checklist.ProjectID,
Supervision_Checklist.Inspector_Type,(#TBL1.InspPost) as inptype ,Supervision_Checklist.Month
from Supervision_Checklist Supervision_Checklist
inner join #TBL1 #TBL1
on Supervision_Checklist.Inspector_Type=#TBL1.ID
) src on a1.DistrictID=src.DistrictID and a1.ProjectID=src.ProjectID
pivot (count(id) for Month in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])) piv
I want the result in the following form:
[image: expected result]
A PIVOT effectively performs a GROUP BY on all columns which are currently in the result set and which aren't mentioned in the pivot.
If you've done a bit of joining, it's highly likely (as here) that you'll end up with multiple columns of the same name (DistrictID). Even though we know that both DistrictID values are equal on every row, the optimizer doesn't, and generates this error.
What we need to do is project away the columns that we don't want included in this grouping operation. But there's no convenient way to express this "inline". We need to have a SELECT clause and we need the PIVOT to be on the "other side" of that SELECT clause.
Typically, we'd do this using a subquery or CTE:
;With AllResults as (
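select
src.ID,
src.Month,
a1.DistrictNameHn,
a1.ProjectNameHn,
src.Inspector_Name,
src.inptype
/* We cannot use * here. List only those columns needed by the pivot
or which should appear in the final result (this list is a guess based on the question) */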
from
(
select Districtmaster.DistrictID,ProjectMaster.ProjectID,Districtmaster.DistrictNameHn,ProjectMaster.ProjectNameHn from Districtmaster Districtmaster
inner join ProjectMaster ProjectMaster on Districtmaster.DistrictID=ProjectMaster.DistID
) a1
inner join
(
select Supervision_Checklist.ID,Supervision_Checklist.Inspector_Name,
Supervision_Checklist.DistrictID,Supervision_Checklist.ProjectID,
Supervision_Checklist.Inspector_Type,(#TBL1.InspPost) as inptype ,Supervision_Checklist.Month
from Supervision_Checklist Supervision_Checklist
inner join #TBL1 #TBL1
on Supervision_Checklist.Inspector_Type=#TBL1.ID
) src on a1.DistrictID=src.DistrictID and a1.ProjectID=src.ProjectID
)
select *
from AllResults
pivot (count(id) for Month in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])) piv
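To make the "implicit GROUP BY" point concrete, here is a small sketch of what a PIVOT amounts to, written as conditional aggregation. It runs in SQLite through Python's sqlite3 module, which has no PIVOT operator, so this is an equivalent formulation and not the T-SQL syntax itself. The inspections table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the joined result in the question.
conn.execute("CREATE TABLE inspections (id INTEGER, inspector TEXT, month INTEGER)")
conn.executemany("INSERT INTO inspections VALUES (?, ?, ?)", [
    (1, "A", 1), (2, "A", 1), (3, "A", 2), (4, "B", 2),
])

# Every column that is neither aggregated (id) nor spread into columns (month)
# becomes a grouping column. Here that is only "inspector". Any extra column
# left in the input would silently become part of the grouping as well.
rows = conn.execute("""
    SELECT inspector,
           COUNT(CASE WHEN month = 1 THEN id END) AS m1,
           COUNT(CASE WHEN month = 2 THEN id END) AS m2
    FROM inspections
    GROUP BY inspector
    ORDER BY inspector
""").fetchall()

print(rows)  # [('A', 2, 1), ('B', 0, 1)]
```

This is why the input to the pivot has to be trimmed first: two columns with the same name cannot both act as grouping columns.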

How to update table when rows of one table are different from the other

I have a table, table1, and a temporary table, temp2. temp2 contains updated values that I want to apply to table1: for any rows that differ, I want to copy the values from temp2 into table1. I tried something like this, but it's not working.
update Role_Master set Role_Desc=Role_Descc , Role_Version_Number =Role_Version_Number+1,Role_Dept=Role_Deptt,Role_All_Clients=Role_All_Clientss,
Role_Admin=Role_Adminn,Role_Super_Admin=Role_Super_Adminn,Role_Modified_Date = GETDATE(),Role_Modified_By = 'T6086' FROM #TEMP1 where Role_ID in
(SELECT #TEMP1.Role_IDD FROM #TEMP1 LEFT JOIN Role_Master ON (#TEMP1.Role_Descc = Role_Master.Role_Desc and #Temp1.Role_Deptt=Role_Master.Role_Dept)
WHERE Role_Master.Role_Desc is null and Role_Master.Role_Dept IS NULL)
It's hard to help without knowing the schema of the two tables, but it should be possible to join the two tables and use a WHERE condition to decide which rows to update. Check out this simple example; maybe it helps:
create table #temp1 (id int, val nvarchar(100))
create table #temp2 (id int, val nvarchar(100))
insert into #temp1 (id, val) values (1, 'eins')
insert into #temp1 (id, val) values (2, 'eins')
insert into #temp1 (id, val) values (3, 'eins')
insert into #temp2 (id, val) values (1, 'zwei')
insert into #temp2 (id, val) values (2, 'eins')
insert into #temp2 (id, val) values (3, 'eins')
update a set a.val = b.val
from #temp1 a join #temp2 b on a.id = b.id
where a.val <> b.val
select @@rowcount -- returns 1 because 1 row was updated
select * from #temp1
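The same "update only the rows that differ" idea can be tried outside SQL Server. Below is a minimal sketch using SQLite through Python's sqlite3 module (an assumption for the sake of a runnable test). It swaps the UPDATE ... FROM join for correlated subqueries, because older SQLite versions do not support UPDATE ... FROM. The table names drop the # prefix since SQLite has no such temp-table naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id INTEGER, val TEXT);
    CREATE TABLE temp2 (id INTEGER, val TEXT);
    INSERT INTO temp1 VALUES (1, 'eins'), (2, 'eins'), (3, 'eins');
    INSERT INTO temp2 VALUES (1, 'zwei'), (2, 'eins'), (3, 'eins');
""")

cur = conn.execute("""
    UPDATE temp1
    SET val = (SELECT b.val FROM temp2 AS b WHERE b.id = temp1.id)
    -- only touch rows that have a counterpart with a different value
    WHERE EXISTS (
        SELECT 1 FROM temp2 AS b
        WHERE b.id = temp1.id AND b.val <> temp1.val
    )
""")
updated = cur.rowcount  # number of rows changed by the UPDATE
rows = conn.execute("SELECT id, val FROM temp1 ORDER BY id").fetchall()

print(updated)  # 1
print(rows)     # [(1, 'zwei'), (2, 'eins'), (3, 'eins')]
```

One caveat that applies to the T-SQL version too: a <> comparison is never true when either side is NULL, so rows that differ only by a NULL are not picked up unless you add an explicit IS NULL check.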

Left join results in extra records

This is a basic left join problem and I have read many articles explaining what is going on but somehow the resolution is not clicking in my head. My left table has unique records. My right table has several records for each record in the left.
In the articles I have been reading this is often explained as left table has customers and right table has orders. That is very similar but not exactly what I am facing.
In my situation the left table has unique records and the right has repetitive data to be migrated into the database the left table is in. So I am trying to write a query that joins on the key shared by both, but I only need one record from the right. The results I am getting of course have multiple records, since each single left record matches multiple times on the right.
I am thinking I need to add some sort of filtering such as Top(1) but still reading / learning and wanted to get feedback / direction from the brainiacs on this list.
Here is a simple schema of what I am working with:
DECLARE @Customer TABLE
(
Id int,
Name varchar(50),
email varchar(50)
)
INSERT @Customer VALUES(1, 'Frodo', 'frodo@middleearth.org')
INSERT @Customer VALUES(2, 'Bilbo', 'Bilbo@middleearth.org')
INSERT @Customer VALUES(3, 'Galadriel', 'Galadriel@middleearth.org')
INSERT @Customer VALUES(4, 'Arwen', 'Arwen@middleearth.org')
INSERT @Customer VALUES(5, 'Gandalf', 'Gandalf@middleearth.org')
DECLARE @CustomerJobs TABLE
(
Id int,
email varchar(50),
jobname varchar(50)
)
INSERT @CustomerJobs VALUES(1, 'frodo@middleearth.org', 'RingBearer')
INSERT @CustomerJobs VALUES(2, 'frodo@middleearth.org', 'RingBearer')
INSERT @CustomerJobs VALUES(3, 'frodo@middleearth.org', 'RingBearer')
INSERT @CustomerJobs VALUES(4, 'frodo@middleearth.org', 'RingBearer')
INSERT @CustomerJobs VALUES(5, 'frodo@middleearth.org', 'RingBearer')
INSERT @CustomerJobs VALUES(6, 'Bilbo@middleearth.org', 'Burglar')
INSERT @CustomerJobs VALUES(7, 'Bilbo@middleearth.org', 'Burglar')
INSERT @CustomerJobs VALUES(8, 'Bilbo@middleearth.org', 'Burglar')
INSERT @CustomerJobs VALUES(9, 'Galadriel@middleearth.org', 'MindReader')
INSERT @CustomerJobs VALUES(10, 'Arwen@middleearth.org', 'Evenstar')
INSERT @CustomerJobs VALUES(10, 'Arwen@middleearth.org', 'Evenstar')
INSERT @CustomerJobs VALUES(11, 'Gandalf@middleearth.org', 'WhiteWizard')
INSERT @CustomerJobs VALUES(12, 'Gandalf@middleearth.org', 'WhiteWizard')
SELECT
Cust.Name,
Cust.email,
CJobs.jobname
FROM
@Customer Cust
LEFT JOIN @CustomerJobs CJobs ON
Cjobs.email = Cust.email
I'm toying with ROW_NUMBER() OVER (PARTITION BY ...); maybe I should be joining to a CTE that uses it instead of to the table itself?
One other constraint I have is I can't delete the duplicates from the right table.
So again my apologies for the simplistic question and thank you for the help.
Instead of using a left join, use an outer apply... you can then use the top clause to limit the rows returned...
select
Cust.Name
, Cust.email
, CJobs.jobname
from @Customer Cust
outer apply (
select top 1 *
from @CustomerJobs CJobs
where Cjobs.email = Cust.email
) cjobs;
You have to come up with some artificial method of reducing the second table to one row per email. For example:
SELECT
Cust.Name,
Cust.ID,
Cust.email,
CJobs.jobname
FROM
@Customer Cust
LEFT JOIN
(select min(id) as id,email, jobname
from
@CustomerJobs
group by email, jobname) as CJobs ON
Cjobs.email = Cust.email
But that's pretty much random. Is there a way to determine which row from your CustomerJobs table is the "right" one?
SELECT DISTINCT
Cust.Name,
Cust.email,
CJobs.jobname
FROM
@Customer Cust
LEFT JOIN @CustomerJobs CJobs ON
Cjobs.email = Cust.email
The addition of the DISTINCT keyword should get you what you want.
This will work:
SELECT
Cust.Name,
Cust.ID,
Cust.email,
CJobs.jobname
FROM @Customer Cust
LEFT JOIN
(SELECT DISTINCT email, jobname
FROM @CustomerJobs) CJobs ON CJobs.email = Cust.email
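If the job rows for one email could differ, DISTINCT or a GROUP BY on email and jobname would still return several rows per customer. One way around that, sketched here under assumptions, is to pick a single "representative" row per email (the one with the lowest Id) and join to that. The sketch runs in SQLite through Python's sqlite3 module and uses a trimmed-down copy of the sample data, with Gandalf's job rows deliberately left out to show the LEFT JOIN keeping an unmatched customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Id INTEGER, Name TEXT, email TEXT);
    CREATE TABLE CustomerJobs (Id INTEGER, email TEXT, jobname TEXT);
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [
    (1, "Frodo", "frodo@middleearth.org"),
    (2, "Bilbo", "Bilbo@middleearth.org"),
    (5, "Gandalf", "Gandalf@middleearth.org"),
])
conn.executemany("INSERT INTO CustomerJobs VALUES (?, ?, ?)", [
    (1, "frodo@middleearth.org", "RingBearer"),
    (2, "frodo@middleearth.org", "RingBearer"),
    (3, "frodo@middleearth.org", "RingBearer"),
    (6, "Bilbo@middleearth.org", "Burglar"),
    (7, "Bilbo@middleearth.org", "Burglar"),
])

rows = conn.execute("""
    SELECT Cust.Name, Cust.email, CJobs.jobname
    FROM Customer AS Cust
    LEFT JOIN (
        -- keep only the job row with the lowest Id for each email
        SELECT j.email, j.jobname
        FROM CustomerJobs AS j
        JOIN (SELECT email, MIN(Id) AS Id
              FROM CustomerJobs
              GROUP BY email) AS firstJob
          ON firstJob.email = j.email AND firstJob.Id = j.Id
    ) AS CJobs ON CJobs.email = Cust.email
    ORDER BY Cust.Id
""").fetchall()

for row in rows:
    print(row)
# ('Frodo', 'frodo@middleearth.org', 'RingBearer')
# ('Bilbo', 'Bilbo@middleearth.org', 'Burglar')
# ('Gandalf', 'Gandalf@middleearth.org', None)
```

This assumes Id is unique within CustomerJobs. The full sample data repeats Id 10 for Arwen, so there you would still need DISTINCT on top, or a ROW_NUMBER() based pick as the question suggests.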

Smart Many to Many Query

I have a list of item descriptions in a C# application. When I select one, two, or more item descriptions from that list (a checkbox list), I want an SQL query against a many-to-many table to predict what my item is, narrowing the possible predictions with each selection.
For example:
item 1: white,green,blue
item 2: white,red,cyan
item 3: red,blue,purple
The user selects from a checklist:
white -> query returns items 1 and 2
white & green -> query returns only item 1
From your humble description of the problem, I suppose you want something like this:
CREATE TABLE items (
item_id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
name VARCHAR(100) NOT NULL
)
CREATE TABLE colors (
color_id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
name VARCHAR(100) NOT NULL
)
CREATE TABLE items_colors (
item_id INT NOT NULL FOREIGN KEY REFERENCES items(item_id),
color_id INT NOT NULL FOREIGN KEY REFERENCES colors(color_id),
PRIMARY KEY(item_id, color_id)
)
INSERT INTO items(name) VALUES ('item 1')
INSERT INTO items(name) VALUES ('item 2')
INSERT INTO items(name) VALUES ('item 3')
INSERT INTO colors(name) VALUES ('white')
INSERT INTO colors(name) VALUES ('green')
INSERT INTO colors(name) VALUES ('blue')
INSERT INTO colors(name) VALUES ('red')
INSERT INTO colors(name) VALUES ('cyan')
INSERT INTO colors(name) VALUES ('purple')
INSERT INTO items_colors(item_id, color_id) VALUES (1, 1)
INSERT INTO items_colors(item_id, color_id) VALUES (1, 2)
INSERT INTO items_colors(item_id, color_id) VALUES (1, 3)
INSERT INTO items_colors(item_id, color_id) VALUES (2, 1)
INSERT INTO items_colors(item_id, color_id) VALUES (2, 4)
INSERT INTO items_colors(item_id, color_id) VALUES (2, 5)
INSERT INTO items_colors(item_id, color_id) VALUES (3, 3)
INSERT INTO items_colors(item_id, color_id) VALUES (3, 4)
INSERT INTO items_colors(item_id, color_id) VALUES (3, 6)
SELECT i.*
FROM items i
WHERE 2 = (
SELECT COUNT(*)
FROM items_colors ic
JOIN colors c
ON ic.color_id = c.color_id
WHERE i.item_id = ic.item_id
AND c.name IN ('white', 'green')
)
Within the "IN" clause, supply the list of values the user has selected in the UI (you have to build the parameter list dynamically). You also have to supply the number of elements the user has selected ("2" in my example solution).
So the query in application will look like this:
SELECT i.*
FROM items i
WHERE @count = (
SELECT COUNT(*)
FROM items_colors ic
JOIN colors c
ON ic.color_id = c.color_id
WHERE i.item_id = ic.item_id
AND c.name IN (@color1, @color2, ..., @colorN)
)
(Where @count is the number of @colorX parameters.)
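Building the parameter list dynamically might look roughly like this. The sketch uses Python with SQLite (not the C# and SQL Server setup from the question) so that it can be run as-is, and the helper name items_matching is made up for illustration. Only the placeholders are generated as text; the color values themselves are always passed as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE colors (color_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE items_colors (
        item_id INTEGER NOT NULL REFERENCES items(item_id),
        color_id INTEGER NOT NULL REFERENCES colors(color_id),
        PRIMARY KEY (item_id, color_id)
    );
    INSERT INTO items VALUES (1, 'item 1'), (2, 'item 2'), (3, 'item 3');
    INSERT INTO colors VALUES (1, 'white'), (2, 'green'), (3, 'blue'),
                              (4, 'red'), (5, 'cyan'), (6, 'purple');
    INSERT INTO items_colors VALUES (1, 1), (1, 2), (1, 3),
                                    (2, 1), (2, 4), (2, 5),
                                    (3, 3), (3, 4), (3, 6);
""")


def items_matching(conn, selected_colors):
    """Return the names of items that have every one of the selected colors."""
    colors = sorted(set(selected_colors))  # duplicates would break the count
    if not colors:
        return []
    # One "?" placeholder per selected color.
    placeholders = ", ".join("?" for _ in colors)
    sql = f"""
        SELECT i.name
        FROM items AS i
        WHERE ? = (
            SELECT COUNT(*)
            FROM items_colors AS ic
            JOIN colors AS c ON ic.color_id = c.color_id
            WHERE i.item_id = ic.item_id
              AND c.name IN ({placeholders})
        )
        ORDER BY i.item_id
    """
    # First parameter is the count, followed by the color names.
    return [row[0] for row in conn.execute(sql, [len(colors), *colors])]


print(items_matching(conn, ["white"]))           # ['item 1', 'item 2']
print(items_matching(conn, ["white", "green"]))  # ['item 1']
```

De-duplicating the selection before counting matters: if the same color were passed twice, the count parameter would be 2 while the subquery could match at most one row, and nothing would be returned.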
