Match id or masterid or masterid of masterid etc - sql-server

I have a Location table:
Id int
MasterLocationId int
Name varchar(50)
A location can have a master location and so on. There's no specific limit to how many levels this could be.
How can I search for any locations with a certain ID, or locations where a master record has that ID, or its master record has that ID, and so on?
I'm not really sure what to even search for on Google here - perhaps this type of situation has a name and I'm not sure what it's called?
I've searched for recursive TSQL and couldn't find anything.
Is this possible?
Thanks

As @Isaac and @Jeroen said in the comments, a recursive CTE is what you need.
Sample data:
CREATE TABLE Locations (
Id int,
MasterLocationId int,
Name varchar(50)
);
insert into Locations (Id, Name, MasterLocationId)
values
(1, 'Alice', null),
(2, 'Bob', 1),
(3, 'Charlie', 2),
(4, 'Dave', 3),
(5, 'Erin', 4),
(6, 'Frank', 5),
(7, 'Grace', 6),
(8, 'Heidi', 7),
(9, 'Ivan', 8),
(10,'Judy', 9),
(11,'Kevin', 10),
(12,'Lucy', 6),
(13,'Mike', 7),
(14,'Noah', 8),
(15,'Olivia', 9),
(16,'Peggy', 10),
(17,'Rupert', 6),
(18,'Sybil', 7),
(19,'Ted', 8),
(20,'Trudy', 9),
(21,'Uma', 10),
(22,'Victor', 11),
(23,'Walter', 22),
(24,'Xavier', 23),
(25,'Yves', 24),
(26,'Zoe', 25);
And a query:
;
with Locations_CTE as (
-- anchor of 1st tier parents
select L1.Id, L1.Name, L1.MasterLocationId, L2.Name as MasterLocationName, 1 as MasterLocationTier
from Locations as L1
left join Locations as L2
on L1.MasterLocationId = L2.Id
-- recursive part
union all
select L1.Id, L1.Name, L2.MasterLocationId, L3.Name as MasterLocationName, L1.MasterLocationTier + 1 as MasterLocationTier
from Locations_CTE as L1
inner join Locations as L2
on L1.MasterLocationId = L2.Id
inner join Locations as L3
on L2.MasterLocationId = L3.Id
)
select *
from Locations_CTE
where MasterLocationId = 11 -- Find all locations which have Kevin as MasterLocation somewhere in a hierarchy.
or Id = 11 -- And full hierarchy for Kevin
order by Id, MasterLocationTier
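If only the descendants of one location are needed, a narrower variant can start at that location and walk downward. This is a minimal sketch against the same sample table; the names @SearchId, Descendants and Depth are made up for illustration.

```sql
-- Sketch: start at one location and walk down to every descendant.
declare @SearchId int = 11;  -- hypothetical parameter: the Id being searched for

with Descendants as (
    -- anchor: the location itself
    select Id, Name, MasterLocationId, 0 as Depth
    from Locations
    where Id = @SearchId
    union all
    -- recursive part: locations whose master is already in the result
    select L.Id, L.Name, L.MasterLocationId, D.Depth + 1
    from Locations as L
    inner join Descendants as D
        on L.MasterLocationId = D.Id
)
select Id, Name, MasterLocationId, Depth
from Descendants
order by Depth, Id;
```

With the sample data this should return Kevin followed by Victor, Walter, Xavier, Yves and Zoe. SQL Server stops a recursive CTE after 100 levels by default, so a very deep hierarchy would need an OPTION (MAXRECURSION n) hint.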

Related

SQL: Join fact on dimension using bridge table and return concatenated list for column of interest

I would like to join multiple dimension tables to a fact table. However, instead of returning all the results, I want to avoid a one-to-many relationship result. Instead, I would like to access just some of the data in one table, concatenate all of its findings, and return them into a single column so that the expected result of 1 row per fact remains. I do not want a one-to-many relationship in my result.
If I pair the answer from "How to concatenate text from multiple rows into a single text string in SQL Server" with additional columns of interest, I get an error saying that the column "is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause".
-- sample values
-- bridge
dim2Key, groupKey
1, 1
2, 1
3, 1
4, 2
-- dim2
dim2Key, attributeOne
1, 'A'
2, 'B'
3, 'C'
4, 'A'
-- dim1
dim1Key, attributeTwo, attributeThree,
1, 35, 'val1'
2, 25, 'val2'
3, 45, 'val3'
4, 55, 'val1'
-- fact1
dim1Key, factvalue1, groupKey,
1, 5, 1
2, 25, 1
3, 55, -1
4, 99, 2
-- expected values
-- fact1
dim1Key, factvalue1, groupKey, attributeTwo, attributeThree, attributeOne
1, 5, 1, 35, 'val1', 'A, B, C'
...
4, 99, 2, 55, 'val1', 'A'
It's very unclear what your schema and joins should be, but it seems you want to aggregate dim2 per row of fact1.
You can aggregate dim2 inside a correlated subquery. It's often nicer to put this in an APPLY, but you could also put it directly in the SELECT.
SELECT
fact1.dim1Key,
fact1.factvalue1,
fact1.groupKey,
dim1.attributeTwo,
dim1.attributeThree,
dim2.attributeOne
FROM fact1
JOIN dim1 ON dim1.dim1key = fact1.dim1key
CROSS APPLY (
SELECT attributeOne = STRING_AGG(dim2.attributeOne, ', ')
FROM bridge b
JOIN dim2 ON dim2.dim2key = b.dim2key
WHERE b.groupKey = fact1.groupKey
) dim2
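STRING_AGG does not guarantee the order of the concatenated values unless one is requested. A hedged sketch of the same correlated aggregate with an explicit order, assuming SQL Server 2017 or later and the table names from the question (the aliases f, b, d and agg are just for illustration):

```sql
-- Sketch: same per-fact aggregation, with a deterministic order inside the list.
select
    f.dim1Key,
    f.factvalue1,
    f.groupKey,
    agg.attributeOne
from fact1 as f
outer apply (
    -- one row per fact row; NULL when the groupKey has no bridge rows
    select attributeOne = string_agg(d.attributeOne, ', ')
                          within group (order by d.dim2Key)
    from bridge as b
    join dim2 as d on d.dim2Key = b.dim2Key
    where b.groupKey = f.groupKey
) as agg;
```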
If you don't have access to the STRING_AGG function in your version of SQL Server (it was added in SQL Server 2017), you can use FOR XML PATH to achieve the same thing.
CREATE TABLE #bridge (dim2Key int, groupKey int)
INSERT #bridge (dim2Key, groupKey)
VALUES (1, 1)
,(2, 1)
,(3, 1)
,(4, 2)
CREATE TABLE #dim2 (dim2Key int, attributeOne varchar(5))
INSERT #dim2 (dim2Key, attributeOne)
VALUES (1, 'A')
,(2, 'B')
,(3, 'C')
,(4, 'A')
CREATE TABLE #dim1 (dim1Key int, attributeTwo int, attributeThree varchar(5))
INSERT #dim1 (dim1Key, attributeTwo, attributeThree)
VALUES (1, 35, 'val1')
,(2, 25, 'val2')
,(3, 45, 'val3')
,(4, 55, 'val1')
CREATE TABLE #fact1 (dim1Key int, factvalue1 int, groupKey int)
INSERT #fact1 (dim1Key, factvalue1, groupKey)
VALUES (1, 5, 1)
,(2, 25, 1)
,(3, 55, -1)
,(4, 99, 2)
GO
;WITH pvt (groupKey, attributeOne)
AS
(
SELECT b.groupKey, d2.attributeOne
FROM #dim2 d2
JOIN #bridge b
ON d2.dim2Key = b.dim2Key
)
, dim2 AS
(
SELECT DISTINCT a.groupKey
,LEFT(r.attributeOne.value('text()[1]','nvarchar(max)') , LEN(r.attributeOne.value('text()[1]','nvarchar(max)'))-1) attributeOne
FROM pvt a
CROSS APPLY
(
SELECT attributeOne + ', '
FROM pvt r
WHERE a.groupKey = r.groupKey
FOR XML PATH(''), TYPE
) r (attributeOne)
)
SELECT f1.dim1Key, factvalue1, f1.groupKey, attributeTwo, attributeThree, attributeOne
FROM #fact1 f1
LEFT JOIN #dim1 d1
ON f1.dim1Key = d1.dim1Key
LEFT JOIN dim2 d2
ON f1.groupKey = d2.groupKey

I have 10 employees and real-time data is coming in; I need to distribute the inserted records equally (insert records)

SQL Server query help: I have 10 employees and data arrives in real time. I need to insert each incoming record so that the records are distributed equally among the employees.
Example:
The first record (Data=1) is assigned to EmployeeId=1, the next record (Data=2) is assigned to
EmployeeId=2, and the cycle continues for as long as raw data is received.
USE tempdb;
GO
/* Just setting up some tables to test with... */
CREATE TABLE dbo.Employee (
EmployeeID INT NOT NULL
CONSTRAINT pk_Employee
PRIMARY KEY CLUSTERED,
FirstName VARCHAR(20) NOT NULL,
LastName VARCHAR(20) NOT NULL,
DepartmentID TINYINT NOT NULL,
PrimaryJobTitle VARCHAR(20) NOT NULL
);
GO
/* Note: I'm assuming that only a certain subset of employees will be assigned "data".
In this case, those employees are in department 4, with a primary job title of "Do Stuff"... */
INSERT dbo.Employee (EmployeeID, FirstName, LastName, DepartmentID, PrimaryJobTitle) VALUES
( 1, 'Jane', 'Doe', 1, 'CEO'),
( 2, 'Alex', 'Doe', 2, 'CIO'),
( 3, 'Bart', 'Doe', 3, 'CFO'),
( 4, 'Cami', 'Doe', 4, 'COO'),
( 5, 'Dolt', 'Doe', 3, 'Accountant'),
( 6, 'Elen', 'Doe', 4, 'Production Manager'),
( 7, 'Flip', 'Doe', 4, 'Do Stuff'),
( 8, 'Gary', 'Doe', 4, 'Do Stuff'),
( 9, 'Hary', 'Doe', 2, 'App Dev'),
(10, 'Jill', 'Doe', 4, 'Do Stuff'),
(11, 'Kent', 'Doe', 4, 'Do Stuff'),
(12, 'Lary', 'Doe', 4, 'Do Stuff'),
(13, 'Many', 'Doe', 4, 'Do Stuff'),
(14, 'Norm', 'Doe', 4, 'Do Stuff'),
(15, 'Paul', 'Doe', 4, 'Do Stuff'),
(16, 'Qint', 'Doe', 3, 'Accountant'),
(17, 'Ralf', 'Doe', 4, 'Do Stuff'),
(18, 'Saul', 'Doe', 4, 'Do Stuff'),
(19, 'Tony', 'Doe', 4, 'Do Stuff'),
(20, 'Vinn', 'Doe', 4, 'Do Stuff');
GO
CREATE TABLE dbo.WorkAssignment (
WorkAssignmentID INT IDENTITY(1,1) NOT NULL
CONSTRAINT pk_WorkAssignment
PRIMARY KEY CLUSTERED,
WorkOrder INT NOT NULL,
AssignedTo INT NOT NULL
CONSTRAINT fk_WorkAssignment_AssignedTo
FOREIGN KEY REFERENCES dbo.Employee(EmployeeID)
);
GO
--===================================================================
/* This is where the actual solution begins... */
/*
Blindly assigning work orders in “round-robin” order is pretty simple but probably not applicable to the real world.
It seems unlikely that all employees who will be assigned work will all have their EmployeeIDs assigned 1 - N without any gaps…
If new work assignments were to start at 1 every time, the employees with low number IDs would end up being assigned more work
than those with the highest IDs… and… if “assignment batches” tend to be smaller than the employee count, employees with
high ID numbers may never get any work assigned to them.
This solution deals with both potential problems by putting the “assignable” employees into a #EmployAssignmentOrder table where
the AssignmentOrder guarantees a clean unbroken sequence of numbers, no matter the actual EmployeeID values, and picks up
where the last assignment left off.
*/
IF OBJECT_ID('tempdb..#EmployAssignmentOrder', 'U') IS NOT NULL
DROP TABLE #EmployAssignmentOrder;
CREATE TABLE #EmployAssignmentOrder (
EmployeeID INT NOT NULL,
AssignmentOrder INT NOT NULL
);
DECLARE
@LastAssignedTo INT = ISNULL((SELECT TOP (1) wa.AssignedTo FROM dbo.WorkAssignment wa ORDER BY wa.WorkAssignmentID DESC), 0),
@AssignableEmpCount INT = 0;
INSERT #EmployAssignmentOrder (EmployeeID, AssignmentOrder)
SELECT
e.EmployeeID,
AssignmentOrder = ROW_NUMBER() OVER (ORDER BY CASE WHEN e.EmployeeID <= @LastAssignedTo THEN e.EmployeeID * 1000 ELSE e.EmployeeID END )
FROM
dbo.Employee e
WHERE
e.DepartmentID = 4
AND e.PrimaryJobTitle = 'Do Stuff';
SET @AssignableEmpCount = @@ROWCOUNT;
ALTER TABLE #EmployAssignmentOrder ADD PRIMARY KEY CLUSTERED (AssignmentOrder);
/* Using an "inline tally" to generate new work orders...
This won’t be part of your final working solution BUT you should recognize that we are relying on the fact that the
ROW_NUMBER() function is generating an ordered number sequence, 1 - N...You’ll need to generate a similar sequence in your
production solution as well.
*/
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_Tally (n) AS (
SELECT TOP (1999)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n2 a CROSS JOIN cte_n2 b
)
INSERT dbo.WorkAssignment (WorkOrder, AssignedTo)
SELECT
WorkOrder = t.n + DATEDIFF(SECOND, '20180101', GETDATE()),
eao.EmployeeID
FROM
cte_Tally t
JOIN #EmployAssignmentOrder eao
ON ISNULL(NULLIF(t.n % @AssignableEmpCount, 0), @AssignableEmpCount) = eao.AssignmentOrder;
-- Check the newly inserted values...
SELECT
*
FROM
dbo.WorkAssignment wa
ORDER BY
wa.WorkAssignmentID;
--===================================================================
-- cleanup...
/*
DROP TABLE dbo.WorkAssignment;
DROP TABLE dbo.Employee;
DROP TABLE #EmployAssignmentOrder;
*/
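The core of the round-robin mapping is the modulo step in the final join. A small standalone sketch of that step, assuming a hypothetical count of 3 assignable employees (the variable @k and the column aliases are illustrative only):

```sql
-- Sketch: map a running sequence number n onto slots 1..k, cycling back to 1.
declare @k int = 3;  -- hypothetical number of assignable employees

select n.n as SequenceNo,
       ((n.n - 1) % @k) + 1 as AssignmentOrder  -- same result as ISNULL(NULLIF(n % k, 0), k)
from (values (1),(2),(3),(4),(5),(6),(7)) as n (n)
order by n.n;
```

Sequence numbers 1 to 7 map to slots 1, 2, 3, 1, 2, 3, 1, which is the cycle the answer relies on.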

SQL Server CTE left outer join

I have 2 tables in SQL Server 2008: customertest, with columns customer id (cid) and its boss id (upid), and conftest, with cid, confname, confvalue.
customertest schema and data:
conftest schema and data:
I want to know how to design a CTE so that, if a cid in conftest doesn't have a confvalue for a given confname, it keeps searching up through upid until it finds an ancestor that has that confname and confvalue.
For example, I want to get a value of 100 if I search for cid=4 (the normal case), and a value of 200 if I search for cid=7 or 8.
And if cid 7 and cid 8 have child nodes, searching for those with this CTE should also return 200 (from cid 5).
I don't have a clue how to do this. I think a CTE with some left outer join might work; could you give me an example? Thanks a lot.
If the number of levels in the hierarchy is unknown, this kind of problem is usually solved with a recursive CTE.
Example Snippet:
--
-- Using table variables for testing reasons
--
declare @customertest table (cid int primary key, upid int);
declare @conftest table (cid int, confname varchar(6) default 'budget', confvalue int);
--
-- Sample data
--
insert into @customertest (cid, upid) values
(1,0), (2,1), (3,1), (4,2), (5,2), (6,3),
(7,5), (8,5), (9,8), (10,9);
insert into @conftest (cid, confvalue) values
(1,1000), (2,700), (3,300), (4,100), (5,200), (6,300);
-- The customer that has his own budget, or not.
declare @customerID int = 10;
;with RCTE AS
(
--
-- the recursive CTE starts from here. The seed records, as one could call it.
--
select cup.cid as orig_cid, 0 as lvl, cup.cid, cup.upid, budget.confvalue
from @customertest as cup
left join @conftest budget on (budget.cid = cup.cid and budget.confname = 'budget')
where cup.cid = @customerID -- This is where we limit on the customer
union all
--
-- This is where the Recursive CTE loops till it finds nothing new
--
select RCTE.orig_cid, RCTE.lvl+1, cup.cid, cup.upid, budget.confvalue
from RCTE
join @customertest as cup on (cup.cid = RCTE.upid)
outer apply (select b.confvalue from @conftest b where b.cid = cup.cid and b.confname = 'budget') as budget
where RCTE.confvalue is null -- Loop till a budget is found
)
select
orig_cid as cid,
confvalue
from RCTE
where confvalue is not null;
Result :
cid confvalue
--- ---------
10 200
Btw, the recursive CTE uses OUTER APPLY because SQL Server doesn't allow a LEFT OUTER JOIN in the recursive part of a CTE.
If it is certain that the budget is never more than one level up the upid chain, simple left joins and a COALESCE would do.
For example:
select cup.cid, coalesce(cBudget.confvalue, upBudget.confvalue) as confvalue
from @customertest as cup
left join @conftest cBudget on (cBudget.cid = cup.cid and cBudget.confname = 'budget')
left join @conftest upBudget on (upBudget.cid = cup.upid and upBudget.confname = 'budget')
where cup.cid = 8;
I don't think you are looking for a CTE to do that, from what I understand:
CREATE TABLE CustomerTest(
CID INT,
UPID INT
);
CREATE TABLE ConfTest(
CID INT,
ConfName VARCHAR(45),
ConfValue INT
);
INSERT INTO CustomerTest VALUES
(1, 0),
(2, 1),
(3, 1),
(4, 2),
(5, 2),
(6, 3),
(7, 5),
(8, 5);
INSERT INTO ConfTest VALUES
(1, 'Budget', 1000),
(2, 'Budget', 700),
(3, 'Budget', 300),
(4, 'Budget', 100),
(5, 'Budget', 200),
(6, 'Budget', 300);
SELECT MAX(CNT.CID) AS CID,
CNT.ConfName,
MIN(CNT.ConfValue) AS ConfValue
FROM ConfTest CNT INNER JOIN CustomerTest CMT ON CMT.CID = CNT.CID
OR CMT.UPID = CNT.CID
WHERE CMT.CID = 7 -- You can test for values (8, 4) or any value you want :)
GROUP BY
CNT.ConfName;

Suggestions for improving slow performance of subquery

I've tried to illustrate the problem in the (made-up) example below. Essentially, I want to filter records in the primary table based on content in a secondary table. When I attempted this using subqueries, our application performance took a big hit (some queries nearly 10x slower).
In this example I want to return all case notes for a customer EXCEPT for the ones that have references to products 1111 and 2222 in the detail table:
select cn.id, cn.summary from case_notes cn
where customer_id = 2
and exists (
select 1 from case_note_details cnd
where cnd.case_note_id = cn.id
and cnd.product_id not in (1111,2222)
)
I tried using a join as well:
select distinct cn.id, cn.summary from case_notes cn
join case_note_details cnd
on cnd.case_note_id = cn.id
and cnd.product_id not in (1111,2222)
where customer_id = 2
In both cases the execution plan shows two clustered index scans. Any suggestions for other methods or tweaks to improve performance?
Schema:
CREATE TABLE case_notes
(
id int primary key,
employee_id int,
customer_id int,
order_id int,
summary varchar(50)
);
CREATE TABLE case_note_details
(
id int primary key,
case_note_id int,
product_id int,
detail varchar(1024)
);
Sample data:
INSERT INTO case_notes
(id, employee_id, customer_id, order_id, summary)
VALUES
(1, 1, 2, 1000, 'complaint1'),
(2, 1, 2, 1001, 'complaint2'),
(3, 1, 2, 1002, 'complaint3'),
(4, 1, 2, 1003, 'complaint4');
INSERT INTO case_note_details
(id, case_note_id, product_id, detail)
VALUES
(1, 1, 1111, 'Note 1, order 1000, complaint about product 1111'),
(2, 1, 2222, 'Note 1, order 1000, complaint about product 2222'),
(3, 2, 1111, 'Note 2, order 1001, complaint about product 1111'),
(4, 2, 2222, 'Note 2, order 1001, complaint about product 2222'),
(5, 3, 3333, 'Note 3, order 1002, complaint about product 3333'),
(6, 3, 4444, 'Note 3, order 1002, complaint about product 4444'),
(7, 4, 5555, 'Note 4, order 1003, complaint about product 5555'),
(8, 4, 6666, 'Note 4, order 1003, complaint about product 6666');
You have a clustered index scan because you are not accessing your case_note_details table by its id but via non-indexed columns.
I suggest adding an index to the case_note_details table on case_note_id, product_id.
If you are always accessing the case_note_details via the case_note_id, you might also restructure your primary key to be case_note_id, detail_id. There is no need for an independent id as primary key for dependent records. This would let you re-use your detail primary key index for joins with the header table.
Edit: add an index on customer_id as well to the case_notes table, as Manuel Rocha suggested.
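As a rough sketch, the two suggested indexes could be created like this. The index names are made up for illustration, and whether the optimizer actually uses them depends on the real data distribution.

```sql
-- Sketch of the suggested indexes; the index names are hypothetical.
-- Supports the lookup of detail rows by case note, filtered on product.
create nonclustered index ix_case_note_details_note_product
    on case_note_details (case_note_id, product_id);

-- Supports the customer_id = 2 filter on the header table.
create nonclustered index ix_case_notes_customer
    on case_notes (customer_id);
```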
When using "exists" I always limit results with "TOP" as below:
select cn.id
,cn.summary
from case_notes as cn
where customer_id = 2
and exists (
select TOP 1 1
from case_note_details as cnd
where cnd.case_note_id = cn.id
and cnd.product_id not in (1111,2222)
)
On table case_notes, create an index on customer_id, and on table case_note_details, create an index on case_note_id and product_id.
Then try executing both queries. They should perform better now.
Try also this query
select
cn.id,
cn.summary
from
case_notes cn
where
cn.customer_id = 2 and
cn.id in
(
select
distinct cnd.case_note_id
from
case_note_details cnd
where
cnd.product_id not in (1111,2222)
)
Did you try "in" instead of "exists"? This sometimes performs differently:
select cn.id, cn.summary from case_notes cn
where customer_id = 2
and cn.id in (
select cnd.case_note_id from case_note_details cnd
where cnd.product_id not in (1111,2222)
)
Of course, check indexes.
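One more thing worth checking is the filter itself. The queries above keep a note if it has at least one detail row for some other product. If the intent is instead to drop any note that references product 1111 or 2222 at all, a NOT EXISTS form states that directly. This is a sketch under that assumed reading of the question:

```sql
-- Sketch: keep only notes with no detail row for the excluded products.
select cn.id, cn.summary
from case_notes as cn
where cn.customer_id = 2
  and not exists (
        select 1
        from case_note_details as cnd
        where cnd.case_note_id = cn.id
          and cnd.product_id in (1111, 2222)
  );
```

On the sample data both forms return notes 3 and 4. They differ for a note that mixes an excluded product with another one, and this form also returns notes that have no detail rows.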

Selecting only records where a certain data does not exist

I'm trying to come up with a query which excludes certain records that have a specific value.
Here's a snippet of my code:
CREATE TABLE #myMenu
([Id] int, [dish] varchar(100), [dishtype] varchar(10), [amount] int, [ingredient] varchar(10))
;
INSERT INTO #myMenu
([Id], [dish], [dishtype], [amount], [ingredient])
VALUES
(1, 'salad', 'appetizer', 1, 'nuts'),
(1, 'salad', 'appetizer', 1, 'lettuce'),
(2, 'chicken cashew nuts', 'main', 2, 'chicken'),
(2, 'chicken cashew nuts', 'main', 9, 'nuts'),
(3, 'chicken marsala', 'main', 0, 'chicken'),
(3, 'chicken marsala', 'main', 0, 'pepper'),
(4, 'roast pork macadamia', 'main', 2, 'nuts'),
(4, 'roast pork macadamia', 'main', 2, 'pork')
;
Now what I want to do is select all dishes that don't have nuts. The result should contain only:
(3, 'chicken marsala', 'main')
The code is below, but the table you provided needs to be normalized and split into more than one table.
select [Id],[dish],[dishtype]
from #myMenu
group by [Id],[dish],[dishtype]
having sum(Case When ingredient='nuts' Then 1 Else 0 End)=0
select M.Id, M.Dish, M.DishType
from #myMenu as M inner join
( select Id, Sum( case when Ingredient = 'nuts' then 1 end ) as Nutty from #MyMenu group by Id ) as Nuts
on Nuts.Id = M.Id and Nuts.Nutty is NULL
group by M.Id, M.dish, M.dishtype
or:
select distinct M.Id, M.Dish, M.DishType
from #myMenu as M inner join
( select Id, Sum( case when Ingredient = 'nuts' then 1 end ) as Nutty from #MyMenu group by Id ) as Nuts
on Nuts.Id = M.Id and Nuts.Nutty is NULL
Select *
FROM #myMenu
WHERE ingredient != 'nuts' AND
dish NOT LIKE '%macadamia%' AND
dish NOT LIKE '%cashew%'
If you wanted to only include main dishes you can just add AND dishType = 'main'.
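Another way to express "dishes with no nuts row at all" is an anti-join with NOT EXISTS. A minimal sketch against the #myMenu table from the question (the aliases M and N are illustrative):

```sql
-- Sketch: return each dish once, only if none of its rows lists nuts.
select distinct M.Id, M.dish, M.dishtype
from #myMenu as M
where not exists (
    select 1
    from #myMenu as N
    where N.Id = M.Id
      and N.ingredient = 'nuts'
);
```

With the sample rows this leaves only (3, 'chicken marsala', 'main'), since dishes 1, 2 and 4 each have a nuts row.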
