Finding the topmost item in a denormalized hierarchy - sql-server

I'm working with a set of tables that establishes a denormalized location structure with the following schema:
Location (Name,Id)
Sublocation1 (Name,Id,LocationId)
Sublocation2 (Name,Id,Sublocation1Id)
Sublocation3 (Name,Id,Sublocation2Id)
And a table that tracks the association between a user and each level:
UserLocation (User,LocationId,Sublocation1Id,Sublocation2Id,Sublocation3Id)
Access to a higher level location grants access to any level under it, so the second row in the following example is superfluous, but the third row is not:
User  Location       Sublocation1   Sublocation2  Sublocation3
----------------------------------------------------------------
Joe   Houston Plant  West Building  NULL          NULL
Joe   Houston Plant  West Building  Third Floor   Room 42
Joe   Houston Plant  East Building  Second Floor  Room 21
The third row grants Joe access to just Room 21, not to the other Sublocation3s under Second Floor.
Question: How can I find all the records that grant the highest level of access without granting additional permissions to Joe? My goal is to be able to trim all these extraneous entries out of my database.
I can see a number of ways to solve this, but nothing that I've been able to translate into set-based logic to make a good query.

Given the following table:
CREATE TABLE UserLocation(
UserId int,
LocationId int,
Sublocation1Id int,
Sublocation2Id int,
Sublocation3Id int
)
The following query gives you the highest level accesses:
-- Highest level access:
SELECT * -- Level 3 grants with no higher level grants
FROM UserLocation UL
WHERE
UL.Sublocation3Id IS NOT NULL
AND NOT EXISTS (
SELECT * -- higher level grant
FROM UserLocation UL1
WHERE
UL1.Sublocation3Id IS NULL
AND (UL1.Sublocation2Id IS NULL OR UL1.Sublocation2Id = UL.Sublocation2Id)
AND (UL1.Sublocation1Id IS NULL OR UL1.Sublocation1Id = UL.Sublocation1Id)
AND (UL1.LocationId = UL.LocationId)
AND (UL1.UserId = UL.UserId) -- only the same user's grants count
)
UNION ALL
SELECT * -- Level 2 grants with no higher level grants
FROM UserLocation UL
WHERE
UL.Sublocation3Id IS NULL
AND UL.Sublocation2Id IS NOT NULL
AND NOT EXISTS (
SELECT * -- higher level grant
FROM UserLocation UL1
WHERE
UL1.Sublocation3Id IS NULL
AND UL1.Sublocation2Id IS NULL
AND (UL1.Sublocation1Id IS NULL OR UL1.Sublocation1Id = UL.Sublocation1Id)
AND (UL1.LocationId = UL.LocationId)
AND (UL1.UserId = UL.UserId) -- only the same user's grants count
)
UNION ALL
SELECT * -- Level 1 grants with no higher level grants
FROM UserLocation UL
WHERE
UL.Sublocation3Id IS NULL
AND UL.Sublocation2Id IS NULL
AND UL.Sublocation1Id IS NOT NULL
AND NOT EXISTS (
SELECT * -- higher level grant
FROM UserLocation UL1
WHERE
UL1.Sublocation3Id IS NULL
AND UL1.Sublocation2Id IS NULL
AND UL1.Sublocation1Id IS NULL
AND (UL1.LocationId = UL.LocationId)
AND (UL1.UserId = UL.UserId) -- only the same user's grants count
)
UNION ALL
SELECT * -- Level 0 grants
FROM UserLocation UL
WHERE
UL.Sublocation3Id IS NULL
AND UL.Sublocation2Id IS NULL
AND UL.Sublocation1Id IS NULL
The following query shows you the superfluous grants:
-- Superfluous grants (a higher level grant exists)
SELECT * -- Level 3 grants with higher level grants
FROM UserLocation UL
WHERE
UL.Sublocation3Id IS NOT NULL
AND EXISTS (
SELECT * -- higher level grant
FROM UserLocation UL1
WHERE
UL1.Sublocation3Id IS NULL
AND (UL1.Sublocation2Id IS NULL OR UL1.Sublocation2Id = UL.Sublocation2Id)
AND (UL1.Sublocation1Id IS NULL OR UL1.Sublocation1Id = UL.Sublocation1Id)
AND (UL1.LocationId = UL.LocationId)
AND (UL1.UserId = UL.UserId) -- only the same user's grants count
)
UNION ALL
SELECT * -- Level 2 grants with higher level grants
FROM UserLocation UL
WHERE
UL.Sublocation3Id IS NULL
AND UL.Sublocation2Id IS NOT NULL
AND EXISTS (
SELECT * -- higher level grant
FROM UserLocation UL1
WHERE
UL1.Sublocation3Id IS NULL
AND UL1.Sublocation2Id IS NULL
AND (UL1.Sublocation1Id IS NULL OR UL1.Sublocation1Id = UL.Sublocation1Id)
AND (UL1.LocationId = UL.LocationId)
AND (UL1.UserId = UL.UserId) -- only the same user's grants count
)
UNION ALL
SELECT * -- Level 1 grants with higher level grants
FROM UserLocation UL
WHERE
UL.Sublocation3Id IS NULL
AND UL.Sublocation2Id IS NULL
AND UL.Sublocation1Id IS NOT NULL
AND EXISTS (
SELECT * -- higher level grant
FROM UserLocation UL1
WHERE
UL1.Sublocation3Id IS NULL
AND UL1.Sublocation2Id IS NULL
AND UL1.Sublocation1Id IS NULL
AND (UL1.LocationId = UL.LocationId)
AND (UL1.UserId = UL.UserId) -- only the same user's grants count
)
I assume that if the id of a location level L is null then the id of the location level L+1 is null.
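To actually trim the table, the same coverage test can be folded into a single DELETE. This is only a sketch, under these assumptions: LocationId is never NULL, the NULL assumption stated above holds, and grants are compared per UserId. Exact duplicate rows are left alone, because neither one is strictly more general than the other.

```sql
-- Sketch: delete every grant that is covered by a strictly more general
-- grant of the same user. Assumes LocationId is never NULL.
DELETE UL
FROM UserLocation AS UL
WHERE EXISTS (
    SELECT 1
    FROM UserLocation AS UL1
    WHERE UL1.UserId = UL.UserId
      AND UL1.LocationId = UL.LocationId
      -- every level that UL1 specifies must match UL
      AND (UL1.Sublocation1Id IS NULL OR UL1.Sublocation1Id = UL.Sublocation1Id)
      AND (UL1.Sublocation2Id IS NULL OR UL1.Sublocation2Id = UL.Sublocation2Id)
      AND (UL1.Sublocation3Id IS NULL OR UL1.Sublocation3Id = UL.Sublocation3Id)
      -- and UL1 must stop at a higher level than UL
      AND (   (UL1.Sublocation1Id IS NULL AND UL.Sublocation1Id IS NOT NULL)
           OR (UL1.Sublocation2Id IS NULL AND UL.Sublocation2Id IS NOT NULL)
           OR (UL1.Sublocation3Id IS NULL AND UL.Sublocation3Id IS NOT NULL))
);
```

Running the statement as a SELECT first (replace `DELETE UL` with `SELECT UL.*`) is a cheap way to review what would be removed.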

Let's presume that the UserLocation table has an Id column.
If it doesn't, you can generate one using ROW_NUMBER.
To identify strong rules (the minimum set of existing access rules that is equivalent to all existing rules) or weak rules (existing rules that are redundant because they are also covered by other, "stronger" rules), you can join the UserLocation table with itself:
with access as (
select * from ( values
(1, 'Joe', 'Houston Plant', 'West Building', NULL, NULL ),
(2, 'Joe', 'Houston Plant', 'West Building', 'Third Floor', 'Room 42'),
(3, 'Joe', 'Houston Plant', 'East Building', 'Second Floor', 'Room 21'),
(4, 'Mark', 'Houston Plant', 'West Building', NULL, NULL ),
(5, 'Mark', 'Houston Plant', 'West Building', 'Third Floor', 'Room 42'),
(6, 'Mark', 'Houston Plant', 'West Building', 'Second Floor', 'Room 21'),
(7, 'Bob', null, null, NULL, NULL ),
(8, 'Bob', 'Houston Plant', 'West Building', 'Third Floor', 'Room 42'),
(9, 'Bob', 'Houston Plant', 'West Building', 'Second Floor', 'Room 21')
) as v(Id, Usr, Location1, Location2, Location3, Location4)
)
select distinct
weak.* -- or strong.*
from access strong
inner JOIN
access weak on weak.Usr = strong.Usr
and weak.Id <> strong.Id
and (strong.Location1 is null or weak.Location1 = strong.Location1)
and (strong.Location2 is null or weak.Location2 = strong.Location2)
and (strong.Location3 is null or weak.Location3 = strong.Location3)
and (strong.Location4 is null or weak.Location4 = strong.Location4)
Edit:
I realized that a user might have only a single rule, like:
(10, 'John', 'Houston Plant', 'West Building', 'Third Floor', 'Room 42'),
(11, 'Ana', 'Houston Plant', 'West Building', NULL, NULL )
Since the above query uses an INNER JOIN, these rules are not reported in either set (weak or strong). If you think about it, you can consider these rules neutral, so to speak, because:
they do not override other rules
they are not overridden by other rules
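If the goal is the set of rules to keep (strong plus neutral), one option is to flip the join into a NOT EXISTS. This is a sketch, not the query from the answer above. It reuses the same sample rows plus the two neutral ones, and it assumes there are no exact duplicate rules, since duplicates would eliminate each other.

```sql
with access as (
    select * from ( values
        (1,  'Joe',  'Houston Plant', 'West Building', NULL,           NULL     ),
        (2,  'Joe',  'Houston Plant', 'West Building', 'Third Floor',  'Room 42'),
        (3,  'Joe',  'Houston Plant', 'East Building', 'Second Floor', 'Room 21'),
        (4,  'Mark', 'Houston Plant', 'West Building', NULL,           NULL     ),
        (5,  'Mark', 'Houston Plant', 'West Building', 'Third Floor',  'Room 42'),
        (6,  'Mark', 'Houston Plant', 'West Building', 'Second Floor', 'Room 21'),
        (7,  'Bob',  NULL,            NULL,            NULL,           NULL     ),
        (8,  'Bob',  'Houston Plant', 'West Building', 'Third Floor',  'Room 42'),
        (9,  'Bob',  'Houston Plant', 'West Building', 'Second Floor', 'Room 21'),
        (10, 'John', 'Houston Plant', 'West Building', 'Third Floor',  'Room 42'),
        (11, 'Ana',  'Houston Plant', 'West Building', NULL,           NULL     )
    ) as v(Id, Usr, Location1, Location2, Location3, Location4)
)
-- Keep a rule only if no other rule of the same user covers it.
select a.*
from access a
where not exists (
    select 1
    from access s
    where s.Usr = a.Usr
      and s.Id <> a.Id
      and (s.Location1 is null or s.Location1 = a.Location1)
      and (s.Location2 is null or s.Location2 = a.Location2)
      and (s.Location3 is null or s.Location3 = a.Location3)
      and (s.Location4 is null or s.Location4 = a.Location4)
)
order by a.Id;
-- With the rows above this keeps Ids 1, 3, 4, 7, 10 and 11.
```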

Related

I have 10 employees and real-time data coming in, and I need to distribute the inserted records equally (Insert records)

SQL Server query help: I have 10 employees and data arriving in real time, and I need to distribute the inserted records equally among the employees.
Example:
If Data=1 goes to EmployeeId=1, the next record, Data=2, should be inserted for
EmployeeId=2, and this cycle continues as raw data is received.
USE tempdb;
GO
/* Just setting up some tables to test with... */
CREATE TABLE dbo.Employee (
EmployeeID INT NOT NULL
CONSTRAINT pk_Employee
PRIMARY KEY CLUSTERED,
FirstName VARCHAR(20) NOT NULL,
LastName VARCHAR(20) NOT NULL,
DepartmentID TINYINT NOT NULL,
PrimaryJobTitle VARCHAR(20) NOT NULL
);
GO
/* Note: I'm assuming that only a certain subset of employees will be assigned "data".
In this case, those employees are in department 4, with a primary job title of "Do Stuff"... */
INSERT dbo.Employee (EmployeeID, FirstName, LastName, DepartmentID, PrimaryJobTitle) VALUES
( 1, 'Jane', 'Doe', 1, 'CEO'),
( 2, 'Alex', 'Doe', 2, 'CIO'),
( 3, 'Bart', 'Doe', 3, 'CFO'),
( 4, 'Cami', 'Doe', 4, 'COO'),
( 5, 'Dolt', 'Doe', 3, 'Accountant'),
( 6, 'Elen', 'Doe', 4, 'Production Manager'),
( 7, 'Flip', 'Doe', 4, 'Do Stuff'),
( 8, 'Gary', 'Doe', 4, 'Do Stuff'),
( 9, 'Hary', 'Doe', 2, 'App Dev'),
(10, 'Jill', 'Doe', 4, 'Do Stuff'),
(11, 'Kent', 'Doe', 4, 'Do Stuff'),
(12, 'Lary', 'Doe', 4, 'Do Stuff'),
(13, 'Many', 'Doe', 4, 'Do Stuff'),
(14, 'Norm', 'Doe', 4, 'Do Stuff'),
(15, 'Paul', 'Doe', 4, 'Do Stuff'),
(16, 'Qint', 'Doe', 3, 'Accountant'),
(17, 'Ralf', 'Doe', 4, 'Do Stuff'),
(18, 'Saul', 'Doe', 4, 'Do Stuff'),
(19, 'Tony', 'Doe', 4, 'Do Stuff'),
(20, 'Vinn', 'Doe', 4, 'Do Stuff');
GO
CREATE TABLE dbo.WorkAssignment (
WorkAssignmentID INT IDENTITY(1,1) NOT NULL
CONSTRAINT pk_WorkAssignment
PRIMARY KEY CLUSTERED,
WorkOrder INT NOT NULL,
AssignedTo INT NOT NULL
CONSTRAINT fk_WorkAssignment_AssignedTo
FOREIGN KEY REFERENCES dbo.Employee(EmployeeID)
);
GO
--===================================================================
/* This is where the actual solution begins... */
/*
Blindly assigning work orders in “round-robin” order is pretty simple but probably not applicable to the real world.
It seems unlikely that all employees who will be assigned work will all have their EmployeeIDs assigned 1 - N without any gaps…
If new work assignments were to start at 1 every time, the employees with low number IDs would end up being assigned more work
than those with the highest IDs… and… if “assignment batches” tend to be smaller than the employee count, employees with
high ID numbers may never get any work assigned to them.
This solution deals with both potential problems by putting the “assignable” employees into a #EmployAssignmentOrder table where
the AssignmentOrder guarantees a clean unbroken sequence of numbers, no matter the actual EmployeeID values, and picks up
where the last assignment left off.
*/
IF OBJECT_ID('tempdb..#EmployAssignmentOrder', 'U') IS NOT NULL
DROP TABLE #EmployAssignmentOrder;
CREATE TABLE #EmployAssignmentOrder (
EmployeeID INT NOT NULL,
AssignmentOrder INT NOT NULL
);
DECLARE
@LastAssignedTo INT = ISNULL((SELECT TOP (1) wa.AssignedTo FROM dbo.WorkAssignment wa ORDER BY wa.WorkAssignmentID DESC), 0),
@AssignableEmpCount INT = 0;
INSERT #EmployAssignmentOrder (EmployeeID, AssignmentOrder)
SELECT
e.EmployeeID,
AssignmentOrder = ROW_NUMBER() OVER (ORDER BY CASE WHEN e.EmployeeID <= @LastAssignedTo THEN e.EmployeeID * 1000 ELSE e.EmployeeID END )
FROM
dbo.Employee e
WHERE
e.DepartmentID = 4
AND e.PrimaryJobTitle = 'Do Stuff';
SET @AssignableEmpCount = @@ROWCOUNT;
ALTER TABLE #EmployAssignmentOrder ADD PRIMARY KEY CLUSTERED (AssignmentOrder);
/* Using an "inline tally" to generate new work orders...
This won’t be part of your final working solution BUT you should recognize that we are relying on the fact that the
ROW_NUMBER() function is generating an ordered number sequence, 1 - N... You’ll need to generate a similar sequence in your
production solution as well.
*/
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_Tally (n) AS (
SELECT TOP (1999)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n2 a CROSS JOIN cte_n2 b
)
INSERT dbo.WorkAssignment (WorkOrder, AssignedTo)
SELECT
WorkOrder = t.n + DATEDIFF(SECOND, '20180101', GETDATE()),
eao.EmployeeID
FROM
cte_Tally t
JOIN #EmployAssignmentOrder eao
ON ISNULL(NULLIF(t.n % @AssignableEmpCount, 0), @AssignableEmpCount) = eao.AssignmentOrder;
-- Check the newly inserted values...
SELECT
*
FROM
dbo.WorkAssignment wa
ORDER BY
wa.WorkAssignmentID;
--===================================================================
-- cleanup...
/*
DROP TABLE dbo.WorkAssignment;
DROP TABLE dbo.Employee;
DROP TABLE #EmployAssignmentOrder;
*/
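The modulo expression in the join condition is what turns the 1 - N sequence into a repeating 1 - count cycle. A minimal standalone illustration, using a hypothetical count of 3 and a hard-coded sequence instead of the tally CTE:

```sql
-- Hypothetical values: 3 assignable employees, 7 incoming work orders.
DECLARE @AssignableEmpCount INT = 3;

SELECT
    t.n,
    -- n % count yields 0 for exact multiples; NULLIF/ISNULL maps that 0 back to count
    Slot = ISNULL(NULLIF(t.n % @AssignableEmpCount, 0), @AssignableEmpCount)
FROM (VALUES (1),(2),(3),(4),(5),(6),(7)) AS t (n)
ORDER BY t.n;
-- Slot comes out as 1, 2, 3, 1, 2, 3, 1
```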

T-SQL prepare dynamic COALESCE

As attached in screenshot, there are two tables.
Configuration:
Detail
Using the Configuration and Detail tables, I would like to populate the IdentificationType and IDerivedIdentification columns in the Detail table.
The following logic should be used when deriving these columns:
The Configuration table holds an order of preference, which the user can change dynamically (i.e. if the country is Austria, the ID preference should be LEI, then TIN (in case LEI is blank), then CONCAT (if both are blank, then some other logic)).
In the case of ContactId = 3, the country is BG, so LEI should be checked first; since it's NULL, CCPT = 456 will be picked.
I could have used COALESCE and a CASE statement if hardcoding were allowed.
Can you please suggest an alternative approach?
Regards
Digant
Assuming that this is some horrendous data dump and you are trying to clean it up, here is some SQL to throw at it. :) I was able to capture your image text via Adobe Acrobat > Excel.
(I also built the schema for you at: http://sqlfiddle.com/#!6/8f404/12)
Firstly, the correct thing to do is fix the glaring problem, and that's the table structure. Assuming you can't, here's a solution.
It unpivots the columns LEI, NIND, CCPT and TIN from the Detail table, as well as FirstPref, SecondPref and ThirdPref from the Configuration table. Doing this helps to normalize the data, although it costs you major performance if there are no plans to fix the data structure or you cannot fix it.
After that you are simply joining Detail.ContactId to DerivedTypes.ContactId, then DerivedPrefs.ISOCountryCode to Detail.CountrylSOCountryCode, and DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType.
If you use an inner join rather than the left join you can remove the RANK() function, but it will not show all ContactIds, only those that have a value in their LEI, NIND, CCPT or TIN columns. I think that's a better solution anyway, because why would you want to see an error mixed into a report? Write a separate report for those with no values in those columns.
Lastly, the TOP (1) WITH TIES allows you to display one record per ContactId and allows the record with the error to still display. Hope this helps.
CREATE TABLE Configuration
(ISOCountryCode varchar(2), CountryName varchar(8), FirstPref varchar(6), SecondPref varchar(6), ThirdPref varchar(6))
;
INSERT INTO Configuration
(ISOCountryCode, CountryName, FirstPref, SecondPref, ThirdPref)
VALUES
('AT', 'Austria', 'LEI', 'TIN', 'CONCAT'),
('BE', 'Belgium', 'LEI', 'NIND', 'CONCAT'),
('BG', 'Bulgaria', 'LEI', 'CCPT', 'CONCAT'),
('CY', 'Cyprus', 'LEI', 'NIND', 'CONCAT')
;
CREATE TABLE Detail
(ContactId int, FirstName varchar(1), LastName varchar(3), BirthDate varchar(4), CountrylSOCountryCode varchar(2), Nationality varchar(2), LEI varchar(9), NIND varchar(9), CCPT varchar(9), TIN varchar(9))
;
INSERT INTO Detail
(ContactId, FirstName, LastName, BirthDate, CountrylSOCountryCode, Nationality, LEI, NIND, CCPT, TIN)
VALUES
(1, 'A', 'DES', NULL, 'AT', 'AT', '123', '4345', NULL, NULL),
(2, 'B', 'DEG', NULL, 'BE', 'BE', NULL, '890', NULL, NULL),
(3, 'C', 'DEH', NULL, 'BG', 'BG', NULL, '123', '456', NULL),
(4, 'D', 'DEi', NULL, 'BG', 'BG', NULL, NULL, NULL, NULL)
;
SELECT TOP (1) with ties Detail.ContactId,
FirstName,
LastName,
BirthDate,
CountrylSOCountryCode,
Nationality,
LEI,
NIND,
CCPT,
TIN,
ISNULL(DerivedPrefs.ldentificationType, 'ERROR') ldentificationType,
IDerivedIdentification,
RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END) AS PrefRank
FROM
Detail
LEFT JOIN
(
SELECT
ContactId,
LEI,
NIND,
CCPT,
TIN
FROM Detail
) DetailUNPVT
UNPIVOT
(IDerivedIdentification FOR ldentificationType IN
(LEI, NIND, CCPT, TIN)
)AS DerivedTypes
ON DerivedTypes.ContactId = Detail.ContactId
LEFT JOIN
(
SELECT
ISOCountryCode,
CountryName,
FirstPref,
SecondPref,
ThirdPref
FROM
Configuration
) ConfigUNPIVOT
UNPIVOT
(ldentificationType FOR Pref IN
(FirstPref, SecondPref, ThirdPref)
)AS DerivedPrefs
ON DerivedPrefs.ISOCountryCode = Detail.CountrylSOCountryCode
and DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType
ORDER BY RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END)
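A different way to get the same "first non-empty identifier in preference order" result is OUTER APPLY over two small VALUES lists instead of UNPIVOT. This is only a sketch against the Configuration and Detail tables created above. The aliases (PrefOrder, IdType, IdValue) are made up for the illustration, and a preference such as CONCAT that has no matching column is simply skipped.

```sql
SELECT
    d.ContactId,
    ISNULL(x.IdType, 'ERROR') AS IdentificationType,
    x.IdValue                 AS DerivedIdentification
FROM Detail AS d
LEFT JOIN Configuration AS c
    ON c.ISOCountryCode = d.CountrylSOCountryCode
OUTER APPLY (
    -- highest-priority preference whose column holds a value
    SELECT TOP (1) p.IdType, v.IdValue
    FROM (VALUES (1, c.FirstPref), (2, c.SecondPref), (3, c.ThirdPref)) AS p (PrefOrder, IdType)
    JOIN (VALUES ('LEI', d.LEI), ('NIND', d.NIND), ('CCPT', d.CCPT), ('TIN', d.TIN)) AS v (IdType, IdValue)
        ON v.IdType = p.IdType
    WHERE v.IdValue IS NOT NULL
    ORDER BY p.PrefOrder
) AS x
ORDER BY d.ContactId;
```

With the sample rows above, ContactId 3 resolves to CCPT / 456 and ContactId 4 falls through to ERROR.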

The maximum recursion 100 has been exhausted before statement completion (SQL Server)

In SQL Server, I have this simplified table and I'm trying to get a list of all employees with their domain manager:
IF OBJECT_ID('tempdb.dbo.#employees') IS NOT NULL DROP TABLE #employees
CREATE TABLE #employees (
empid int,
empname varchar(50),
mgrid int,
func varchar(50)
)
INSERT INTO #employees VALUES(1, 'Jeff', 2, 'Designer')
INSERT INTO #employees VALUES(2, 'Luke', 4, 'Head of designers')
INSERT INTO #employees VALUES(3, 'Vera', 2, 'Designer')
INSERT INTO #employees VALUES(4, 'Peter', 5, 'Domain Manager')
INSERT INTO #employees VALUES(5, 'Olivia', NULL, 'CEO')
;
WITH Emp_CTE AS (
SELECT empid, empname, func, mgrid AS dommgr
FROM #employees
UNION ALL
SELECT e.empid, e.empname, e.func, e.mgrid AS dommgr
FROM #employees e
INNER JOIN Emp_CTE ecte ON ecte.empid = e.mgrid
WHERE ecte.func <> 'Domain Manager'
)
SELECT * FROM Emp_CTE
So the output I want is:
empid empname func dommgr
1 Jeff Designer 4
2 Luke Head of designers 4
3 Vera Designer 4
Instead I get this error:
Msg 530, Level 16, State 1, Line 17
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
What am I doing wrong? Is it actually possible with CTE?
Edit: There was indeed an error in the data, the error has gone now, but the result isn't what I want:
empid empname func dommgr
1 Jeff Designer 2
2 Luke Head of designers 4
3 Vera Designer 2
4 Peter Domain Manager 5
5 Olivia CEO NULL
4 Peter Domain Manager 5
1 Jeff Designer 2
3 Vera Designer 2
You had two employees referencing each other in mgrid, so each was the manager of the other. That caused the infinite recursion. There was also a gap in the recursion tree because the domain manager was not referenced anywhere. You have fixed the sample data by changing Luke's mgrid to 4. Now there is no gap and no logical issue anymore.
But you also had no root entry for the recursion: the first (anchor) query has no filter.
You can use this query:
WITH DomainManager AS (
SELECT empid, empname, func, dommgr = empid, Hyrarchy = 1
FROM #employees
WHERE func = 'Domain Manager'
UNION ALL
SELECT e.empid, e.empname, e.func, dommgr, Hyrarchy = Hyrarchy +1
FROM #employees e
INNER JOIN DomainManager dm ON dm.empid = e.mgrid
)
SELECT * FROM DomainManager
WHERE func <> 'Domain Manager'
ORDER BY empid
Note that the entry/root point for the CTE is the Domain Manager, because you want to find every employee's domain manager's id. This id is carried down the hierarchy. The final select needs to filter out the Domain Manager, because you only want his id for every employee; you don't want to include him in the result set.
The result of the query is:
empid empname func dommgr Hyrarchy
1 Jeff Designer 4 3
2 Luke Head of designers 4 2
3 Vera Designer 4 3
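Before running a recursive CTE over data like this, a quick self-join can reveal the kind of mutual reference described above. The sketch below uses hypothetical rows with a deliberate cycle, not the original data, and it only detects cycles of length two.

```sql
-- Hypothetical data with a deliberate two-node cycle (Luke -> Vera -> Luke).
DECLARE @emp TABLE (empid int PRIMARY KEY, empname varchar(50), mgrid int);
INSERT INTO @emp VALUES
    (1, 'Jeff',   2),
    (2, 'Luke',   3),
    (3, 'Vera',   2),
    (4, 'Peter',  5),
    (5, 'Olivia', NULL);

-- Pairs of employees who are each other's manager.
SELECT a.empid, a.empname, b.empid AS other_empid, b.empname AS other_empname
FROM @emp AS a
JOIN @emp AS b
    ON a.mgrid = b.empid
   AND b.mgrid = a.empid
WHERE a.empid < b.empid;  -- report each pair only once
-- returns a single row: 2 Luke, 3 Vera
```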
The error message is raised because the data contains a circular reference between Luke and Vera.
It's easier to perform hierarchical queries if you add a hierarchyid field. SQL Server provides functions that return descendants, ancestors and the level in a hierarchy. hierarchyid fields can be indexed resulting in improved performance.
In the employee example, you can add a level field :
declare @employees table (
empid int PRIMARY KEY,
empname varchar(50),
mgrid int,
func varchar(50),
level hierarchyid not null,
INDEX IX_Level (level)
)
INSERT INTO @employees VALUES
(1, 'Jeff', 2, 'Designer' ,'/5/4/2/1/'),
(2, 'Luke', 4, 'Head of designers','/5/4/2/'),
(3, 'Vera', 2, 'Designer' ,'/5/4/2/3/'),
(4, 'Peter', 5, 'Domain Manager' ,'/5/4/'),
(5, 'Olivia', NULL, 'CEO' ,'/5/')
;
/5/4/2/1/ is the string representation of a hierarchyid value. It's essentially the path in the hierarchy that leads to a particular row.
To find all subordinates of domain managers, excluding the managers themselves, you can write:
with DMs as
(
select EmpID,level
from @employees
where func='Domain Manager'
)
select
PCs.empid,
PCs.empname as Name,
PCs.func as Class,
DMs.empid as DM,
PCs.level.GetLevel() as THAC0,
PCs.level.GetLevel()- DMs.level.GetLevel() as NextLevel
from
@employees PCs
inner join DMs on PCs.level.IsDescendantOf(DMs.level)=1
where DMs.EmpID<>PCs.empid;
The CTE is only used for convenience.
The result is :
empid Name Class DM THAC0 NextLevel
1 Jeff Designer 4 4 2
2 Luke Head of designers 4 3 1
3 Vera Designer 4 4 2
The CTE returns all DMs and their hierarchyid value. The IsDescendantOf() check tests whether a row is a descendant of a DM or not. GetLevel() returns the level of the row in the hierarchy. By subtracting the DM's level from the employee's, we get the distance between them.
Like others said, you have here a problem with data (Vera).
IF OBJECT_ID('tempdb.dbo.#employees') IS NOT NULL
DROP TABLE #employees
CREATE TABLE #employees (
empid int,
empname varchar(50),
mgrid int,
func varchar(50)
)
INSERT INTO #employees VALUES(1, 'Jeff', 2, 'Designer')
INSERT INTO #employees VALUES(2, 'Luke', 3, 'Head of designers')
INSERT INTO #employees VALUES(3, 'Vera', 4, 'Designer') --**mgrid = 4 instead 2**
INSERT INTO #employees VALUES(4, 'Peter', 5, 'Domain Manager')
INSERT INTO #employees VALUES(5, 'Olivia', NULL, 'CEO')
;WITH Emp_CTE AS
(
SELECT empid, empname, func, mgrid AS dommgr, 0 AS Done
FROM #employees
UNION ALL
SELECT ecte.empid, ecte.empname, ecte.func,
CASE WHEN e.func = 'Domain Manager' THEN e.empid ELSE e.mgrid END AS dommgr,
CASE WHEN e.func = 'Domain Manager' THEN 1 ELSE 0 END AS Done
FROM Emp_CTE AS ecte
INNER JOIN #employees AS e ON
ecte.dommgr = e.empid
WHERE ecte.Done = 0--emp.func <> 'Domain Manager'
)
SELECT *
FROM Emp_CTE
WHERE Done = 1

how to conditionally update table

I have one address table (containing k_id as the PK and address) and one add_hist log table (containing k_id, address, change date), i.e. it holds every address per id and the date on which the address changed.
I want to write an update query that updates the address column in the address table by fetching the latest address from the add_hist table. I am almost done with my query, and it fetches the correct result too. But if the address table is already up to date, I don't want to update it. Here is my query. Please review and correct it to get the desired result.
update address a set k_add =
(select kad from (
select h.k_id kid, h.k_add kad, h.chg_dt from add_hist h,
(select k_id, max(chg_dt) ch from add_hist
group by k_id
) h1
where h1.k_id = h.k_id
and h1.ch=h.chg_dt
) h2
where h2.kid = a.k_id)
;
You could use a merge instead of an update:
merge into address a
using (
select k_id, max(k_add) keep (dense_rank last order by chg_dt) as k_add
from add_hist
group by k_id
) h
on (a.k_id = h.k_id)
when matched then
update set a.k_add = h.k_add
where (a.k_add is null and h.k_add is not null)
or (a.k_add is not null and h.k_add is null)
or a.k_add != h.k_add;
The query in the using clause finds the most recent address for each ID from the history table. When a matching ID exists on the main table that is updated - but only if the value is different, because of the where clause.
With some dummy data:
create table address (k_id number primary key, k_add varchar2(20));
create table add_hist (k_id number, k_add varchar2(20), chg_dt date);
insert into address (k_id, k_add) values (1, 'Address 1');
insert into address (k_id, k_add) values (2, 'Address 2');
insert into address (k_id, k_add) values (3, null);
insert into address (k_id, k_add) values (4, null);
insert into add_hist (k_id, k_add, chg_dt) values (1, 'Address 1', date '2017-01-01');
insert into add_hist (k_id, k_add, chg_dt) values (1, 'Address 2', date '2017-01-02');
insert into add_hist (k_id, k_add, chg_dt) values (1, 'Address 1', date '2017-01-03');
insert into add_hist (k_id, k_add, chg_dt) values (2, 'Address 1', date '2017-01-01');
insert into add_hist (k_id, k_add, chg_dt) values (2, 'Address 2', date '2017-01-02');
insert into add_hist (k_id, k_add, chg_dt) values (2, 'Address 3', date '2017-01-03');
insert into add_hist (k_id, k_add, chg_dt) values (3, 'Address 1', date '2017-01-01');
insert into add_hist (k_id, k_add, chg_dt) values (3, null, date '2017-01-02');
insert into add_hist (k_id, k_add, chg_dt) values (4, 'Address 1', date '2017-01-01');
commit;
Running your update statement gets:
4 rows updated.
select * from address;
K_ID K_ADD
---------- --------------------
1 Address 1
2 Address 3
3
4 Address 1
After rolling back to the starting state, running the merge gets:
2 rows merged.
select * from address;
K_ID K_ADD
---------- --------------------
1 Address 1
2 Address 3
3
4 Address 1
Same final result, but 2 rows merged rather than 4 rows updated.
(If you run the merge without the where clause, all four rows are still affected; without the null checks, only the row with ID 2 is updated.)
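The three-branch null check can also be written with DECODE, which in Oracle treats two NULLs as equal. A sketch of the same merge against the address and add_hist tables above, with only the where clause changed:

```sql
merge into address a
using (
  select k_id, max(k_add) keep (dense_rank last order by chg_dt) as k_add
  from add_hist
  group by k_id
) h
on (a.k_id = h.k_id)
when matched then
  update set a.k_add = h.k_add
  -- decode returns 0 when both values are equal or both are null, 1 otherwise
  where decode(a.k_add, h.k_add, 0, 1) = 1;
```

Whether this reads better than the explicit null checks is a matter of taste; the rows it touches should be the same.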
You can achieve the desired result with an UPDATE statement. Specifically, you need to "update through a join." The syntax has to be precise, though.
Using the same setup as in Alex's answer, the following update statement updates only the rows whose address actually differs (IDs 2 and 4 with that data).
EDIT: See Alex Poole's comments below this Answer. The solution proposed here will work only in Oracle 12.1 and above. The problem is not the "update through join" concept, but the source rowset being the result of an aggregation. It has to do with the way in which Oracle knows, at compile time, that the "join" column in the source rowset is unique (it has no duplicates). In older versions of Oracle, an explicit unique or primary key constraint or index was required. Of course, when we GROUP BY <col>, the <col> will be unique in the result set of an aggregation, but it will not have a unique constraint or index on it. It seems Oracle recognized this situation, and since 12.1 it allows update through join where the source table is the result of an aggregation, as shown in this example.
update
( select a.k_add as current_address, q.new_address
from (
select k_id,
min(k_add) keep (dense_rank last order by chg_dt) as new_address
from add_hist
group by k_id
) q
join
address a on a.k_id = q.k_id
)
set current_address = new_address
where current_address != new_address
or current_address is null and new_address is not null
or current_address is not null and new_address is null
;

Suggestions for improving slow performance of subquery

I've tried to illustrate the problem in the (made-up) example below. Essentially, I want to filter records in the primary table based on content in a secondary table. When I attempted this using subqueries, our application performance took a big hit (some queries nearly 10x slower).
In this example I want to return all case notes for a customer EXCEPT for the ones that have references to products 1111 and 2222 in the detail table:
select cn.id, cn.summary from case_notes cn
where customer_id = 2
and exists (
select 1 from case_note_details cnd
where cnd.case_note_id = cn.id
and cnd.product_id not in (1111,2222)
)
I tried using a join as well:
select distinct cn.id, cn.summary from case_notes cn
join case_note_details cnd
on cnd.case_note_id = cn.id
and cnd.product_id not in (1111,2222)
where customer_id = 2
In both cases the execution plan shows two clustered index scans. Any suggestions for other methods or tweaks to improve performance?
Schema:
CREATE TABLE case_notes
(
id int primary key,
employee_id int,
customer_id int,
order_id int,
summary varchar(50)
);
CREATE TABLE case_note_details
(
id int primary key,
case_note_id int,
product_id int,
detail varchar(1024)
);
Sample data:
INSERT INTO case_notes
(id, employee_id, customer_id, order_id, summary)
VALUES
(1, 1, 2, 1000, 'complaint1'),
(2, 1, 2, 1001, 'complaint2'),
(3, 1, 2, 1002, 'complaint3'),
(4, 1, 2, 1003, 'complaint4');
INSERT INTO case_note_details
(id, case_note_id, product_id, detail)
VALUES
(1, 1, 1111, 'Note 1, order 1000, complaint about product 1111'),
(2, 1, 2222, 'Note 1, order 1000, complaint about product 2222'),
(3, 2, 1111, 'Note 2, order 1001, complaint about product 1111'),
(4, 2, 2222, 'Note 2, order 1001, complaint about product 2222'),
(5, 3, 3333, 'Note 3, order 1002, complaint about product 3333'),
(6, 3, 4444, 'Note 3, order 1002, complaint about product 4444'),
(7, 4, 5555, 'Note 4, order 1003, complaint about product 5555'),
(8, 4, 6666, 'Note 4, order 1003, complaint about product 6666');
You have a clustered index scan because you are not accessing your case_note_details table by its id but via non-indexed columns.
I suggest adding an index to the case_note_details table on case_note_id, product_id.
If you are always accessing the case_note_details via the case_note_id, you might also restructure your primary key to be case_note_id, detail_id. There is no need for an independent id as primary key for dependent records. This would let you re-use your detail primary key index for joins with the header table.
Edit: add an index on customer_id as well to the case_notes table, as Manuel Rocha suggested.
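As a sketch, the suggested indexes could look like this, assuming SQL Server (the INCLUDE clause is specific to it). The index names are made up for the example.

```sql
-- Supports the correlated lookup by case_note_id and the product_id filter.
CREATE INDEX ix_case_note_details_note_product
    ON case_note_details (case_note_id, product_id);

-- Supports the customer_id filter; summary is included so the query
-- can be answered from the index without touching the clustered index.
CREATE INDEX ix_case_notes_customer
    ON case_notes (customer_id) INCLUDE (summary);
```

Compare the execution plans before and after; whether the optimizer switches from scans to seeks depends on the real data volume and statistics.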
When using "exists" I always limit results with "TOP" as below:
select cn.id
,cn.summary
from case_notes as cn
where customer_id = 2
and exists (
select TOP 1 1
from case_note_details as cnd
where cnd.case_note_id = cn.id
and cnd.product_id not in (1111,2222)
)
On table case_notes, create an index on customer_id, and on table case_note_details, create an index on case_note_id and product_id.
Then try executing both queries. They should perform better now.
Try also this query
select
cn.id,
cn.summary
from
case_notes cn
where
cn.customer_id = 2 and
cn.id in
(
select
distinct cnd.case_note_id
from
case_note_details cnd
where
cnd.product_id not in (1111,2222)
)
Did you try "in" instead of "exists"? This sometimes performs differently:
select cn.id, cn.summary from case_notes cn
where customer_id = 2
and cn.id in (
select cnd.case_note_id from case_note_details cnd
where cnd.product_id not in (1111,2222)
)
Of course, check indexes.

Resources