updating column based off of parent information - sql-server

I am a beginner at SQL Server and I have a question about how best to do this.
I have a table that looks like this:
ID Parent Level
1 NULL 0
2 1 1
3 1 1
4 2 2
5 2 2
6 3 2
7 2 2
8 5 4
9 4 3
10 6 3
11 6 3
As you can see, all the entries have a Parent and a Level and the database is organized in a tree structure. There are some entries where the Level is set incorrectly such as entry ID #8. The Parent of 8 is 5 and ID 5 has a level of 2 so the level of 8 should be 3 and not 4. There are many incorrect Level values in my table and I'm not sure how to fix this. So far I have this:
UPDATE myTable
SET level=level-1
FROM myTable
WHERE ???;
I am not sure how to fill in the WHERE part or whether this is the best way to do this. Any suggestions are gladly appreciated.

This will show you the rows that have issues.
select
a.id,
a.level,
b.level as parentlevel
from
tablename a
join tablename b on a.parent = b.id
where
a.level <> b.level+1
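To go from finding the bad rows to fixing them, the same self-join can drive an UPDATE. The following is a minimal sketch, assuming the table is called tablename with columns id, parent and level as in the query above; the variable @rows is introduced here purely for illustration. A row is corrected against its parent's current level, so one pass may not be enough when the parent is itself wrong. The sketch therefore repeats until a pass changes nothing.

```sql
-- Sketch only: tablename, id, parent and level follow the SELECT above.
-- @rows is a hypothetical helper variable tracking rows changed per pass.
DECLARE @rows int;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- Set each mismatched row to its parent's current level + 1.
    UPDATE a
    SET a.level = b.level + 1
    FROM tablename a
    JOIN tablename b ON a.parent = b.id
    WHERE a.level <> b.level + 1;

    -- Stop once a full pass finds nothing left to fix.
    SET @rows = @@ROWCOUNT;
END;
```

Rows whose level is NULL are not matched by the <> comparison, and the root row (which has no parent) is never touched, so this assumes the root's level is already correct. It also assumes the parent links contain no cycles; with a cycle the loop would not terminate.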

If you are using SQL Server 2005 or SQL Server 2008, you can use a recursive CTE (common table expression). The Books Online article is pretty straightforward, but here's how you can do it with your code.
-- Create a test table and insert values
CREATE TABLE dbo.ctetest (childid int primary key not null, parentid int null, level int null);
INSERT INTO dbo.ctetest (childid, parentid) SELECT 1, NULL;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 2, 1;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 3, 1;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 4, 2;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 5, 2;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 6, 3;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 7, 2;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 8, 5;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 9, 4;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 10, 6;
INSERT INTO dbo.ctetest (childid, parentid) SELECT 11, 6;
-- Update table with level data from recursive CTE
WITH recursivecte (childid, parentid, level)
AS
(SELECT childid
, parentid
, 0 AS level
FROM dbo.ctetest
WHERE parentid IS NULL
UNION ALL
SELECT ct.childid
, ct.parentid
, rc.level + 1 AS level
FROM dbo.ctetest ct
JOIN recursivecte rc
ON ct.parentid = rc.childid)
UPDATE ct
SET level = rc.level
FROM dbo.ctetest ct
JOIN recursivecte rc
ON ct.childid = rc.childid;
-- Verify results
SELECT *
FROM dbo.ctetest;
Here's the output from the above query:
Child ID Parent ID Level
1 NULL 0
2 1 1
3 1 1
4 2 2
5 2 2
6 3 2
7 2 2
8 5 3
9 4 3
10 6 3
11 6 3
Please note I tested the above code using SQL Server 2008. I'm assuming it will work in SQL Server 2005 since CTEs were introduced in 2005.

Related

Most Efficient Way to Query a Table with Children, Grandchildren, etc

I have a table with a list of Jobs that have children, grandchildren, etc. There is no limit on the level of hierarchy it goes down. The table has ID, Name, and ParentID. So for example, a Job table could look like:
ID   Name         Parent ID
1    Education    null
2    IT           null
3    Teacher      1
4    MS Teacher   3
5    7th Grade    4
6    Sys Admin    2
7    HS Teacher   3
8    12th Grade   7
9    IT Support   6
10   Developer    2
There is also a UserToJob table that is just the JobID and UserID. A person could be listed in more than one Job.
I'm looking for the most efficient way to get all people with a specified job and all of its descendants, so if I query for Education then it returns Education, Teacher, MS Teacher, 7th Grade, HS Teacher, and 12th Grade.
Right now my best attempt looks like
with
Closure AS (
select j.ID as AncestorID, j.ID as DescendantID, 0 as Depth from Jobs j
UNION ALL
select CTE.AncestorID, j.ID, CTE.Depth + 1 from Jobs j
inner join Closure CTE on j.ParentID = CTE.DescendantID
),
Job AS ( select j.UserID as ID from UserToJob j
where j.JobID in (select DescendantID from Closure where AncestorID in (1))
)
I want it to be able to work querying more than one job at a time, for example if I wanted all Education and Sys Admins then I'd change AncestorID in (1) to AncestorID in (1, 6) in the final line of my attempt.
You have your where clause in the wrong place. You want to limit the root of your recursive cte to only return the first level rows you are concerned with.
This should point you in the right direction.
declare @Jobs table
(
ID int
, Name varchar(50)
, ParentID int
)
insert @Jobs values
(1, 'Education', null)
, (2, 'IT', null)
, (3, 'Teacher', 1)
, (4, 'MS Teacher', 3)
, (5, '7th Grade', 4)
, (6, 'Sys Admin', 2)
, (7, 'HS Teacher', 3)
, (8, '12th Grade', 7)
, (9, 'IT Support', 6)
, (10, 'Developer', 2)
select *
from @Jobs;
with Closure AS
(
select j.ID as AncestorID
, j.ID as DescendantID
, 0 as Depth
from @Jobs j
where j.ID in (1, 6)
UNION ALL
select CTE.AncestorID
, j.ID
, CTE.Depth + 1
from @Jobs j
inner join Closure CTE on j.ParentID = CTE.DescendantID
)
select *
from Closure
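To get from the closure rows to the people the question asks for, the Closure CTE can be joined to UserToJob. This is a rough sketch rather than a tuned solution: @Jobs repeats the sample data, while the @UserToJob table variable and its rows are invented here for illustration (the question only says the real table holds JobID and UserID).

```sql
-- Sketch only: @Jobs mirrors the sample data above.
-- @UserToJob and its rows are hypothetical, made up for this example.
declare @Jobs table (ID int, Name varchar(50), ParentID int);
insert @Jobs values
  (1, 'Education', null), (2, 'IT', null), (3, 'Teacher', 1),
  (4, 'MS Teacher', 3), (5, '7th Grade', 4), (6, 'Sys Admin', 2),
  (7, 'HS Teacher', 3), (8, '12th Grade', 7), (9, 'IT Support', 6),
  (10, 'Developer', 2);

declare @UserToJob table (JobID int, UserID int);
insert @UserToJob values (5, 100), (8, 100), (9, 101), (10, 102);

with Closure as
(
    -- Anchor: only the jobs being searched for (Education and Sys Admin).
    select j.ID as AncestorID, j.ID as DescendantID, 0 as Depth
    from @Jobs j
    where j.ID in (1, 6)
    union all
    -- Recursive step: walk down to children, grandchildren, and so on.
    select c.AncestorID, j.ID, c.Depth + 1
    from @Jobs j
    inner join Closure c on j.ParentID = c.DescendantID
)
-- DISTINCT because a user may hold several jobs in the same subtree.
select distinct u.UserID
from @UserToJob u
inner join Closure c on u.JobID = c.DescendantID;
```

With these made-up rows, users 100 and 101 come back and user 102 (Developer, outside both subtrees) does not.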

Selecting the smallest value in one column per group

I have a table that looks like the following which was created using the following code...
SELECT Orders.ID, Orders.CHECKIN_DT_TM, Orders.CATALOG_TYPE,
Orders.ORDER_STATUS, Orders.ORDERED_DT_TM, Orders.COMPLETED_DT_TM,
Min(DateDiff("n",Orders.ORDERED_DT_TM,Orders.COMPLETED_DT_TM)) AS
Time_to_complete
FROM Orders
GROUP BY Orders.ORDER_ID, Orders.ID,
Orders.CHECKIN_DT_TM, Orders.CATALOG_TYPE, Orders.ORDERED_DT_TM,
Orders.COMPLETED_DT_TM HAVING (((Orders.CATALOG_TYPE)="radiology"));
ID Time_to_complete ... .....
1 5
1 7
1 8
2 23
2 6
3 7
4 16
4 14
I'd like to add to this code which would select the smallest Time_to_complete value per subject ID. Leaving the desired table:
ID Time_to_complete ... .....
1 5
2 6
3 7
4 14
I'm using Access and prefer to continue using Access to finish this code but I do have the option to use SQL Server if this is not possible in Access. Thanks!
I suspect you need a correlated subquery:
SELECT O.*, DateDiff("n", O.ORDERED_DT_TM, O.COMPLETED_DT_TM) AS Time_to_complete
FROM Orders O
WHERE DateDiff("n", O.ORDERED_DT_TM, O.COMPLETED_DT_TM) = (SELECT Min(DateDiff("n", O1.ORDERED_DT_TM, O1.COMPLETED_DT_TM))
FROM Orders O1
WHERE O1.ORDER_ID = O.ORDER_ID AND . . .
);
EDIT: If you want unique records, you can do this instead:
SELECT O.*, DateDiff("n", O.ORDERED_DT_TM, O.COMPLETED_DT_TM) AS Time_to_complete
FROM Orders O
WHERE o.pk = (SELECT TOP (1) o1.pk
FROM Orders O1
WHERE O1.ORDER_ID = O.ORDER_ID AND . . .
ORDER BY DateDiff("n", O1.ORDERED_DT_TM, O1.COMPLETED_DT_TM) ASC
);
pk is the identity column that uniquely identifies each row in the Orders table, so change the name to match your schema.
Have a look at this:
DECLARE @myTable AS TABLE (ID INT, Time_to_complete INT);
INSERT INTO @myTable
VALUES (1, 5)
, (1, 7)
, (1, 8)
, (2, 23)
, (2, 6)
, (3, 7)
, (4, 16)
, (4, 14);
WITH cte AS
(SELECT *
, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Time_to_complete) AS RN
FROM @myTable)
SELECT cte.ID
, cte.Time_to_complete
FROM cte
WHERE RN = 1;
Results:
ID Time_to_complete
----------- ----------------
1 5
2 6
3 7
4 14
It numbers the rows within each group, then selects the first row of each group. You should be able to adjust your code to use this technique. If in doubt, wrap your entire query in a CTE first, then apply the technique here.
It's worth becoming familiar with this process as it gets used in a lot of places - especially around de-duping data.
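As a rough illustration of "wrap your entire query in a CTE first", here is how the technique might look against order data on SQL Server. Treat it as a sketch under assumptions: @Orders is a hypothetical stand-in for the Orders table, cut down to the columns that matter, and its rows are made up. Note that T-SQL spells the Access interval "n" as minute, and that ROW_NUMBER is not available in Access, so this only applies if you move the query to SQL Server.

```sql
-- Sketch only: @Orders is a hypothetical stand-in for the Orders table;
-- the rows below are invented for illustration.
DECLARE @Orders TABLE (ID INT, CATALOG_TYPE VARCHAR(20),
                       ORDERED_DT_TM DATETIME, COMPLETED_DT_TM DATETIME);
INSERT INTO @Orders VALUES
  (1, 'radiology', '20180101 08:00', '20180101 08:05'),
  (1, 'radiology', '20180101 09:00', '20180101 09:07'),
  (2, 'radiology', '20180101 10:00', '20180101 10:23'),
  (2, 'radiology', '20180101 11:00', '20180101 11:06'),
  (2, 'lab',       '20180101 12:00', '20180101 12:01');

WITH timed AS
(   -- Step 1: the base query, wrapped so Time_to_complete is a plain column.
    SELECT o.ID
         , DATEDIFF(minute, o.ORDERED_DT_TM, o.COMPLETED_DT_TM) AS Time_to_complete
    FROM @Orders o
    WHERE o.CATALOG_TYPE = 'radiology'
), ranked AS
(   -- Step 2: number the rows within each ID, shortest time first.
    SELECT t.ID
         , t.Time_to_complete
         , ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY t.Time_to_complete) AS RN
    FROM timed t
)
-- Step 3: keep only the first row of each ID.
SELECT ID, Time_to_complete
FROM ranked
WHERE RN = 1;
```

With these sample rows the query returns one row per ID: 5 minutes for ID 1 and 6 minutes for ID 2.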
Try this:
DECLARE @myTable AS TABLE (ID INT, Time_to_complete INT);
INSERT INTO @myTable
VALUES (1, 5)
, (1, 7)
, (1, 8)
, (2, 23)
, (2, 6)
, (3, 7)
, (4, 16)
, (4, 14);
SELECT O.ID, O.Time_to_complete
FROM @myTable O
WHERE o.Time_to_complete = (Select min(m.Time_to_complete) FROM @myTable m
Where o.id=m.ID
);
Result:
ID Time_to_complete
1 5
2 6
3 7
4 14

SQL JOIN on list of IDs in a column

I've got a multi-tiered object in my database called MyFolder. MyFolder can be a child of another MyFolder at infinite levels. The table is defined as follows:
CREATE TABLE dbo.MyFolders
(
MyFolderId INT IDENTITY(1,1) NOT NULL,
ParentMyFolderId INT NULL,
Name NVARCHAR(50) NOT NULL,
Depth INT NOT NULL,
Ancestry NVARCHAR(max) NOT NULL,
CONSTRAINT PK_MyFolders PRIMARY KEY CLUSTERED (MyFolderId ASC),
CONSTRAINT FK_MyFolders_MyFolders FOREIGN KEY(ParentMyFolderId) REFERENCES dbo.MyFolders (MyFolderId)
)
It has data like:
MyFolderId ParentMyFolderId Name Depth Ancestry
1 NULL Folder1 0 /
2 1 Folder1A 1 /1/
3 1 Folder1B 1 /1/
4 1 Folder1C 1 /1/
5 4 Folder1C1 2 /1/4/
6 4 Folder1C2 2 /1/4/
7 6 Folder1C2a 3 /1/4/6/
8 6 Folder1C2b 3 /1/4/6/
This works quite well for everything needed in my system. However, it gets tricky if I want to retrieve a query like the following:
MyFolderId Name
1 Folder1
2 Folder1/Folder1A
3 Folder1/Folder1B
4 Folder1/Folder1C
5 Folder1/Folder1C/Folder1C1
6 Folder1/Folder1C/Folder1C2
7 Folder1/Folder1C/Folder1C2/Folder1C2a
8 Folder1/Folder1C/Folder1C2/Folder1C2b
Is there a way to JOIN on the ancestry field in order to get the ancestor names? Or another way using the ParentMyFolderId column? I do have a table-valued split string function called SplitString(value, delimiter).
This can be done with a recursive query: at each level, append the current folder name to the path built so far.
Query:
;WITH Source (MyFolderId, ParentMyFolderId, Name, Depth, Ancestry)
AS (
SELECT 1, NULL, 'Folder1', 0, '/'
UNION ALL
SELECT 2, 1, 'Folder1A', 1, '/1/'
UNION ALL
SELECT 3, 1, 'Folder1B', 1, '/1/'
UNION ALL
SELECT 4, 1, 'Folder1C', 1, '/1/'
UNION ALL
SELECT 5, 4, 'Folder1C1', 2, '/1/4/'
UNION ALL
SELECT 6, 4, 'Folder1C2', 2, '/1/4/'
UNION ALL
SELECT 7, 6, 'Folder1C2a', 3, '/1/4/6/'
UNION ALL
SELECT 8, 6, 'Folder1C2b', 3, '/1/4/6/'
),
cte AS
(
SELECT S.MyFolderID, S.ParentMyFolderId, CAST(S.Name AS VARCHAR(MAX)) AS Name
FROM Source AS S
WHERE ParentMyFolderId IS NULL
UNION ALL
SELECT S.MyFolderID, S.ParentMyFolderId, C.Name + '/' + S.Name
FROM Source AS S
INNER JOIN cte AS C
ON C.MyFolderId = S.ParentMyFolderId
)
SELECT *
FROM cte
Here's a recursive CTE as mentioned in the comments:
WITH TreeStructure(MyFolderId, Name) AS
(
SELECT MyFolderId, CONVERT(varchar(500), Name)
FROM MyFolders WHERE ParentMyFolderId IS NULL
UNION ALL
SELECT sd.MyFolderId, CONVERT(varchar(500), t.Name + '/' + sd.Name)
FROM MyFolders sd
JOIN TreeStructure t ON sd.ParentMyFolderId = t.MyFolderId
WHERE sd.ParentMyFolderId IS NOT NULL
)
SELECT * FROM TreeStructure
Results:
MyFolderId Name
----------- ----------------------------------------
1 Folder1
2 Folder1/Folder1A
3 Folder1/Folder1B
4 Folder1/Folder1C
5 Folder1/Folder1C/Folder1C1
6 Folder1/Folder1C/Folder1C2
7 Folder1/Folder1C/Folder1C2/Folder1C2a
8 Folder1/Folder1C/Folder1C2/Folder1C2b

Search within ColA duplicates against specific unique vals in ColB to exclude all of ColA

I apologize in advance; I feel like I'm missing something really stupid simple. (And let's ignore the database structure, as I'm kind of locked into that.)
I have, let's use customer orders - an order number can be shipped to more than one place. For the sake of ease I'm just illustrating three but it could be more than that (home, office, gift, gift2, gift 3, etc)
So my table is:
Customer orders:
OrderID MailingID
--------------------
1 1
1 2
1 3
2 1
3 1
3 3
4 1
4 2
4 3
What I need to find is OrderIDs that have been shipped to MailingID 1 but not 2 (basically what I need to find is orderID 2 and 3 above).
If it matters, I'm using SQL Server 2012 Express.
Thanks
Maybe this could help:
create table #temp(
orderID int,
mailingID int
)
insert into #temp
select 1, 1 union all
select 1, 2 union all
select 1, 3 union all
select 2, 1 union all
select 3, 1 union all
select 3, 3 union all
select 4, 1 union all
select 4, 2 union all
select 4, 3
-- find orderIDs that have been shipped to mailingID = 1
select
distinct orderID
from #temp
where mailingID = 1
except
-- find orderIDs that have been shipped to mailingID = 2
select
orderID
from #temp
where mailingID = 2
drop table #temp
A simple subquery with the NOT IN operator should work.
SELECT DISTINCT a.OrderID
FROM <tablename> a
WHERE a.mailingID = 1
AND a.orderid NOT IN (SELECT b.orderid
FROM <tablename> b
WHERE b.mailingID = 2)
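One thing to watch with NOT IN: if the subquery ever returns a NULL orderid, the outer query returns no rows at all. A NOT EXISTS variant of the same idea avoids that. Below is a small sketch using the question's sample rows; the table variable name @CustomerOrders is made up for the example.

```sql
-- Sketch only: @CustomerOrders is a hypothetical name holding the question's sample rows.
declare @CustomerOrders table (OrderID int, MailingID int);
insert into @CustomerOrders values
  (1, 1), (1, 2), (1, 3), (2, 1), (3, 1), (3, 3), (4, 1), (4, 2), (4, 3);

-- Orders shipped to MailingID 1 for which no MailingID 2 row exists.
select distinct a.OrderID
from @CustomerOrders a
where a.MailingID = 1
  and not exists (select 1
                  from @CustomerOrders b
                  where b.OrderID = a.OrderID
                    and b.MailingID = 2);
```

For the sample data this returns OrderID 2 and 3, the two orders the question is looking for.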

Tsql group by clause with exceptions

I have a problem with a query.
This is the data (order by Timestamp):
Data
ID Value Timestamp
1 0 2001-1-1
2 0 2002-1-1
3 1 2003-1-1
4 1 2004-1-1
5 0 2005-1-1
6 2 2006-1-1
7 2 2007-1-1
8 2 2008-1-1
I need to extract distinct values and the date of the first occurrence. The exception is that rows should be grouped only if the run is not interrupted by a different value in that timeframe.
So the data I need is:
ID Value Timestamp
1 0 2001-1-1
3 1 2003-1-1
5 0 2005-1-1
6 2 2006-1-1
I've made this work with a complicated query, but I'm sure there is an easier way to do it; I just can't think of it. Could anyone help?
This is what I started with - probably could work with that. This is a query that should locate when a value is changed.
> SELECT * FROM Data d1 join Data d2 ON d1.Timestamp < d2.Timestamp and
> d1.Value <> d2.Value
It could probably be done with good use of the row_number function, but I can't manage it.
Sample data:
declare @T table (ID int, Value int, Timestamp date)
insert into @T(ID, Value, Timestamp) values
(1, 0, '20010101'),
(2, 0, '20020101'),
(3, 1, '20030101'),
(4, 1, '20040101'),
(5, 0, '20050101'),
(6, 2, '20060101'),
(7, 2, '20070101'),
(8, 2, '20080101')
Query:
;With OrderedValues as (
select *,ROW_NUMBER() OVER (ORDER By TimeStamp) as rn --TODO - specific columns better than *
from @T
), Firsts as (
select
ov1.* --TODO - specific columns better than *
from
OrderedValues ov1
left join
OrderedValues ov2
on
ov1.Value = ov2.Value and
ov1.rn = ov2.rn + 1
where
ov2.ID is null
)
select * --TODO - specific columns better than *
from Firsts
I didn't rely on the ID values being sequential and without gaps. If they are, you can omit OrderedValues (using the table and ID in place of OrderedValues and rn). The Firsts query simply finds rows that have no immediately preceding row with the same Value.
Result:
ID Value Timestamp rn
----------- ----------- ---------- --------------------
1 0 2001-01-01 1
3 1 2003-01-01 3
5 0 2005-01-01 5
6 2 2006-01-01 6
You can order by rn if you need the results in this specific order.
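If you are on SQL Server 2012 or later (an assumption, since the question does not state a version), the same "no matching previous row" test can be written with the LAG window function instead of a self-join. Here is a minimal sketch on the same sample data; the CTE name Marked and the column PrevValue are names chosen for this example.

```sql
-- Sketch only: assumes SQL Server 2012+ for LAG.
-- Marked and PrevValue are hypothetical names for this example.
declare @T table (ID int, Value int, Timestamp date);
insert into @T (ID, Value, Timestamp) values
  (1, 0, '20010101'), (2, 0, '20020101'), (3, 1, '20030101'),
  (4, 1, '20040101'), (5, 0, '20050101'), (6, 2, '20060101'),
  (7, 2, '20070101'), (8, 2, '20080101');

with Marked as (
    -- Pair each row with the Value of the row just before it in time.
    select ID, Value, Timestamp
         , LAG(Value) over (order by Timestamp) as PrevValue
    from @T
)
-- Keep the very first row and every row whose Value differs from its predecessor.
select ID, Value, Timestamp
from Marked
where PrevValue is null
   or PrevValue <> Value
order by Timestamp;
```

On the sample data this keeps IDs 1, 3, 5 and 6. If Value can itself be NULL, the comparison needs extra handling, because a NULL PrevValue would no longer mean only "first row".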
