Optimal way to select data in a tree

I have the following table relationship.
[Organisation] 1--* [UserOrganisationPermission] *--1 [User]
Finding the list of Organisations for a User is obviously pretty easy:
select o.Id
from UserOrganisationPermission p
join Organisation o on p.OrganisationId = o.Id
where p.UserId = @SomeUserId
My Organisation table is self-referencing via a ParentId column, enabling me to have a tree-like structure of organisations.
If the User has permission to an Organisation then they implicitly have been granted permission to all organisations down the tree.
I need to find a way of easily selecting those organisations.
So far I have tried adding a Path varchar(900) column to the Organisation table that contains a delimited list of the int Ids in its path. It works like this:
Whenever a new Organisation is inserted: if its ParentId is null then its Path is simply -id-; if its ParentId is not null then its Path is its parent's Path with id- appended.
Whenever an Organisation is updated: if its ParentId has changed then I run an Update Organisation command that finds all rows whose Path starts with its previous path value, and replaces that part of the Path with the new path.
e.g.
Path -1-
Path -1-2-
Path -1-2-3-
If I change the ParentId of 2 to null it will update all Organisation rows that have a Path starting with -1-2- and replace -1-2- with -2-
Path -1-
Path -2-
Path -2-3-
This way I can select all Organisation sub nodes like so:
select O.Id
from Organisation O
where O.Path like '-2-%'
Which would give me -2- and -2-3-.
I can't help but think there is a far simpler way of achieving this goal. Is there a simpler way that I am missing?

You could try building the path with a recursive CTE instead of maintaining it in an actual column.
Maybe this updated DBFiddle can get you started
with Hierarchy(id, parentid, Path) as
( -- anchor: root organisations have no parent
select o.id, o.parentid, convert(varchar(max), o.id)
from organisation o
where o.parentid is null
union all
-- recursive step: append each child's id to its parent's path
-- (id is converted explicitly, since varchar + int would attempt an int conversion)
select o.id, o.parentid, convert(varchar(max), s.Path + '-' + convert(varchar(max), o.id))
from Hierarchy s
inner join organisation o on s.id = o.parentid
)
select s.id, s.parentid, '-' + s.Path + '-' as Path
from Hierarchy s
order by id
option (maxrecursion 0);
The result is:
id | parentid | Path
1 | NULL | -1-
2 | 1 | -1-2-
3 | 2 | -1-2-3-
4 | NULL | -4-
5 | 4 | -4-5-
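If the Path string itself is not needed, the same recursive CTE idea can return a user's permitted organisations directly. The following is only a sketch under assumptions: the table variables, the sample rows, and the user id 10 are made up for illustration, while the column names follow the question.

```sql
-- Hypothetical sample data; only the column names come from the question.
declare @Organisation table (Id int primary key, ParentId int null);
declare @UserOrganisationPermission table (UserId int, OrganisationId int);

insert into @Organisation (Id, ParentId)
values (1, null), (2, 1), (3, 2), (4, null), (5, 4);
insert into @UserOrganisationPermission (UserId, OrganisationId)
values (10, 2);

declare @SomeUserId int = 10;

with Permitted (Id) as
(
    -- anchor: organisations the user has been granted directly
    select o.Id
    from @UserOrganisationPermission p
    join @Organisation o on p.OrganisationId = o.Id
    where p.UserId = @SomeUserId
    union all
    -- recursive step: children of any organisation already permitted
    select c.Id
    from Permitted x
    join @Organisation c on c.ParentId = x.Id
)
select distinct Id   -- distinct, in case a parent and its child are both granted
from Permitted
order by Id
option (maxrecursion 0);
-- returns 2 and 3 for this sample data
```

With this approach there is no Path column to keep in sync when a ParentId changes; the trade-off is that the tree is walked on every query.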

Related

SQL - join two tables based on up-to-date entries

I have two tables:
1. A table of TestModules (image: TestModules)
2. A table of TestModule_Results (image: TestModule_Results)
To get the required information for each TestModule I am using a FULL OUTER JOIN, and it works fine (image: FULL OUTER JOIN result).
But what is required is slightly different. The above picture shows that TestModuleID = 5 is listed twice, and the requirement is to list the 'up-to-date' results based on time 'ChangedAt'
Of course, I can do the following:
SELECT TOP 1 * FROM TestModule_Results
WHERE DeviceID = 'xxx' and TestModuleID = 'yyy'
ORDER BY ChangedAt DESC
But this solution is for a single row and I want to do it in a Stored Procedure.
The expected output should look like this (image: ExpectedOutput).
Any advice on how I can implement this in an SP?

Use a Common Table Expression and ROW_NUMBER to add a field identifying the newest results, if any, and select just those:
--NOTE: a Common Table Expression requires the previous command
--to be explicitly terminated; prepending a ; covers that
;WITH cteTR as (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY DeviceID, TestModuleID
ORDER BY ChangedAt DESC) AS ResultOrder
FROM TestModule_Results
--cteTR is now just like TestModule_Results but has an
--additional field ResultOrder that is 1 for the newest,
--2 for the second newest, etc. for every unique (DeviceID,TestModuleID) pair
)
SELECT *
FROM TestModules as M --Use INNER JOIN to get only modules with results,
--or LEFT OUTER JOIN to include modules without any results yet
INNER JOIN cteTR as R
ON M.DeviceID = R.DeviceID AND M.TestModuleID = R.TestModuleID
WHERE R.ResultOrder = 1
-- OR R.ResultOrder IS NULL --add if Left Outer Join
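Since the question asks for a stored procedure, the query above could be wrapped roughly as follows. This is a sketch, not a definitive implementation: the procedure name dbo.GetLatestTestModuleResults is hypothetical, and the table and column names are assumed to match the question. It uses the LEFT OUTER JOIN variant, with the ResultOrder filter moved into the ON clause so that modules without any results are still listed.

```sql
-- Hypothetical procedure name; tables and columns are assumed from the question.
CREATE PROCEDURE dbo.GetLatestTestModuleResults
AS
BEGIN
    SET NOCOUNT ON;

    WITH cteTR AS (
        SELECT *
             , ROW_NUMBER() OVER (PARTITION BY DeviceID, TestModuleID
                                  ORDER BY ChangedAt DESC) AS ResultOrder
        FROM TestModule_Results
    )
    SELECT *
    FROM TestModules AS M
    LEFT OUTER JOIN cteTR AS R
        ON M.DeviceID = R.DeviceID
       AND M.TestModuleID = R.TestModuleID
       AND R.ResultOrder = 1;   -- only the newest result per (DeviceID, TestModuleID)
END
```

It would then be called with EXEC dbo.GetLatestTestModuleResults;.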

You say "this solution is for a single row"? Excellent. Use CROSS APPLY and change the WHERE clause from hand-input literal to the fields of the original table. APPLY operates at row level.
SELECT *
FROM TestModules t
CROSS APPLY
(
SELECT TOP 1 * FROM TestModule_Results r
WHERE r.DeviceID = t.DeviceID -- put the connecting fields here
AND r.TestModuleID = t.TestModuleID
ORDER BY r.ChangedAt DESC
) tr

Find Child with Parent having specific information

I am trying to find children whose parents have some specific information stored in different relational tables.
I have four tables as shown below
Search criteria: get all "Section" levels that have an "Inventory" level as parent, with an attached user whose name contains the letter 'a' and whose role is 'employee' (please see the LevelsUser table for the relation).
I tried a CTE (common table expression) approach to find the correct Section level, but there I have to pass the level Id as a hard-coded value and I cannot search all Sections in the table.
WITH LevelsTree AS
(
SELECT Id, ParentLevelId, Level
FROM Levels
WHERE Level='Section' -- here I need to pass a value
UNION ALL
SELECT ls.Id, ls.ParentLevelId, ls.Level
FROM Levels ls
JOIN LevelsTree lt ON ls.Id = lt.ParentLevelId
)
SELECT * FROM LevelsTree
I need to find all sections that match the above criteria.
Please help me here.

For hierarchical checks you need to select from and then join to the same table Levels. So something like this should help you:
declare @parentLevelName varchar(20) = 'Inventory';
with cte as (
select distinct
l1.id,
l1.Level
from Levels l1
join Levels l2 on l2.id=l1.ParentLevelId
and l2.Level = @parentLevelName -- use variable instead of hardcoded `Inventory`
where l1.Level='Section' -- replace `Section` with a @variable containing your value
) select * from cte
join LevelUsers lu on lu.LevelId=cte.id
join Users u on u.Id = lu.UserId
and u.UserName like '%a%' -- this letter check is not efficient
join Role r on r.id=lu.RoleId and r.Role='employee'
Note that the above query selects data only from the 4 tables you described in the DB schema. However, your original query contains a reference to the HierarchyPosition table, which you haven't described. If you really need to include the HierarchyPosition reference, then specify how it relates to the other 4 tables.
Also note that the condition and u.UserName like '%a%', used to satisfy your requirement that the user name contains the letter 'a', is not efficient: the leading % prevents an index seek. If possible, consider changing the requirement to user names starting with 'a'. Then and u.UserName like 'a%' can use an index on Users.UserName, if one exists.
HTH

TSQL Group By Issues

I have a TSQL query that I am trying to group data on. The table contains records of users and the access keys they hold such as site admin, moderator etc. The PK is on User and access key because a user can exist multiple times with different keys.
I am now trying to display a table of all users and in one column, all of the keys that user holds.
If Bob had three separate records for his three separate access keys, the result should have only one record for Bob with all three of his access levels.
SELECT A.[FirstName],
A.[LastName],
A.[ntid],
A.[qid],
C.FirstName AS addedFirstName,
C.LastName AS addedLastName,
C.NTID AS addedNTID,
CONVERT(VARCHAR(100), p.TIMESTAMP, 101) AS timestamp,
(
SELECT k.accessKey,
k.keyDescription
FROM TFS_AdhocKeys AS k
WHERE p.accessKey = k.accessKey
FOR XML PATH ('key'), TYPE, ELEMENTS, ROOT ('keys')
)
FROM TFS_AdhocPermissions AS p
LEFT OUTER JOIN dbo.EmployeeTable as A
ON p.QID = A.QID
LEFT OUTER JOIN dbo.EmployeeTable AS C
ON p.addedBy = C.QID
GROUP BY a.qid
FOR XML PATH ('data'), TYPE, ELEMENTS, ROOT ('root');
END
I am trying to group the data by a.qid, but it's forcing me to group on every column in the select, which then will not be unique, so the result will contain the duplicates.
What's another approach to handle this?
Currently:
UserID | accessKey
123 | admin
123 | moderator
Desired:
UserID | accessKey
123 | admin
moderator

Recently, I was working on something and had a similar problem. Like your query, I had an inner 'for xml' with joins in the outer 'for xml'. It turned out it worked better if the joins were in the inner 'for xml'. The code is pasted below. I hope this helps.
Select
(Select Institution.Name, Institution.Id
, (Select Course.Courses_Id, Course.Expires, Course.Name
From
(Select Course.Courses_Id, Course.Expires, Courses.Name
From Institutions Course Course Join Courses On Course.Courses_Id = Courses.Id
Where Course.Institutions_Id = 31) As Course
For Xml Auto, Type, Elements) As Courses
From Institutions Institution
For Xml Auto, Elements, Root('Institutions') )

As I don't have the definitions for your other tables, I have made up some sample test data; you can follow the same approach with your own tables.
Create statement
CREATE TABLE #test(UserId INT, AccessLevel VARCHAR(20))
Insert sample data
INSERT INTO #test VALUES(123, 'admin')
,(123, 'moderator')
,(123, 'registered')
,(124, 'moderator')
,(124, 'registered')
,(125, 'admin')
By using ROW_NUMBER() you can achieve what you need
;WITH C AS(
SELECT ROW_NUMBER() OVER(PARTITION BY UserId ORDER BY UserId) As Rn
,UserId
,AccessLevel
FROM #test
)
SELECT CASE Rn
WHEN 1 THEN UserId
ELSE NULL
END AS UserId
,AccessLevel
FROM C
Output
UserId AccessLevel
------ -----------
123 admin
NULL moderator
NULL registered
124 moderator
NULL registered
125 admin
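If a single row per user with all keys in one column is acceptable, string aggregation is a different technique from the ROW_NUMBER() approach above. The sketch below assumes SQL Server 2017 or later, where STRING_AGG is available, and uses a table variable named @access (a made-up name) holding the same sample rows.

```sql
-- Same sample rows as above, in a table variable so the batch is self-contained.
DECLARE @access TABLE (UserId INT, AccessLevel VARCHAR(20));

INSERT INTO @access VALUES (123, 'admin')
,(123, 'moderator')
,(123, 'registered')
,(124, 'moderator')
,(124, 'registered')
,(125, 'admin');

-- One row per user; the keys are concatenated in alphabetical order.
SELECT UserId
      ,STRING_AGG(AccessLevel, ', ') WITHIN GROUP (ORDER BY AccessLevel) AS AccessLevels
FROM @access
GROUP BY UserId
ORDER BY UserId;
-- 123 | admin, moderator, registered
-- 124 | moderator, registered
-- 125 | admin
```

Because only UserId is grouped and the keys are aggregated, the GROUP BY no longer has to list every selected column.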

SQL SELECT from SELECT

I am trying to build a single select statement from two separate ones.
Basically I have a list of Names in a table which do repeat like so:
Name| Date
John 2014-11-22
John 2013-02-03
Joe 2012-12-12
Jack 2011-11-11
Bob 2010-10-01
Bob 2013-12-22
I need to do a Select distinct Name from Records which returns John, Joe, Jack, Bob.
I then want to do a Select on another table where I pass in the rows returned above.
SELECT Address, Phone From dbo.Details
WHERE Name = {Values from first SELECT query}
Having trouble with the syntax.

If you do not want to return any values from the subquery, you can use either IN or EXISTS
SELECT Address, Phone From dbo.Details
WHERE Name IN (SELECT DISTINCT Name FROM Records)
-- OR --
SELECT Address, Phone From dbo.Details D
WHERE EXISTS (SELECT 1 FROM Records R WHERE R.Name = D.Name)
(In many RDBMSs EXISTS can be less resource intensive than IN, although the optimizer often produces the same plan for both.)
If you want to return values from the subquery, you should use JOIN
SELECT
D.Address,
D.Phone,
R.Name -- For example
FROM
dbo.Details D
INNER JOIN dbo.Records R
ON D.Name = R.Name
SIDENOTE These are sample queries, it is possible that you have to fine tune them to match your exact requirements.

You can use:
SELECT Address, Phone, name
FROM details
-- "in" is the difference from your first query, needed due to multiple values being returned by the subquery
WHERE name in (
SELECT distinct name
FROM namesTable
)
Additionally the following should work:
SELECT d.Address, d.Phone, n.name
FROM details d
inner join (
select distinct name
from namesTable
) n on d.name = n.name

So there are two ways you can go about doing this. One, create a temporary table and perform a join (*actually in retrospect you could also join to your second table as a subquery, or use something like a CTE if you're using SQL SERVER, but the modifications if you wanted to go that route should be pretty obvious)
CREATE TEMPORARY TABLE my_table AS
{your first select query};
SELECT Address, Phone From dbo.Details
INNER JOIN my_table AS mt
ON mt.name = dbo.Details.name
Another option would be to perform an IN or EXISTS query using your select query
SELECT Address, Phone From dbo.Details
WHERE name IN (SELECT name from my_table)
Or, better yet (eg SQL Server IN vs. EXISTS Performance),
SELECT Address, Phone From dbo.Details
WHERE EXISTS (SELECT * from my_table WHERE my_table.name = dbo.Details.name)
You might have to modify the syntax slightly, depending on whether you are using MySQL or SQL Server (not sure about the latter, honestly). But this should get you started down the right path.
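For SQL Server specifically, a temporary table is created with a # prefix, for example via SELECT ... INTO, rather than with CREATE TEMPORARY TABLE. A minimal sketch follows; the table variables, the temp table name #names, and the Details rows are made up for illustration, and the Records rows are taken from the question.

```sql
-- Hypothetical stand-ins for the Records and dbo.Details tables.
DECLARE @Records TABLE (Name VARCHAR(50), [Date] DATE);
DECLARE @Details TABLE (Name VARCHAR(50), Address VARCHAR(100), Phone VARCHAR(20));

INSERT INTO @Records VALUES ('John', '2014-11-22'), ('John', '2013-02-03'),
                            ('Joe', '2012-12-12'), ('Bob', '2010-10-01');
INSERT INTO @Details VALUES ('John', '1 Main St', '555-0100'),
                            ('Joe', '2 High St', '555-0101'),
                            ('Ann', '3 Low St', '555-0102');

-- SELECT ... INTO creates the temp table #names from the first query.
SELECT DISTINCT Name
INTO #names
FROM @Records;

-- Join the details to the distinct names.
SELECT d.Address, d.Phone
FROM @Details AS d
INNER JOIN #names AS n ON n.Name = d.Name;

DROP TABLE #names;
```

For a one-off query the IN or EXISTS forms avoid the extra table altogether; a temp table mainly pays off when the list of names is reused several times.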

This will give you the names and their address and phone number:
SELECT DISTINCT N.Name, D.Address, D.Phone
FROM dbo.Details D INNER JOIN dbo.Names N ON D.Name = N.Name

When using a subquery that is not scalar (doesn't return only one value) in the where clause use IN and of course only one column in the subquery:
SELECT Address, Phone
From dbo.Details
WHERE Name IN (Select Name from Table)

Full text searching scores across multiple columns

I am using full text searching on a SQL Server database to return results from multiple tables. The simplest situation would be searching a person's name fields and a description field. The code I use to do this looks like:
select t.ProjectID as ProjectID, sum(t.rnk) as weightRank
from
(
select KEY_TBL.RANK * 1.0 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
INNER JOIN FREETEXTTABLE(Projects, Description, @SearchText) AS KEY_TBL
ON FT_TBL.ProjectID=KEY_TBL.[KEY]
union all
select KEY_TBL.RANK * 50 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
... <-- complex unimportant join
INNER JOIN People as p on pp.PersonID = p.PersonID
INNER JOIN FREETEXTTABLE(People, (FirstName, LastName), @SearchText) AS KEY_TBL
ON p.PersonID=KEY_TBL.[KEY]
) as t
group by t.ProjectID
As is (hopefully) clear above, I am trying to weight heavily on matches of a person's name over matches in a project description field. If I do a search for something like 'john' all projects with a person named john on it will be heavily weighted (as expected). The issue I am having is on searches where someone provides a full name like 'john smith'. In this case the match is much less strong on name as (I presume) only half the search terms are matching in each of the firstname / lastname columns. In many cases this means someone with an exact match of the name entered will not necessarily be returned near the top of the search results.
I have been able to correct this by searching each of the firstname / lastname fields separately and adding their scores together so my new query looks like:
select t.ProjectID as ProjectID, sum(t.rnk) as weightRank
from
(
select KEY_TBL.RANK * 1.0 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
INNER JOIN FREETEXTTABLE(Projects, Description, @SearchText) AS KEY_TBL
ON FT_TBL.ProjectID=KEY_TBL.[KEY]
union all
select KEY_TBL.RANK * 50 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
... <-- complex unimportant join
INNER JOIN People as p on pp.PersonID = p.PersonID
INNER JOIN FREETEXTTABLE(People, (FirstName), @SearchText) AS KEY_TBL
ON p.PersonID=KEY_TBL.[KEY]
union all
select KEY_TBL.RANK * 50 as rnk, FT_TBL.ProjectID as ProjectID
FROM Projects as FT_TBL
... <-- complex unimportant join
INNER JOIN People as p on pp.PersonID = p.PersonID
INNER JOIN FREETEXTTABLE(People, (LastName), @SearchText) AS KEY_TBL
ON p.PersonID=KEY_TBL.[KEY]
) as t
group by t.ProjectID
My question:
Is this the approach I should be taking, or is there some way to have the full text search operate on a list of columns as though it were a single blob of text, i.e. treat the firstname and lastname columns as a single name column, resulting in a higher-scoring match for strings that include both the person's first and last name?

I have recently run into this and have used a computed column to concatenate the required columns together into one string and then have the full text index on that column.
I have achieved the weighting by duplicating the weighted fields in the computed column.
i.e. last name appears 3 times and first name once.
ALTER TABLE dbo.person ADD
PrimarySearchColumn AS
COALESCE(NULLIF(forename,'') + ' ' + forename + ' ', '') +
COALESCE(NULLIF(surname,'') + ' ' + surname + ' ' + surname + ' ', '') PERSISTED
You must make sure you use the PERSISTED keyword so that the column isn't computed on each read.

I know this is an old question but I've come across the same issue and solved it a different way.
Rather than add computed columns to the original tables, which may not always be an option, I have created indexed views which contain the combined fields. To use the original example:
CREATE VIEW [dbo].[v_PeopleFullName]
WITH SCHEMABINDING
AS SELECT dbo.People.PersonID, ISNULL(dbo.People.FirstName + ' ', '') + dbo.People.LastName AS FullName
FROM dbo.People
GO
CREATE UNIQUE CLUSTERED INDEX UQ_v_PeopleFullName
ON dbo.[v_PeopleFullName] ([PersonID])
GO
Then I join that view in my query, along with the existing full-text predicate on the individual columns in the base table, so that I can find exact matches and partial matches in the individual columns, like so:
DECLARE @SearchText NVARCHAR(100) = ' "' + @OriginalSearchText + '" ' --For matching exact phrase
DECLARE @SearchTextWords NVARCHAR(100) = ' "' + REPLACE(@OriginalSearchText, ' ', '" OR "') + '" ' --For matching on words in phrase
SELECT FT_TBL.ProjectID as ProjectID,
ISNULL(KEY_TBL.[Rank], 0) + ISNULL(KEY_VIEW.[Rank], 0) AS [Rank]
FROM Projects as FT_TBL
INNER JOIN People as p on FT_TBL.PersonID = p.PersonID
LEFT OUTER JOIN CONTAINSTABLE(People, (FirstName, LastName), @SearchTextWords) AS KEY_TBL ON p.PersonID = KEY_TBL.[KEY]
LEFT OUTER JOIN CONTAINSTABLE(v_PeopleFullName, FullName, @SearchText) AS KEY_VIEW ON p.PersonID = KEY_VIEW.[KEY]
WHERE ISNULL(KEY_TBL.[Rank], 0) + ISNULL(KEY_VIEW.[Rank], 0) > 0
ORDER BY [Rank] DESC
Some notes on this:
I'm using CONTAINSTABLE rather than FREETEXTTABLE as it seems more appropriate to me for searching names. I'm not interested in finding words with similar meaning or inflections of words when it's names that I'm searching on.
Because I'm using CONTAINSTABLE I have to do some pre-processing on the @SearchText variable to make it compatible, and to break it down into individual words joined with the OR operator for searching on the base table's full-text index.
Rather than using a UNION query to join separate queries each using a single, joined CONTAINSTABLE I'm joining on both CONTAINSTABLE predicates in the same query. This means using outer joins rather than inner joins, so I'm then using a WHERE clause to exclude any records from the base table which don't match on either full-text index. I confess that I haven't made any examination of how this performs compared to separate queries each with a single full-text index predicate UNIONised to produce a single result set.
Although there's no guarantee that the Rank of matches on the full search text in the indexed view will be higher than that of matches on individual words in the full-text index on the base table's columns because the Rank value is arbitrary, my testing so far has shown that in practice it always is (so far!).
