I have a table listing the parent child relationship for each element like this:
ParentID ParentTitle ChildId ChildTitle
----------------------------------------------
843 Documents 38737 Jobs
843 Documents 52537 Tools
843 Documents 5763 SecondOps
843 Documents 4651 Materials
38737 Jobs 16619 Job001
38737 Jobs 16620 Job002
38737 Jobs 16621 Job003
38737 Jobs 16622 Job004
38737 Jobs 16623 Job005
52537 Tools 1952 HandTools
52537 Tools 1953 Automated
52537 Tools 1957 Custom
1952 HandTools 12 Cordless10mm
1952 HandTools 13 Cordless8mm
1952 HandTools 14 CableCrimp
1952 HandTools 15 Cutter
1952 HandTools 16 EdgePlane
5763 SecondOps 101 Procedure001
5763 SecondOps 102 Procedure002
5763 SecondOps 103 Procedure003
4651 Materials 33576 Raw
4651 Materials 33577 Mixed
4651 Materials 33578 Hybrid
4651 Materials 33579 Custom
16622 Job004 101 Procedure001
16622 Job004 14 CableCrimp
16622 Job004 15 Cutter
16622 Job004 4651 Mixed
16623 Job005 102 Procedure002
16623 Job005 103 Procedure003
16623 Job005 16619 Job001
16623 Job005 1953 Automated
16623 Job005 33579 Custom
16623 Job005 33576 Raw
I would like to get the full path of each Combination using the IDs, for example
Documents\Jobs\Job003 = 843\38737\16621
Another example would be "Procedure001" which is listed in 2 places
Documents\SecondOps\Procedure001 = 843\5763\101
The same document is also referenced here:
Documents\Jobs\Job004\Procedure001 = 843\38737\16622\101
I'd like to take this table and build a TreeView in .NET. So having the full path for each item would make it a cake walk.
Otherwise, I was thinking that I could start at the Root page and keep recursing through the parents, building a child list, then recursing those, etc.
Is there a better way to query this to build those paths? This list has 400,000 records so if there is a more efficient way it would save time
This was all originally in an AS400 system DB until 2000ish then made into a MediaWiki site. I am pulling the data via the api with the intent of building an interface for a SQL Server database.
I can do basic SQL queries, joins, unions, etc.
Let me know what other info I can provide if this isn't clear
You could use INNER JOIN and LEFT JOIN if you are using SQL SERVER MS, and here are how the query look like, which will give you the full result (combination) based on your requirement:
SELECT A.ParentTitle + '\'+B.ParentTitle+
CASE WHEN C.ParentTitle IS NOT NULL THEN '\' +C.ParentTitle
ELSE ''
END
+
' =' + A.ParentID + '\'+B.ParentID+
CASE WHEN C.ParentID IS NOT NULL THEN '\' +C.ParentID
ELSE ''
END
FROM TABLE AS A
INNER JOIN TABLE AS B
ON B.ParentID = A.ChildId
LEFT JOIN TABLE AS C
ON C.ParentID = B.ChildId
Not 100% sure whether it will work as I expected or not, please give it a try xD
A tree structure means Recursion for a generic solution.
Pls, don't try this in sql. Just take datarow from sql into a list or something like and make populate with recursion in a programming language.
Your tree class wil be like :
public class MyObj {
public int Id {get; set;}
public string Title {get; set;}
public List<MyObj> {get; set; } = null;
}
0.You table its pretty wrong. The corect way will be :
CREATE TABLE Jobs(
Id int not null primary key,
Title nvarchar(255) not null,
StartTime datetime,--optional maybe will help
ParentId int null --can be null root will have no parent
)
But I will try to explain on your table how it's done.
I will suppose that you have some kind datacontext (DBML,EDMX etc.)
Find root or roots. In your case root will those nr that are on ParentID and are not on the ChildId.
Query that will list your roots:
SELECT DISTINCT a.ParentId FROM
YourTable a LEFT JOIN
YourTable b ON a.ParentId=b.ChildId
WHERE b.ParentId is null
Make a recursive procedure that will retrive your data in a class structure as above(MyObj).
procedure MyObj GetTree(int id, db){
if (db.YourTable.Any(r => r.ParentId==Id)){
var q=db.YourTable.Select(r => r.ParentId==Id).ToList();
var result = new MyObj{
Id = q.ParentId,
Title = q.ParentTitle,
Children = new List<MyObj>()
}
foreach( var el in q) {
if (db.YourTable.Any(r => r.ParentId==el.ChildId))
result.Children.Add(GetTree(el.ChildId,db))
else
result.Children.Add( new MyObj{
Id = el.ChildId,
Title = el.ChildTitle,
Children = null
});
return result;
}
}
return null;
}
make trees with list Id from point 1 stored in a list let's say ListIds you will do something like that:
List finaltrees = new List()
Ids.ForEach(id => finaltrees.Add(GetTree(id,dbcontext));
Now you have a tree structure in finaltrees.
PS:
I wrote the code directly in browser (C#),there can be some typos error.
So to elaborate on what I am trying to do, I'm working with a wiki version that doesn't use namespaces to establish document paths.
For example if a page is 3 levels deep on a document tree like this
RootPage
Page01
Page02
Page03
Page04
Using the Namespace approach Page03's Name(Path) is "RootPage:Page01:Page02:Page03"
I would Like to do the same thing with the PageIDs
So given this example you would have
PageTitle PageId Path
RootPage 001 001
Page01 101 001:101
Page02 201 001:101:201
Page03 301 001:101:201:301
Page04 302 001:101:201:302
So now All I have to do is Put the PagePath together.
There are several challenges to Consider with this wiki
No 2 documents can have the same TITLE
Document IDs are basically
irrelevant, but handy in this case(at least in the version I am
working on)
Thankfully there is a list of Pages and their "Links" or
Child Pages. I believe you would call it a MANY to MANY
The Key Point to remember is even if a page is listed as a child of many other pages, Only one really exists and I only need one of them in the results.
So Using LONG's example here is where I've gotten to
Using this Table:
CREATE Table [dbo].[ExampleTable](
[RecordID] Int IDENTITY (1, 1) Not NULL,
[ParentID] Int Not NULL,
[ParentTitle] VARCHAR(800) NULL,
[ChildID] Int Not NULL,
[ChildTitle] VARCHAR(800) NULL,
PRIMARY KEY CLUSTERED ([RecordID] ASC),);
This Data:
INSERT INTO [dbo].[ExampleTable]
([ParentID]
,[ParentTitle]
,[ChildID]
,[ChildTitle])
VALUES
(843,'Documents',38737,'Jobs'),
(843,'Documents',52537,'Tools'),
(843,'Documents',5763,'SecondOps'),
(843,'Documents',4651,'Materials'),
(38737,'Jobs',16619,'Job001'),
(38737,'Jobs',16620,'Job002'),
(38737,'Jobs',16621,'Job003'),
(38737,'Jobs',16622,'Job004'),
(38737,'Jobs',16623,'Job005'),
(52537,'Tools',1952,'HandTools'),
(52537,'Tools',1953,'Automated'),
(52537,'Tools',1957,'Custom'),
(1952,'HandTools',12,'Cordless10mm'),
(1952,'HandTools',13,'Cordless8mm'),
(1952,'HandTools',14,'CableCrimp'),
(1952,'HandTools',15,'Cutter'),
(1952,'HandTools',16,'EdgePlane'),
(5763,'SecondOps',101,'Procedure001'),
(5763,'SecondOps',102,'Procedure002'),
(5763,'SecondOps',103,'Procedure003'),
(4651,'Materials',33576,'Raw'),
(4651,'Materials',33577,'Mixed'),
(4651,'Materials',33578,'Hybrid'),
(4651,'Materials',33579,'Custom'),
(16622,'Job004',101,'Procedure001'),
(16622,'Job004',14,'CableCrimp'),
(16622,'Job004',15,'Cutter'),
(16622,'Job004',4651,'Mixed'),
(16623,'Job005',102,'Procedure002'),
(16623,'Job005',103,'Procedure003'),
(16623,'Job005',16619,'Job001'),
(16623,'Job005',1953,'Automated'),
(16623,'Job005',33579,'Custom'),
(16623,'Job005',33576,'Raw')
GO
And This Query, Which I modified from LONG's example:
SELECT DISTINCT C.ChildTitle as PageTitle, convert(varchar(20),A.ParentID) + ':' + convert(varchar(20),B.ParentID) +
CASE WHEN C.ParentID IS NOT NULL THEN ':' + convert(varchar(20),C.ParentID)
ELSE ''
END
+
CASE WHEN C.ChildID IS NOT NULL THEN ':' + convert(varchar(20),C.ChildID)
ELSE ''
END
FROM ExampleTable AS A
INNER JOIN ExampleTable AS B
ON B.ParentID = A.ChildId
LEFT JOIN ExampleTable AS C
ON C.ParentID = B.ChildId
ORDER By PageTitle
I get These Results:
PageTitle UnNamed
NULL 16622:4651
NULL 38737:16622
NULL 38737:16623
NULL 52537:1952
NULL 843:38737
NULL 843:4651
NULL 843:52537
NULL 843:5763
Automated 843:38737:16623:1953
CableCrimp 843:38737:16622:14
CableCrimp 843:52537:1952:14
Cordless10mm 843:52537:1952:12
Cordless8mm 843:52537:1952:13
Custom 38737:16622:4651:33579
Custom 843:38737:16623:33579
Cutter 843:38737:16622:15
Cutter 843:52537:1952:15
EdgePlane 843:52537:1952:16
Hybrid 38737:16622:4651:33578
Job001 843:38737:16623:16619
Mixed 38737:16622:4651:33577
Mixed 843:38737:16622:4651
Procedure001 843:38737:16622:101
Procedure002 843:38737:16623:102
Procedure003 843:38737:16623:103
Raw 38737:16622:4651:33576
Raw 843:38737:16623:33576
What I'd like to get is a SINGLE occurance of each page, Regarless of which Parent it happens to be found
Then I can use these Paths to turn the Virtual Tree Structure into an actual Tree Structure.
The Last Issue is that the actual Link List is VERY similar to the example I created, except that it has 400,000 records.
When I run this query against the actual "Link List" it runs for about 17 minutes and runs out of memory.
I've been researching the MAXRECURSION option, but I am still working on it, don't know if that is problem or not.
So I am working with a hive table that is set up as so:
id (Int), mapper (String), mapperId (Int)
Basically a single Id can have multiple mapperIds, one per mapper such as an example below:
ID (1) mapper(MAP1) mapperId(123)
ID (1) mapper(MAP2) mapperId(1234)
ID (1) mapper(MAP3) mapperId(12345)
ID (2) mapper(MAP2) mapperId(10)
ID (2) mapper(MAP3) mapperId(12)
I want to return the list of mapperIds associated to each unique ID. So for the above example I would want the below returned as a single row.
1, 123, 1234, 12345
2, null, 10, 12
The mapper Strings are known, so I was thinking of doing a self join for every mapper string I am interested in, but I was wondering if there was a more optimal solution?
If the assumption that the mapper column is distinct with respect to a given ID is correct, you could collect the mapper column and the mapperid column to a Map using brickhouse collect. You can clone the repo from that link and build the jar with Maven.
Query:
add jar /complete/path/to/jar/brickhouse-0.7.0-SNAPSHOT.jar;
create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';
select id
,id_map['MAP1'] as mapper1
,id_map['MAP2'] as mapper2
,id_map['MAP3'] as mapper3
from (
select id
,collect(mapper, mapperid) as id_map
from some_table
group by id
) x
Output:
| id | mapper1 | mapper2 | mapper3 |
------------------------------------
1 123 1234 12345
2 10 12
I tried to solve one query from last 2 days but didn't.
It looks easy to understand but can't made.
There are two column in Table for example:
ResourceId || MappingId
1 2
1 3
2 2
2 4
3 2
3 4
4 2
4 5
5 2
5 4
This is one table which have two fields ResourceId and MappingId.
Now I want resourceId which have Mappingid {2,4}
Means answer must be ResourceId {2,3,5}
How can I get this answer in Linq Query?
Use Contains of collection. This method can be translated by Entity Framework into SQL IN operator:
int[] mappingIds = { 2, 4 };
var resources = from t in table
where mappingIds.Contains(t.MappingId)
select t.ResourceId;
Lambda syntax:
var resources = table.Where(t => mappingIds.Contains(t.MappingId))
.Select(t => t.ResourceId);
Generated SQL will look like:
SELECT [Extent1].[ResourceId]
FROM [dbo].[TableName] AS [Extent1]
WHERE [Extent1].[MappingId] IN (2,4)
UPDATE: If you want to get resources which have ALL provided mapping ids, then
var resources = from t in table
group t by t.ResourceId into g
where mappingIds.All(id => g.Any(t => t.Id == id))
select g.Key;
Entity Framework is able to translate this query into SQL, but it will not be that beautiful as query above.
IQueryable<int> resourceIds = table
// groups items by ResourceId
.GroupBy(item => item.ResourceId)
// consider only group where: an item has 2 as MappingId value
.Where(group => group.Select(item => item.MappingId).Contains(2))
// AND where: an item has 4 as MappingId value
.Where(group => group.Select(item => item.MappingId).Contains(4))
// select Key (= ResourceId) of filtered groups
.Select(group => group.Key);
disclaimer : I don't have full control over the db schema don't judge the data structure or the naming conventions :)
I am doing this large query with multiple joins :
SELECT TOP 30
iss.iss_lKey as IssueId,
iss.iss_sName as IssueName,
con.con_lKey as ContainerId,
con.con_sName as ContainerName,
sto.sto_lKey as StoryId,
sto.sto_sName as StoryName,
sto.sto_Guid as StoryGuid,
sto.sto_sByline as Byline,
sto.sto_created_dWhen as StoryCreatedDate,
sto.sto_deadline_dWhen as StoryDeadline,
sto.sto_lType as StoryType,
sto.sto_sct_lKey as StoryCategory,
sto.sto_created_use_lKey as CreatedBy,
sfv.sfv_tValue as FieldValue,
sf.sfe_lKey as StoryFieldId,
sf.sfe_sCaption as StoryFieldCaption,
sre.sre_lIndex as RevisionIndex
FROM tStory30 sto
JOIN tContainer30 con ON sto.sto_con_lKey = con.con_lKey
JOIN tIssue30 iss ON con.con_iss_lKey = iss.iss_lKey
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey
WHERE sre.sre_lIndex = 0
AND (sto.sto_sName LIKE '%' + #0 + '%'
OR sfv.sfv_tValue LIKE '%' + #0 + '%')";
What I need is really only one row by StoryId, that includes the FieldValue that matched if there was any. I am currently grouping in the code to produce the output, but that prevents me from paging the results.
from r in items
group r by new { r.StoryId, r.ContainerId, r.IssueId }
into storyGroup
select {
storyGroup.Key.StoryId,
storyGroup.Key.ContainerId,
storyGroup.Key.IssueId,
Hits = storyGroup.ToList()
}
Is there any way to achieve this kind of grouping in sql, so that I could then page the result properly (using ROW_NUMBER() OVER)?
Also, I am aware that this is bad practice and should use FullText search. it is planned to setup a solr instance, or use the fulltext options in sqlserver. This is a first attempt to get a smthg going.
EDIT
trying to explain verbally what I try to achieve :
For the context, our app is a cms for magazine editor/publisher.
for a given magazine they have many Issues
each issue has many Container (sort of logical article group)
in each container you have several stories
a story van have 0 or many revisions
the fields of a story are stored by revision (many field per revision)
and a field has a field value.
I need to retrieve the stories that have a given text in the name or in a field value of the first revision (that's the where revisionIndex = 0).
but I also need to retrieve associated data for each story. (issueId, name, containerId and name, and so one..)
the difficult one is probably to retrieve one of the fieldvalue that matched the search. I don't need all of them, just one...
hope this helps!
EDIT Sample data searching for "test". I simplified the columns to make it easier to understand.
Row | IssueId | IssueName | ContainerId | StoryId | FieldValue
1 | 11 IssueName A 394 868 Test Marsupilami bla bla youpi
2 | 40 IssueName B 6 631 story save test
3 | 40 IssueName B 6 666 test story
4 | 4 IssueName c 30 846 test abs
5 | 4 IssueName c 30 846 absc test
6 | 4 IssueName c 30 846 hello test
I am able to get the row number in sqlserver on my query, but here, as you see, I get amultiple times the same story. In this case, I could have simple the following result:
Row | IssueId | IssueName | ContainerId | StoryId | FieldValue
1 | 11 IssueName A 394 868 Test Marsupilami bla bla youpi
2 | 40 IssueName B 6 631 story save test
3 | 4 IssueName c 30 846 test abs
if a story would have test in the story name, then I am ok with a null value in the column FieldValue which field value is selected doesn't matter much.
This is a digression but are you aware that you have converted a left join to an inner join?
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey
WHERE sre.sre_lIndex = 0
try this instead
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey
AND sre.sre_lIndex = 0
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey
(I would have done this in a comment but it is easier to see the code change here.