Grouping results to get unique rows after multiple joins - sql-server

Disclaimer: I don't have full control over the DB schema, so please don't judge the data structure or the naming conventions :)
I am running this large query with multiple joins:
SELECT TOP 30
iss.iss_lKey as IssueId,
iss.iss_sName as IssueName,
con.con_lKey as ContainerId,
con.con_sName as ContainerName,
sto.sto_lKey as StoryId,
sto.sto_sName as StoryName,
sto.sto_Guid as StoryGuid,
sto.sto_sByline as Byline,
sto.sto_created_dWhen as StoryCreatedDate,
sto.sto_deadline_dWhen as StoryDeadline,
sto.sto_lType as StoryType,
sto.sto_sct_lKey as StoryCategory,
sto.sto_created_use_lKey as CreatedBy,
sfv.sfv_tValue as FieldValue,
sf.sfe_lKey as StoryFieldId,
sf.sfe_sCaption as StoryFieldCaption,
sre.sre_lIndex as RevisionIndex
FROM tStory30 sto
JOIN tContainer30 con ON sto.sto_con_lKey = con.con_lKey
JOIN tIssue30 iss ON con.con_iss_lKey = iss.iss_lKey
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey
WHERE sre.sre_lIndex = 0
AND (sto.sto_sName LIKE '%' + @0 + '%'
OR sfv.sfv_tValue LIKE '%' + @0 + '%')
What I really need is only one row per StoryId, including the FieldValue that matched, if there was one. I am currently grouping in code to produce the output, but that prevents me from paging the results.
from r in items
group r by new { r.StoryId, r.ContainerId, r.IssueId }
into storyGroup
select new {
storyGroup.Key.StoryId,
storyGroup.Key.ContainerId,
storyGroup.Key.IssueId,
Hits = storyGroup.ToList()
}
Is there any way to achieve this kind of grouping in sql, so that I could then page the result properly (using ROW_NUMBER() OVER)?
Also, I am aware that this is bad practice and that I should use full-text search. We plan to set up a Solr instance or use the full-text options in SQL Server. This is a first attempt to get something going.
EDIT
Trying to explain verbally what I am trying to achieve:
For context, our app is a CMS for magazine editors/publishers.
For a given magazine they have many issues.
Each issue has many containers (a sort of logical article group).
In each container you have several stories.
A story can have 0 or many revisions.
The fields of a story are stored by revision (many fields per revision),
and a field has a field value.
I need to retrieve the stories that have a given text in the name or in a field value of the first revision (that's the WHERE RevisionIndex = 0).
But I also need to retrieve associated data for each story (issue id and name, container id and name, and so on).
The difficult part is probably retrieving one of the field values that matched the search. I don't need all of them, just one.
Hope this helps!
EDIT: Sample data when searching for "test". I simplified the columns to make it easier to understand.
Row | IssueId | IssueName | ContainerId | StoryId | FieldValue
1 | 11 | IssueName A | 394 | 868 | Test Marsupilami bla bla youpi
2 | 40 | IssueName B | 6 | 631 | story save test
3 | 40 | IssueName B | 6 | 666 | test story
4 | 4 | IssueName c | 30 | 846 | test abs
5 | 4 | IssueName c | 30 | 846 | absc test
6 | 4 | IssueName c | 30 | 846 | hello test
I am able to get the row number in SQL Server on my query, but here, as you can see, I get the same story multiple times. In this case, I could simply have the following result:
Row | IssueId | IssueName | ContainerId | StoryId | FieldValue
1 | 11 | IssueName A | 394 | 868 | Test Marsupilami bla bla youpi
2 | 40 | IssueName B | 6 | 631 | story save test
3 | 4 | IssueName c | 30 | 846 | test abs
If a story has "test" in the story name, then I am OK with a null value in the FieldValue column. Which field value is selected doesn't matter much.
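A rough sketch of the "one row per story, then page" idea, run against the simplified sample rows. It uses Python's sqlite3 module as a stand-in for SQL Server (this assumes the bundled SQLite is 3.25 or newer, which is when window functions arrived); the table name hits and the helper page are made up for illustration. The first ROW_NUMBER() keeps a single arbitrary hit per StoryId, the second numbers the surviving rows so they can be paged.

```python
import sqlite3

# Simplified rows from the sample (the raw, duplicated join output).
rows = [
    (11, "IssueName A", 394, 868, "Test Marsupilami bla bla youpi"),
    (40, "IssueName B", 6, 631, "story save test"),
    (40, "IssueName B", 6, 666, "test story"),
    (4, "IssueName c", 30, 846, "test abs"),
    (4, "IssueName c", 30, 846, "absc test"),
    (4, "IssueName c", 30, 846, "hello test"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (IssueId INT, IssueName TEXT, ContainerId INT, "
    "StoryId INT, FieldValue TEXT)"
)
conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?)", rows)

PAGED = """
WITH ranked AS (
    -- rn = 1 marks one hit per story (here: the alphabetically first value)
    SELECT h.*, ROW_NUMBER() OVER (PARTITION BY StoryId ORDER BY FieldValue) AS rn
    FROM hits h
),
numbered AS (
    -- number the de-duplicated stories so a page is just a RowNum range
    SELECT StoryId, FieldValue, ROW_NUMBER() OVER (ORDER BY StoryId) AS RowNum
    FROM ranked
    WHERE rn = 1
)
SELECT RowNum, StoryId, FieldValue
FROM numbered
WHERE RowNum BETWEEN ? AND ?
ORDER BY RowNum
"""

def page(first, last):
    """Return the de-duplicated rows numbered first..last (inclusive)."""
    return conn.execute(PAGED, (first, last)).fetchall()

print(page(1, 2))
```

The same two-step shape (PARTITION BY to pick one row, then a second ROW_NUMBER() for paging) should carry over to T-SQL, but the exact ordering columns are a choice you would have to make for your data.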

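This is a digression, but are you aware that you have converted a left join to an inner join? The WHERE clause filters on sre.sre_lIndex, which is NULL for any story without a revision, so those rows are discarded: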
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey
WHERE sre.sre_lIndex = 0
try this instead
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey
AND sre.sre_lIndex = 0
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey
(I would have done this in a comment, but it is easier to see the code change here.)
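To see the difference between filtering in WHERE and filtering in ON, here is a tiny sketch using Python's sqlite3 as a stand-in for SQL Server. The two tables (story, revision) and their columns are invented for the demo; only the join shape mirrors the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE story (id INT, name TEXT);
CREATE TABLE revision (story_id INT, idx INT);
INSERT INTO story VALUES (1, 'has revision 0'), (2, 'no revisions');
INSERT INTO revision VALUES (1, 0), (1, 1);
""")

# Filter in WHERE: rows where r.idx is NULL are dropped, so story 2 vanishes
# and the LEFT JOIN behaves like an INNER JOIN.
where_filter = conn.execute("""
    SELECT s.id FROM story s
    LEFT JOIN revision r ON r.story_id = s.id
    WHERE r.idx = 0
    ORDER BY s.id
""").fetchall()

# Filter in ON: the condition only limits which revisions are joined,
# so story 2 is kept with NULLs on the revision side.
on_filter = conn.execute("""
    SELECT s.id FROM story s
    LEFT JOIN revision r ON r.story_id = s.id AND r.idx = 0
    ORDER BY s.id
""").fetchall()

print(where_filter, on_filter)
```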

Related

Adding multiple records from a string

I have a string of email addresses. For example, "a@a.com; b@a.com; c@a.com"
My database is:
record | flag1 | flag2 | emailaddress
--------------------------------------------------------
1 | 0 | 0 | a@a.com
2 | 0 | 0 | b@a.com
3 | 0 | 0 | c@a.com
What I need to do is parse the string, and if the address is not in the database, add it.
Then, return a string of just the record numbers that correspond to the email addresses.
So, if the call is made with "A@a.com; c@a.com; d@a.com", the routine would add "d@a.com", then return "1, 3, 4", corresponding to the records that match the email addresses.
What I am doing now is calling the database once per email address to look it up and confirm it exists (adding it if it doesn't), then looping through them again from my PowerShell app, one by one, to collect the record numbers.
There has to be a way to just pass all of the addresses to SQL at the same time, right?
I have it working in PowerShell, but slowly.
I'd love a response from SQL as shown above: just the record number for each email address, in a single response. That is, "1,2,4" etc.
My powershell code is:
$EmailList2 = $EmailList.split(";")
# Let's get the ID # for each email address.
foreach($x in $EmailList2)
{
$data = exec-query "select Record from emailaddresses where emailAddress = @email" -parameter @{email=$x.trim()} -conn $connection
if ($($data.Tables.record) -gt 0)
{
$ResponseNumbers = $ResponseNumbers + "$($data.Tables.record), "
}
}
$ResponseNumbers = $($ResponseNumbers+"XX").replace(", XX","")
return $ResponseNumbers
You'd have to do this in 2 steps. Firstly INSERT the new values and then use a SELECT to get the values back. This answer uses delimitedsplit8k (not delimitedsplit8k_LEAD) as you're still using SQL Server 2008. On the note of 2008 I strongly suggest looking at upgrade paths soon as you have about 6 weeks of support left.
You can use the function to split the values and then INSERT/SELECT appropriately:
DECLARE @Emails varchar(8000) = 'a@a.com;b@a.com;c@a.com';
-- Step 1: add only the addresses that are not in the table yet.
WITH Emails AS(
SELECT DS.Item AS Email
FROM dbo.DelimitedSplit8K(@Emails,';') DS)
INSERT INTO dbo.YourTable (emailaddress) --I don't know what the other columns value should be, so have excluded
SELECT E.Email
FROM Emails E
LEFT JOIN dbo.YourTable YT ON YT.emailaddress = E.Email
WHERE YT.emailaddress IS NULL;
-- Step 2: return the record number of every address in the list.
SELECT YT.record
FROM dbo.YourTable YT
JOIN dbo.DelimitedSplit8K(@Emails,';') DS ON DS.Item = YT.emailaddress;
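The two-step idea (insert whatever is missing, then select the record numbers) can be tried out quickly with Python's sqlite3. This is only a sketch under assumptions: SQLite stands in for SQL Server, the NOCASE collation imitates a case-insensitive SQL Server collation, and record_numbers is a made-up helper name. The table layout follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emailaddresses (
        record INTEGER PRIMARY KEY,
        flag1 INT DEFAULT 0,
        flag2 INT DEFAULT 0,
        emailaddress TEXT COLLATE NOCASE UNIQUE  -- assumption: case-insensitive match
    )
""")
conn.executemany(
    "INSERT INTO emailaddresses (emailaddress) VALUES (?)",
    [("a@a.com",), ("b@a.com",), ("c@a.com",)],
)

def record_numbers(email_list):
    """Add unknown addresses, then return the matching record numbers as text."""
    emails = [e.strip() for e in email_list.split(";") if e.strip()]
    if not emails:
        return ""
    # Step 1: insert only the addresses that are not there yet
    # (the UNIQUE constraint makes INSERT OR IGNORE skip existing ones).
    conn.executemany(
        "INSERT OR IGNORE INTO emailaddresses (emailaddress) VALUES (?)",
        [(e,) for e in emails],
    )
    # Step 2: fetch the record numbers for the whole list in one query.
    marks = ", ".join("?" for _ in emails)
    rows = conn.execute(
        f"SELECT record FROM emailaddresses "
        f"WHERE emailaddress IN ({marks}) ORDER BY record",
        emails,
    ).fetchall()
    return ", ".join(str(r[0]) for r in rows)

print(record_numbers("A@a.com; c@a.com; d@a.com"))
```

On SQL Server the split would happen server-side (as in the answer above), so the whole list travels in a single call.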

SQL Query Parent Child Full Path from table

I have a table listing the parent child relationship for each element like this:
ParentID ParentTitle ChildId ChildTitle
----------------------------------------------
843 Documents 38737 Jobs
843 Documents 52537 Tools
843 Documents 5763 SecondOps
843 Documents 4651 Materials
38737 Jobs 16619 Job001
38737 Jobs 16620 Job002
38737 Jobs 16621 Job003
38737 Jobs 16622 Job004
38737 Jobs 16623 Job005
52537 Tools 1952 HandTools
52537 Tools 1953 Automated
52537 Tools 1957 Custom
1952 HandTools 12 Cordless10mm
1952 HandTools 13 Cordless8mm
1952 HandTools 14 CableCrimp
1952 HandTools 15 Cutter
1952 HandTools 16 EdgePlane
5763 SecondOps 101 Procedure001
5763 SecondOps 102 Procedure002
5763 SecondOps 103 Procedure003
4651 Materials 33576 Raw
4651 Materials 33577 Mixed
4651 Materials 33578 Hybrid
4651 Materials 33579 Custom
16622 Job004 101 Procedure001
16622 Job004 14 CableCrimp
16622 Job004 15 Cutter
16622 Job004 4651 Mixed
16623 Job005 102 Procedure002
16623 Job005 103 Procedure003
16623 Job005 16619 Job001
16623 Job005 1953 Automated
16623 Job005 33579 Custom
16623 Job005 33576 Raw
I would like to get the full path of each Combination using the IDs, for example
Documents\Jobs\Job003 = 843\38737\16621
Another example would be "Procedure001" which is listed in 2 places
Documents\SecondOps\Procedure001 = 843\5763\101
The same document is also referenced here:
Documents\Jobs\Job004\Procedure001 = 843\38737\16622\101
I'd like to take this table and build a TreeView in .NET. So having the full path for each item would make it a cake walk.
Otherwise, I was thinking that I could start at the Root page and keep recursing through the parents, building a child list, then recursing those, etc.
Is there a better way to query this to build those paths? This list has 400,000 records so if there is a more efficient way it would save time
This was all originally in an AS400 system DB until 2000ish then made into a MediaWiki site. I am pulling the data via the api with the intent of building an interface for a SQL Server database.
I can do basic SQL queries, joins, unions, etc.
Let me know what other info I can provide if this isn't clear
You could use INNER JOIN and LEFT JOIN if you are using MS SQL Server. The query would look like this, and it should give you the full combined path you asked for:
SELECT A.ParentTitle + '\'+B.ParentTitle+
CASE WHEN C.ParentTitle IS NOT NULL THEN '\' +C.ParentTitle
ELSE ''
END
+
' =' + A.ParentID + '\'+B.ParentID+
CASE WHEN C.ParentID IS NOT NULL THEN '\' +C.ParentID
ELSE ''
END
FROM TABLE AS A
INNER JOIN TABLE AS B
ON B.ParentID = A.ChildId
LEFT JOIN TABLE AS C
ON C.ParentID = B.ChildId
Not 100% sure whether it will work as I expected or not, please give it a try xD
A tree structure calls for recursion if you want a generic solution.
Please don't try this in SQL. Just load the data rows from SQL into a list (or something similar) and populate the tree recursively in a programming language.
Your tree class will look like this:
public class MyObj {
public int Id {get; set;}
public string Title {get; set;}
public List<MyObj> Children {get; set; } = null;
}
0. Your table is pretty wrong. The correct way would be:
CREATE TABLE Jobs(
Id int not null primary key,
Title nvarchar(255) not null,
StartTime datetime,--optional maybe will help
ParentId int null --can be null root will have no parent
)
But I will try to explain how it's done using your table.
I will assume that you have some kind of data context (DBML, EDMX, etc.).
1. Find the root or roots. In your case the roots are the IDs that appear in ParentID but never in ChildId.
Query that will list your roots:
SELECT DISTINCT a.ParentId FROM
YourTable a LEFT JOIN
YourTable b ON a.ParentId=b.ChildId
WHERE b.ParentId is null
2. Write a recursive method that retrieves your data into the class structure above (MyObj).
// YourDataContext is a placeholder for your own DBML/EDMX context type.
MyObj GetTree(int id, YourDataContext db) {
    if (db.YourTable.Any(r => r.ParentId == id)) {
        // All link rows whose parent is the requested node.
        var q = db.YourTable.Where(r => r.ParentId == id).ToList();
        var result = new MyObj {
            Id = q[0].ParentId,
            Title = q[0].ParentTitle,
            Children = new List<MyObj>()
        };
        foreach (var el in q) {
            if (db.YourTable.Any(r => r.ParentId == el.ChildId))
                // The child has children of its own: recurse.
                result.Children.Add(GetTree(el.ChildId, db));
            else
                // Leaf node: no further rows to follow.
                result.Children.Add(new MyObj {
                    Id = el.ChildId,
                    Title = el.ChildTitle,
                    Children = null
                });
        }
        return result; // return only after all children were added
    }
    return null;
}
3. Build the trees. With the root IDs from point 1 stored in a list, let's say ListIds, you would do something like this:
List<MyObj> finaltrees = new List<MyObj>();
ListIds.ForEach(id => finaltrees.Add(GetTree(id, dbcontext)));
Now you have a tree structure in finaltrees.
PS:
I wrote the code directly in the browser (C#), so there may be some typos.
So to elaborate on what I am trying to do, I'm working with a wiki version that doesn't use namespaces to establish document paths.
For example if a page is 3 levels deep on a document tree like this
RootPage
  Page01
    Page02
      Page03
      Page04
Using the Namespace approach Page03's Name(Path) is "RootPage:Page01:Page02:Page03"
I would Like to do the same thing with the PageIDs
So given this example you would have
PageTitle PageId Path
RootPage 001 001
Page01 101 001:101
Page02 201 001:101:201
Page03 301 001:101:201:301
Page04 302 001:101:201:302
So now all I have to do is put the page path together.
There are several challenges to consider with this wiki:
No two documents can have the same TITLE.
Document IDs are basically irrelevant, but handy in this case (at least in the version I am working on).
Thankfully there is a list of pages and their "Links" or child pages. I believe you would call it a MANY to MANY.
The key point to remember is that even if a page is listed as a child of many other pages, only one really exists and I only need one of them in the results.
So Using LONG's example here is where I've gotten to
Using this Table:
CREATE Table [dbo].[ExampleTable](
[RecordID] Int IDENTITY (1, 1) Not NULL,
[ParentID] Int Not NULL,
[ParentTitle] VARCHAR(800) NULL,
[ChildID] Int Not NULL,
[ChildTitle] VARCHAR(800) NULL,
PRIMARY KEY CLUSTERED ([RecordID] ASC),);
This Data:
INSERT INTO [dbo].[ExampleTable]
([ParentID]
,[ParentTitle]
,[ChildID]
,[ChildTitle])
VALUES
(843,'Documents',38737,'Jobs'),
(843,'Documents',52537,'Tools'),
(843,'Documents',5763,'SecondOps'),
(843,'Documents',4651,'Materials'),
(38737,'Jobs',16619,'Job001'),
(38737,'Jobs',16620,'Job002'),
(38737,'Jobs',16621,'Job003'),
(38737,'Jobs',16622,'Job004'),
(38737,'Jobs',16623,'Job005'),
(52537,'Tools',1952,'HandTools'),
(52537,'Tools',1953,'Automated'),
(52537,'Tools',1957,'Custom'),
(1952,'HandTools',12,'Cordless10mm'),
(1952,'HandTools',13,'Cordless8mm'),
(1952,'HandTools',14,'CableCrimp'),
(1952,'HandTools',15,'Cutter'),
(1952,'HandTools',16,'EdgePlane'),
(5763,'SecondOps',101,'Procedure001'),
(5763,'SecondOps',102,'Procedure002'),
(5763,'SecondOps',103,'Procedure003'),
(4651,'Materials',33576,'Raw'),
(4651,'Materials',33577,'Mixed'),
(4651,'Materials',33578,'Hybrid'),
(4651,'Materials',33579,'Custom'),
(16622,'Job004',101,'Procedure001'),
(16622,'Job004',14,'CableCrimp'),
(16622,'Job004',15,'Cutter'),
(16622,'Job004',4651,'Mixed'),
(16623,'Job005',102,'Procedure002'),
(16623,'Job005',103,'Procedure003'),
(16623,'Job005',16619,'Job001'),
(16623,'Job005',1953,'Automated'),
(16623,'Job005',33579,'Custom'),
(16623,'Job005',33576,'Raw')
GO
And This Query, Which I modified from LONG's example:
SELECT DISTINCT C.ChildTitle as PageTitle, convert(varchar(20),A.ParentID) + ':' + convert(varchar(20),B.ParentID) +
CASE WHEN C.ParentID IS NOT NULL THEN ':' + convert(varchar(20),C.ParentID)
ELSE ''
END
+
CASE WHEN C.ChildID IS NOT NULL THEN ':' + convert(varchar(20),C.ChildID)
ELSE ''
END
FROM ExampleTable AS A
INNER JOIN ExampleTable AS B
ON B.ParentID = A.ChildId
LEFT JOIN ExampleTable AS C
ON C.ParentID = B.ChildId
ORDER By PageTitle
I get These Results:
PageTitle UnNamed
NULL 16622:4651
NULL 38737:16622
NULL 38737:16623
NULL 52537:1952
NULL 843:38737
NULL 843:4651
NULL 843:52537
NULL 843:5763
Automated 843:38737:16623:1953
CableCrimp 843:38737:16622:14
CableCrimp 843:52537:1952:14
Cordless10mm 843:52537:1952:12
Cordless8mm 843:52537:1952:13
Custom 38737:16622:4651:33579
Custom 843:38737:16623:33579
Cutter 843:38737:16622:15
Cutter 843:52537:1952:15
EdgePlane 843:52537:1952:16
Hybrid 38737:16622:4651:33578
Job001 843:38737:16623:16619
Mixed 38737:16622:4651:33577
Mixed 843:38737:16622:4651
Procedure001 843:38737:16622:101
Procedure002 843:38737:16623:102
Procedure003 843:38737:16623:103
Raw 38737:16622:4651:33576
Raw 843:38737:16623:33576
What I'd like to get is a SINGLE occurrence of each page, regardless of which parent it happens to be found under.
Then I can use these paths to turn the virtual tree structure into an actual tree structure.
The last issue is that the actual link list is VERY similar to the example I created, except that it has 400,000 records.
When I run this query against the actual "Link List" it runs for about 17 minutes and then runs out of memory.
I've been researching the MAXRECURSION option, but I am still working on it and don't know whether that is the problem or not.
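One way to get a single path per page without self-joining the table per level is a breadth-first walk over the link list in application code. The sketch below is Python rather than SQL or C#, uses only a subset of the example links, and first_paths is a made-up name. Every page is visited once, so the work grows with the number of links, and the first (shortest) path found for a page wins.

```python
from collections import defaultdict, deque

# (parent_id, child_id) pairs: a subset of the example link table.
links = [
    (843, 38737), (843, 52537), (843, 5763), (843, 4651),
    (38737, 16621), (38737, 16622),
    (52537, 1952), (1952, 14),
    (5763, 101),
    (16622, 101), (16622, 14), (16622, 4651),
    (4651, 33577),
]

def first_paths(links, sep=":"):
    """Map each page id to the first path found from a root, breadth-first."""
    children = defaultdict(list)
    child_ids = set()
    for parent, child in links:
        children[parent].append(child)
        child_ids.add(child)

    # Roots are ids that appear as a parent but never as a child.
    roots = [p for p in children if p not in child_ids]

    paths = {root: str(root) for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in paths:  # keep only the first occurrence of a page
                paths[child] = paths[node] + sep + str(child)
                queue.append(child)
    return paths

paths = first_paths(links)
print(paths[16621])
```

Because a page is skipped once it has a path, a link that loops back to an ancestor cannot cause endless recursion here.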

ibm 1 sql creating - need to add 2 tables

I have the following view, which is working, but I am not sure how to add 2 more tables to the join.
The first table is adres1: IDENT# and IDSFX# join to the columns adent# and adsfx# in prodta.adres1, and from there I need the column ads15.
Then I also need the ship-to row in this same adres1 table. We first get it from the order table, prodta.oeord1, in column odgrc#. This grc# is 11 positions long and combines the 8-position ent and the 3-position suf. These two identify the ship-to record; looking it up in the same table adres1 (we do have many logical views on it if that's easier, like adres15) gives the column ADSTTC for the ship-to state.
I am not sure whether these 2 new parts can be included in the current view code below. Please ask if something is not clear; it's an old system and was developed in a somewhat convoluted way.
CREATE VIEW Prolib.SHPWEIGHTP AS SELECT
T01.IDORD#,
T01.IDDOCD,
T01.IDPRT#,
t01.idsfx#,
T01.IDSHP#,
T01.IDNTU$,
T01.IDENT#,
(T01.IDNTU$ * T01.IDSHP#) AS LINTOT,
T02.IAPTWT,
T02.IARCC3,
T02.IAPRLC,
T03.PHVIAC,
T03.PHORD#,
PHSFX#,
T01.IDORDT,
T01.IDHCD3
FROM PRODTA.OEINDLID T01
INNER JOIN PRODTA.ICPRTMIA T02 ON T01.IDPRT# = T02.IAPRT#
INNER JOIN
(SELECT DISTINCT
PHORD#,
PHSFX#,
PHVIAC,
PHWGHT
FROM proccdta.pshippf) AS T03 ON t01.idord# = T03.phord#
WHERE T01.IDHCD3 IN ('MDL','TRP')
I'm not exactly clear on what you're asking, and it looks like some of the column-names are missing from your description, but this should get you pretty close:
CREATE VIEW Prolib.SHPWEIGHTP AS
SELECT T01.IDORD#,
T01.IDDOCD,
T01.IDPRT#,
t01.idsfx#,
T01.IDSHP#,
T01.IDNTU$,
T01.IDENT#,
( T01.IDNTU$ * T01.IDSHP# ) AS LINTOT ,
T02.IAPTWT,
T02.IARCC3,
T02.IAPRLC,
T03.PHVIAC,
T03.PHORD#,PHSFX#,
T01.IDORDT,
T01.IDHCD3,
t04.ads15
FROM PRODTA.OEINDLID T01
INNER JOIN PRODTA.ICPRTMIA T02
ON T01.IDPRT# = T02.IAPRT#
INNER JOIN (SELECT DISTINCT
PHORD#,
PHSFX#,
PHVIAC,
PHWGHT
FROM proccdta.pshippf) AS T03
ON t01.idord# = T03.phord#
JOIN prodta.adres1 as t04
on t04.adent# = T01.IDENT#
and t04.adsfx# = t01.idsfx#
JOIN prodta.oeord1 t05
on t05.odgrc# = T01.IDENT# || T01.SUFFIX
WHERE T01.IDHCD3 IN ('MDL','TRP')
Let me know if you need more details.
HTH !

Modeling a non-primary Key relationship

I am trying to model the following relationship with the intent of designing classes for EF code first.
Program table:
ProgramID - PK
ProgramName
ClusterCode
Sample data
ProgramID ProgramName ClusterCode
--------------------------------------
1 Spring A
2 Fall A
3 Winter B
4 Summer B
Cluster table:
ID
ClusterCode
ClusterDetails
Sample data:
ID ClusterCode ClusterDetails
---------------------------------
1 A 10
2 A 20
3 A 30
4 B 20
5 B 40
I need to join the Program table to the Cluster table so I can get the list of cluster details for each program.
The SQL would be
Select
from Programs P
Join Cluster C On P.ClusterCode = C.ClusterCode
Where P.ProgramID = 'xxx'
Note that for the Program table, ClusterCode is not unique.
For Cluster table, neither ClusterCode nor ClusterDetail is unique.
How would I model this so I can take advantage of navigation properties and code-first?
Assuming you have mapped the two tables above, made an association between them, and are using C#, you can use a simple join on ClusterCode:
List<string> clustedDets = new List<string>();
var q =
from p in ClusterTable
join c in Program on p.ClusterCode equals c.ClusterCode
select new { p.ClusterDetails };
foreach (var v in q)
{
clustedDets.Add(v.ClusterDetails);
}

How to separate content from my query into 3 distinct columns

I have got the following query, which puts the data that I need (text in the 245, 260 and 300 tags of my table called "bib") into a single column (called "Title"). Instead I would like to have this data separated into three distinct columns, one for each tag. Any suggestions would be very welcome
SELECT DISTINCT isbnEX_inverted.isbn, ISNULL(bib.text, CONVERT(varchar(255), bib_longtext.longtext)) AS Title, top_circ_summary.ranking, bib.tag, bib.bib#
FROM bib INNER JOIN
item ON bib.bib# = item.bib# INNER JOIN
isbnEX_inverted ON bib.bib# = isbnEX_inverted.bib# INNER JOIN
top_circ_summary ON item.bib# = top_circ_summary.bib# LEFT OUTER JOIN
bib_longtext ON bib.bib# = bib_longtext.bib# AND bib.tag = bib_longtext.tag
WHERE (isbnEX_inverted.isbn LIKE '%978__________%') AND (top_circ_summary.collection_group = 'jfic') AND (bib.tag in ('245', '520', '300'))
order by top_circ_summary.ranking
Here's a sample of the output (originally tab separated, shown here with | between columns):
isbn | Title | ranking | tag | bib#
9780143307334 | a217 p. :bill. ;c21 cm. | 1 | 300 | 962366
9780143307334 | aDiary of a wimpy kid :bthe third wheel /cby Jeff Kinney. | 1 | 245 | 962366
9780143307334 | aTrying to find a partner for the Valentine's Day dance, Greg finds solace in the fact that his best friend Rowley also doesn't have a date, but an unexpected twist might turn his night around. | 1 | 520 | 962366
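One common way to turn "one row per tag" into "one column per tag" is conditional aggregation: group by the record and pick each tag's text with MAX(CASE ...). The sketch below uses Python's sqlite3 as a stand-in, with a cut-down bib table (the column bib_no replaces bib#, and the output column names tag245, tag300 and tag520 are made up); the joins to the other tables are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bib (bib_no INT, tag TEXT, text TEXT)")
conn.executemany("INSERT INTO bib VALUES (?, ?, ?)", [
    (962366, "300", "a217 p. :bill. ;c21 cm."),
    (962366, "245", "aDiary of a wimpy kid :bthe third wheel /cby Jeff Kinney."),
    # Summary text shortened for the demo.
    (962366, "520", "aTrying to find a partner for the Valentine's Day dance"),
])

# Each CASE yields the text only for its own tag (NULL otherwise);
# MAX() then collapses the group to that single non-NULL value.
pivoted = conn.execute("""
    SELECT bib_no,
           MAX(CASE WHEN tag = '245' THEN text END) AS tag245,
           MAX(CASE WHEN tag = '300' THEN text END) AS tag300,
           MAX(CASE WHEN tag = '520' THEN text END) AS tag520
    FROM bib
    WHERE tag IN ('245', '300', '520')
    GROUP BY bib_no
""").fetchall()

print(pivoted)
```

The CASE/GROUP BY pattern itself is plain SQL, so it should translate to the original query once isbn and ranking are added to the GROUP BY.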

Resources