Find the newest entry of a crosstable per record? - database

I have three tables:
1. My products with their IDs and their features.
2. A table with the treatments of my products, with a treatment ID, a method, and a date. The treatments are done in batches of many products, so there is
3. a crosstable with the product IDs, the treatment IDs, and a bool value for the success of the treatment.
Each product can undergo many different treatments so there is a many-to-many relation. I now want to add to the product table (1.) for every product a value that shows the method of its most recent successful treatment if there is any.
I made a query that groups the crosstable's entries by product ID, but I don't know how to show the method and date of its last treatment.
table 1:
| productID | size | weight | height | ... |
|-----------|:----:|-------:|--------|-----|
| 1 | 13 | 16 | 9 | ... |
| 2 | 12 | 17 | 12 | ... |
| 3 | 11 | 15 | 15 | ... |
| ... | ... | ... | ... | ... |
table 2:
| treatmentID | method | date |
|-------------|:--------:|-----------:|
| 1 | dye blue | 01.02.2016 |
| 2 | dye red | 01.02.2017 |
| 3 | dye blue | 01.02.2018 |
| ... | ... | ... |
table 3:
| productID | treatmentID | success |
|-----------|:-----------:|--------:|
| 1 | 1 | yes |
| 1 | 2 | yes |
| 1 | 3 | no |
| ... | ... | ... |
I need table 1 to be like:
table 1:
| productID | size | weight | height | latest successful method |
|-----------|:----:|-------:|--------|-------------------------|
| 1 | 13 | 16 | 9 | dye red |
| 2 | 12 | 17 | 12 | ... |
| 3 | 11 | 15 | 15 | ... |
| ... | ... | ... | ... | ... |
My query:
SELECT table3.productID, table2.method
FROM table2 INNER JOIN table3 ON table2.treatmentID = table3.treatmentID
GROUP BY table3.productID, table2.method
HAVING (((table3.productID)=Max([table2].[date])))
ORDER BY table3.productID DESC;
But this does NOT return only one entry (the most recent) per product; it returns all of them.

The simplest solution here is either to write a subquery within your SQL, or to create a new saved query that acts as a subquery (it will look like a table), to help identify (or eliminate) the records you want to see.
I'm using similar but slightly different source data, as you only gave one example.
Table1
| ProductID | Size | Weight | Height |
|-----------|------|--------|--------|
| 1 | 13 | 16 | 9 |
| 2 | 12 | 17 | 12 |
| 3 | 11 | 15 | 15 |
Table2
| TreatmentID | Method | Date |
|-------------|------------|----------|
| 1 | dye blue | 1/2/2016 |
| 2 | dye red | 1/2/2017 |
| 3 | dye blue | 1/2/2018 |
| 4 | dye yellow | 1/4/2017 |
| 5 | dye brown | 1/5/2018 |
Table3
| ProductID | TreatmentID | Success |
|-----------|-------------|---------|
| 1 | 1 | yes |
| 1 | 2 | yes |
| 1 | 3 | no |
| 2 | 4 | no |
| 2 | 5 | yes |
First order of business is to get the max(dates) and productIds of successful treatments.
We'll do this by aggregating the date along with the productIDs and "success".
SELECT Table3.productid, Max(Table2.Date) AS MaxOfdate, Table3.success
FROM Table2 INNER JOIN Table3 ON Table2.treatmentid = Table3.treatmentid
GROUP BY Table3.productid, Table3.success;
This should give us something along the lines of:
| ProductID | MaxofDate | Success |
|-----------|-----------|---------|
| 1 | 1/2/2018 | No |
| 1 | 1/2/2017 | Yes |
| 2 | 1/4/2017 | No |
| 2 | 1/5/2018 | Yes |
We'll save this query as a "regular" query. I named mine "max"; you should probably use something more descriptive. You'll see "max" in the next query.
Next we join Table1, Table2, and Table3 together, and we also use the saved "max" query: it links Table1 to Table2 by matching productID and by matching MaxOfDate to the treatment date, filtered to success = "yes", to find the details of the most recent SUCCESSFUL treatment.
SELECT table1.productid, table1.size, table1.weight, table1.height, Table2.method
FROM ((table1 INNER JOIN [max] ON table1.productid = max.productid)
INNER JOIN Table2 ON max.MaxOfdate = Table2.date) INNER JOIN Table3 ON
(Table2.treatmentid = Table3.treatmentid) AND (table1.productid = Table3.productid)
WHERE (((max.success)="yes"));
The design will look something like this: [screenshot of the query design view, not preserved]
(P.S. You can add queries in the query design editor by clicking the "Queries" tab when you are adding tables to your query design. They act just like tables; just be careful, as very detailed queries tend to bog down Access.)
Running this query should give us our final results.
| ProductID | Size | Weight | Height | Method |
|-----------|------|--------|--------|-----------|
| 1 | 13 | 16 | 9 | dye red |
| 2 | 12 | 17 | 12 | dye brown |
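For comparison, here is a minimal sketch of the same "latest successful treatment per product" lookup written as a single correlated subquery. It is not the Access query above: it runs against an in-memory SQLite database from Python, the date column is renamed `TreatDate` and stored as ISO text (both assumptions made for the example, reading the sample dates as day/month/year), and `LIMIT 1` is SQLite syntax (Access would need `TOP 1` instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ProductID INTEGER PRIMARY KEY, Size INT, Weight INT, Height INT);
CREATE TABLE Table2 (TreatmentID INTEGER PRIMARY KEY, Method TEXT, TreatDate TEXT);
CREATE TABLE Table3 (ProductID INT, TreatmentID INT, Success TEXT);
INSERT INTO Table1 VALUES (1,13,16,9),(2,12,17,12),(3,11,15,15);
-- ISO dates (assumed day/month/year in the sample) so text comparison sorts correctly
INSERT INTO Table2 VALUES
  (1,'dye blue','2016-02-01'),(2,'dye red','2017-02-01'),
  (3,'dye blue','2018-02-01'),(4,'dye yellow','2017-04-01'),
  (5,'dye brown','2018-05-01');
INSERT INTO Table3 VALUES (1,1,'yes'),(1,2,'yes'),(1,3,'no'),(2,4,'no'),(2,5,'yes');
""")

# For each product, pick the method of its newest successful treatment.
# Products without any successful treatment get NULL (None in Python).
LATEST_SUCCESS = """
SELECT p.ProductID, p.Size, p.Weight, p.Height,
       (SELECT t.Method
          FROM Table3 AS x
          JOIN Table2 AS t ON t.TreatmentID = x.TreatmentID
         WHERE x.ProductID = p.ProductID AND x.Success = 'yes'
         ORDER BY t.TreatDate DESC
         LIMIT 1) AS LatestSuccessfulMethod
  FROM Table1 AS p
 ORDER BY p.ProductID
"""
rows = conn.execute(LATEST_SUCCESS).fetchall()
for row in rows:
    print(row)
```

Unlike the inner-join version, this keeps product 3 in the output with an empty method, which matches the "if there is any" wording of the question.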

Related

How can I improve the response time of this query in Oracle

This query takes 24 seconds and returns 1891 results:
SELECT p.STATE, p.REFNUM, p.CODE, p.TYPE, i.STATE, pj.NAME, pj.DOCUMENT
FROM TABLE1 p
inner join TABLE2 i on i.REFNUM = p.REFNUM
inner join TABLE3 pj on pj.NUMBER = i.NUMBER and p.OFIC_ID = pj.OFIC_ID and p.PUB_ID = pj.PUB_ID
inner join OFICE o on o.OFIC_ID = p.OFIC_ID and o.PUB_ID = p.PUB_ID
inner join GROUP glad on glad.GROUP_CODE=p.GROUP_CODE
WHERE glad.GROUP_TYPE ='3' AND i.STATE = '1'
AND p.PUB_ID IN ('05','11','12','09','08','13','04','02','01','06','10','03','07','14')
AND pj.NAME LIKE 'BANK%'
ORDER BY o.NAME,p.ID;
I have these indexes:
CREATE INDEX IND_TABLE1_REFNUM_ZONE ON TABLE1 (PUB_ID, OFIC_ID, REFNUM, ZONE_ID)
CREATE INDEX IND_TABLE1_REFNUMPUB ON TABLE1 (REFNUM, PUB_ID, OFIC_ID, GROUP_CODE );
CREATE INDEX IND_TABLE1_GROUP ON TABLE1 (PUB_ID, GROUP_CODE, REFNUM, OFIC_ID)
CREATE INDEX IND_TABLE2_REF ON TABLE2 (REFNUM, NUMBER, STATE);
CREATE INDEX IND_TABLE2_QUERY ON TABLE2 (NUMBER, TYPE, STATE, REFNUM, NUM, CODE);
CREATE INDEX IND_TABLE2_REFNUM ON TABLE2 (REFNUM)
CREATE INDEX IND_TABLE2_NUMBER ON TABLE2 (NUMBER)
CREATE INDEX IND_TABLE3_NUM ON TABLE3 (NUMBER, PUB_ID, OFIC_ID, NAME );
CREATE INDEX IND_TABLE3_NAME ON TABLE3 ( NAME );
CREATE INDEX IND_GROUP_COD ON GROUP (GROUP_CODE, GROUP_TYPE)
I made the following queries to see how many records are in each table:
SELECT count(*) FROM TABLE1 --> 18298458 results
SELECT count(*) FROM TABLE2 --> 60627924 results
SELECT count(*) FROM TABLE3 --> 18425913 results
SELECT count(*) FROM OFICE --> 65 results
SELECT count(*) FROM TABLE1 p INNER JOIN GROUP glad on glad.GROUP_CODE=p.GROUP_CODE where glad.GROUP_TYPE ='3' AND p.PUB_ID IN ('05','11','12','09','08','13','04','02','01','06','10','03','07','14') --> 1314077 results
SELECT count(*) FROM TABLE1 p INNER JOIN GROUP glad on glad.GROUP_CODE=p.GROUP_CODE where glad.GROUP_TYPE ='3' AND p.PUB_ID IN ('05') --> 53754 results
SELECT count(*) FROM TABLE3 WHERE NAME LIKE 'BANK%' --> 1922081 results
this is the plan generated by oracle:
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 291K| 38M| 384K| | |
| 1 | SORT ORDER BY | | 291K| 38M| 384K| | |
| 2 | HASH JOIN | | 291K| 38M| 375K| | |
| 3 | TABLE ACCESS FULL | OFICE | 64 | 960 | 3 | | |
| 4 | HASH JOIN | | 291K| 34M| 375K| | |
| 5 | INDEX SKIP SCAN | IND_GROUP_COD | 47 | 329 | 1 | | |
| 6 | HASH JOIN | | 452K| 50M| 375K| | |
| 7 | PART JOIN FILTER CREATE | :BF0000 | 452K| 50M| 375K| | |
| 8 | NESTED LOOPS | | 452K| 50M| 375K| | |
| 9 | NESTED LOOPS | | | | | | |
| 10 | STATISTICS COLLECTOR | | | | | | |
| 11 | HASH JOIN | | 2100K| 166M| 252K| | |
| 12 | NESTED LOOPS | | 2100K| 166M| 252K| | |
| 13 | STATISTICS COLLECTOR | | | | | | |
| 14 | PARTITION RANGE ALL | | 1681K| 89M| 82582 | 1 | 19 |
| 15 | PARTITION HASH ALL | | 1681K| 89M| 82582 | 1 | 32 |
| 16 | TABLE ACCESS FULL | TABLE3 | 1681K| 89M| 82582 | 1 | 608 |
| 17 | INDEX RANGE SCAN | IND_TABLE2_QUERY | 1 | 27 | 103K| | |
| 18 | INDEX FAST FULL SCAN | IND_TABLE2_QUERY | 32M| 845M| 103K| | |
| 19 | INDEX RANGE SCAN | IND_TABLE1_REFNUM_ZONE| | | | | |
| 20 | TABLE ACCESS BY GLOBAL INDEX ROWID| TABLE1 | 1 | 35 | 70380 | ROWID | ROWID |
| 21 | PARTITION RANGE ALL | | 19M| 650M| 70380 | 1 | 19 |
| 22 | PARTITION HASH JOIN-FILTER | | 19M| 650M| 70380 |:BF0000|:BF0000|
| 23 | TABLE ACCESS FULL | TABLE1 | 19M| 650M| 70380 | 1 | 608 |
-----------------------------------------------------------------------------------------------------------------
I think it is slow because the plan uses TABLE ACCESS FULL for TABLE1 and TABLE3.
If I run the query filtering only on PUB_ID='05' instead of all the values in the query above, it returns 181 results and takes 8 seconds, and in that case Oracle generates this plan:
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 797 | 108K| 312K| | |
| 1 | SORT ORDER BY | | 797 | 108K| 312K| | |
| 2 | NESTED LOOPS | | 797 | 108K| 312K| | |
| 3 | HASH JOIN | | 1238 | 160K| 312K| | |
| 4 | TABLE ACCESS BY INDEX ROWID BATCHED| OFICE | 3 | 45 | 2 | | |
| 5 | INDEX RANGE SCAN | SYS_C0034405 | 3 | | 1 | | |
| 6 | HASH JOIN | | 2091 | 240K| 312K| | |
| 7 | PART JOIN FILTER CREATE | :BF0000 | 66316 | 5375K| 241K| | |
| 8 | NESTED LOOPS | | 66316 | 5375K| 241K| | |
| 9 | PARTITION RANGE ALL | | 53085 | 2903K| 82490 | 1 | 19 |
| 10 | PARTITION HASH ALL | | 53085 | 2903K| 82490 | 1 | 32 |
| 11 | TABLE ACCESS FULL | TABLE3 | 53085 | 2903K| 82490 | 1 | 608 |
| 12 | INDEX RANGE SCAN | IND_TABLE2_QUERY | 1 | 27 | 3 | | |
| 13 | PARTITION RANGE ALL | | 762K| 25M| 68657 | 1 | 19 |
| 14 | PARTITION HASH JOIN-FILTER | | 762K| 25M| 68657 |:BF0000|:BF0000|
| 15 | TABLE ACCESS FULL | TABLE1 | 762K| 25M| 68657 | 1 | 608 |
| 16 | INDEX RANGE SCAN | IND_GROUP_COD | 1 | 7 | 0 | | |
--------------------------------------------------------------------------------------------------------------
SYS_C0034405 is the primary key of OFICE, which contains these fields: (PUB_ID, REG_ID)
If, in addition to filtering only on PUB_ID='05', I remove the ORDER BY, the query takes only 3.5 seconds, but I definitely have to return the data ordered, and I would prefer to be able to filter on several PUB_IDs.
I thought the query could be improved if I removed the inner join to GROUP and changed the filter "glad.GROUP_TYPE ='3'" to "p.GROUP_CODE in ('01','07','10','21 ')" (these are all the type 3 codes), because then it should use the IND_TABLE1_GROUP index. Instead of improving, it gets worse: it takes 13 seconds even when filtering only on PUB_ID='05'. This is the plan that Oracle generates:
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 47 | 6251 | 172K| | |
| 1 | SORT ORDER BY | | 47 | 6251 | 172K| | |
| 2 | HASH JOIN | | 47 | 6251 | 172K| | |
| 3 | PARTITION RANGE ALL | | 47786 | 2613K| 82490 | 1 | 19 |
| 4 | PARTITION HASH ALL | | 47786 | 2613K| 82490 | 1 | 32 |
| 5 | TABLE ACCESS FULL | TABLE3 | 47786 | 2613K| 82490 | 1 | 608 |
| 6 | HASH JOIN | | 41945 | 3154K| 89633 | | |
| 7 | NESTED LOOPS | | 41945 | 3154K| 89633 | | |
| 8 | NESTED LOOPS | | 75740 | 3154K| 89633 | | |
| 9 | STATISTICS COLLECTOR | | | | | | |
| 10 | NESTED LOOPS | | 18935 | 924K| 15985 | | |
| 11 | TABLE ACCESS BY INDEX ROWID BATCHED | OFICE | 3 | 45 | 2 | | |
| 12 | INDEX RANGE SCAN | SYS_C0034405 | 3 | | 1 | | |
| 13 | INLIST ITERATOR | | | | | | |
| 14 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TABLE1 | 6312 | 215K| 7039 | ROWID | ROWID |
| 15 | INDEX RANGE SCAN | IND_TABLE1_GROUP | 6828 | | 284 | | |
| 16 | INDEX RANGE SCAN | IND_TABLE2_REFNUM | 4 | | 2 | | |
| 17 | TABLE ACCESS BY INDEX ROWID | TABLE2 | 2 | 54 | 4 | | |
| 18 | INDEX FAST FULL SCAN | IND_TABLE2_QUERY | 2 | 54 | 2 | | |
------------------------------------------------------------------------------------------------------------------------------
And if I put all the PUB_IDs, Oracle generates this plan (it doesn't even use the IND_TABLE1_GROUP index anymore):
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 35661 | 4631K| 345K| | |
| 1 | SORT ORDER BY | | 35661 | 4631K| 345K| | |
| 2 | HASH JOIN | | 35661 | 4631K| 344K| | |
| 3 | TABLE ACCESS FULL | OFICE | 64 | 960 | 3 | | |
| 4 | HASH JOIN | | 35661 | 4109K| 344K| | |
| 5 | PARTITION RANGE ALL | | 1360K| 45M| 79580 | 1 | 19 |
| 6 | PARTITION HASH ALL | | 1360K| 45M| 79580 | 1 | 32 |
| 7 | TABLE ACCESS FULL | TABLE1 | 1360K| 45M| 79580 | 1 | 608 |
| 8 | HASH JOIN | | 2100K| 166M| 252K| | |
| 9 | NESTED LOOPS | | 2100K| 166M| 252K| | |
| 10 | STATISTICS COLLECTOR | | | | | | |
| 11 | PARTITION RANGE ALL | | 1681K| 89M| 82582 | 1 | 19 |
| 12 | PARTITION HASH ALL | | 1681K| 89M| 82582 | 1 | 32 |
| 13 | TABLE ACCESS FULL | TABLE3 | 1681K| 89M| 82582 | 1 | 608 |
| 14 | INDEX RANGE SCAN | IND_TABLE2_QUERY | 1 | 27 | 103K| | |
| 15 | INDEX FAST FULL SCAN | IND_TABLE2_QUERY | 32M| 845M| 103K| | |
-------------------------------------------------------------------------------------------------

Sorting Table in hierarchical order

Is it possible to sort a query's result table in hierarchical order, like this:
Expected
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| ID | Code | Name | Qty | Amount | is_parent | parent_id | remarks |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 1 | ABC | Parent1 | 2 | 1,000 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 4 | FFLK | Product Z | 10 | 2,500 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 5 | P6DT | Product 5 | 7 | 1,700 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 6 | P2GL | Product T | 5 | 1,100 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 2 | DHG | Parent2 | 5 | 1,500 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 3 | LMSJ | Product U | 4 | 600 | 0 | 2 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
This is the original data table:
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| ID | Code | Name | Qty | Amount | is_parent | parent_id | remarks |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 1 | ABC | Parent1 | 2 | 1,000 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 2 | DHG | Parent2 | 5 | 1,500 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 3 | LMSJ | Product U | 4 | 600 | 0 | 2 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 4 | FFLK | Product Z | 10 | 2,500 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 5 | P6DT | Product 5 | 7 | 1,700 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 6 | P2GL | Product T | 5 | 1,100 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
The is_parent column is 1 if the row is a parent and 0 if the row is a child.
The parent_id column is 0 if the row is a parent; otherwise it holds the ID of the parent row.
I'm using SQL Server to generate the data.
It looks like the actual question is how to query the data in hierarchical order. This is possible using recursive queries but a faster alternative is to use SQL Server's support for hierarchical data.
A recursive query that returns the data in hierarchical order would look like this :
WITH h AS
(
SELECT
ID,Code,Name,Qty,Amount,is_parent,parent_id,remarks
FROM
dbo.ThatTable
WHERE
parent_id=0
UNION ALL
SELECT
c.ID,c.Code,c.Name,c.Qty,c.Amount,c.is_parent,c.parent_id,c.remarks
FROM
dbo.ThatTable c
INNER JOIN h ON
c.parent_id= h.Id
)
SELECT * FROM h
This query's performance will be acceptable if the ID and Parent_ID fields are indexed, but not great.
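One caveat worth checking: a recursive CTE on its own does not promise any particular row order, so an explicit ORDER BY is still needed. A minimal sketch of one common way to get that, assuming a trimmed-down copy of the table and a hypothetical `sort_path` column built from zero-padded IDs, is shown below. It runs on SQLite from Python (which needs the `RECURSIVE` keyword and uses `printf` for padding); SQL Server would use its own string functions for the padding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ThatTable (ID INTEGER PRIMARY KEY, Code TEXT, Name TEXT, parent_id INT);
INSERT INTO ThatTable VALUES
 (1,'ABC','Parent1',0),(2,'DHG','Parent2',0),(3,'LMSJ','Product U',2),
 (4,'FFLK','Product Z',1),(5,'P6DT','Product 5',1),(6,'P2GL','Product T',1);
""")

# sort_path is a hypothetical helper column: each row carries the padded IDs
# of its ancestors followed by its own, so sorting by it lists every parent
# immediately before its children.
QUERY = """
WITH RECURSIVE h(ID, Code, Name, parent_id, sort_path) AS (
    SELECT ID, Code, Name, parent_id, printf('%010d', ID)
      FROM ThatTable
     WHERE parent_id = 0
    UNION ALL
    SELECT c.ID, c.Code, c.Name, c.parent_id,
           h.sort_path || '/' || printf('%010d', c.ID)
      FROM ThatTable AS c
      JOIN h ON c.parent_id = h.ID
)
SELECT ID, Name FROM h ORDER BY sort_path
"""
ordered_ids = [row[0] for row in conn.execute(QUERY)]
print(ordered_ids)
```

With the sample rows from the question, this lists Parent1, then its three products, then Parent2 and its product, which is the order shown in the "Expected" table.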
Adding a hierarchyid field to the table would make the query simpler and far faster. Assuming there's a hierarchy field, the query would be just :
SELECT *
FROM ThatTable
ORDER BY hierarchy
Adding an index on hierarchy will make this query, and any query that looks e.g. for the children of a specific node, very fast. Instead of querying recursively, the server only needs to look into that single index.
The article Lesson 1: Converting a Table to a Hierarchical Structure shows how to create a new table with a hierarchyid and populate it from parent/child data.

SQL: Taking Column From a Row Picked By Aggregate Function in View

I have three SQL tables: Release (which represents a release of a movie), Media (which represents the individual pieces of recordable media in those releases; i.e. for Blu-ray/DVD combos, there will be two rows in Media, one Blu-ray and one DVD, that point back to the same row in Release) and MediaType (which defines Blu-ray, DVD, VHS, etc.). There's a one-to-many relationship for Release/Media and MediaType/Media, with Media being on the "many" side of both relationships. I have a view for Release, vRelease, which contains aggregate functions, such as a COUNT that shows how many media are associated with that release. This is what I have for this view so far:
SELECT dbo.Release.ReleaseID
,dbo.Release.Name
,CASE WHEN Release.Compilation = 0 THEN 'No' WHEN Release.Compilation = 1 THEN 'Yes' END AS Compilation
,dbo.Release.Owner
,CASE WHEN Release.LentOut = 0 THEN 'No' WHEN Release.LentOut = 1 THEN 'Yes' END AS LentOut
,COUNT(dbo.Media.ReleaseID) AS NumberOfMedia
,MIN(dbo.Media.MediaID) AS FirstMediaID
,MIN(dbo.MediaType.Name) AS FirstMediaType
FROM dbo.MediaType INNER JOIN
dbo.Media ON dbo.MediaType.MediaTypeID = dbo.Media.MediaTypeID RIGHT OUTER JOIN
dbo.Release ON dbo.Media.ReleaseID = dbo.Release.ReleaseID
GROUP BY dbo.Release.ReleaseID, dbo.Release.Name, dbo.Release.Compilation, dbo.Release.Owner, dbo.Release.LentOut
You'll notice that I've also included two other aggregate columns: FirstMediaID grabs the ID of the media associated with that release that appears first in the Media table (i.e. if a release has two DVDs associated with it, it gets one with the lower ID value). This column on its own isn't useful; what I want to do is then, in turn, get the MediaType that that Media is associated with. In other words, I want a column that shows the MediaType of the first Media that is attached to each Release. The column after that, FirstMediaType, is supposed to do that, but it instead gets the MediaType among all of the Media associated with the Release and picks the one that is alphabetically first - which means that Blu-ray will always be prioritized over DVD (which is fine), but Audio CD will always be prioritized over everything else (which is not fine).
How do I get the FirstMediaType column in this view to get the MediaType of the Media identified in FirstMediaID?
UPDATE: Here are the tables, their columns and some sample rows.
A couple from Release:
+-----------+----------------------------------------+-------+-------------+---------+
| ReleaseID | Name | Owner | Compilation | LentOut |
+-----------+----------------------------------------+-------+-------------+---------+
| 2 | Alice in Wonderland | NULL | 0 | 0 |
| 6 | 4 Film Favorites - Family Comedies | NULL | 1 | 0 |
| 8 | Aladdin | NULL | 0 | 0 |
| 463 | Harry Potter and the Half-Blood Prince | NULL | 0 | 1 |
| 534 | Spirited Away | Ryan | 0 | 0 |
| 571 | The Original Christmas Classics | NULL | 1 | 0 |
+-----------+----------------------------------------+-------+-------------+---------+
Compilation indicates a release that has more than one movie in it.
Corresponding entries in Media:
+---------+-------------+-------------------------------------------------------------------------------------+-----------+
| MediaID | MediaTypeID | Name | ReleaseID |
+---------+-------------+-------------------------------------------------------------------------------------+-----------+
| 2 | 2 | Movie | 2 |
| 3 | 1 | Movie | 2 |
| 12 | 1 | Space Jam; Looney Tunes: Back in Action | 6 |
| 13 | 1 | Funky Monkey; Osmosis Jones | 6 |
| 17 | 3 | Movie | 8 |
| 620 | 1 | Movie | 463 |
| 726 | 1 | Movie | 534 |
| 807 | 1 | Rudolph the Red-Nosed Reindeer; Cricket on the Hearth | 571 |
| 808 | 1 | Frosty the Snowman; Frosty Returns | 571 |
| 809 | 1 | Santa Claus is Comin' to Town!; Mr. Magoo's Christmas Carol; The Little Drummer Boy | 571 |
| 810 | 4 | Tracks 1-7 | 571 |
+---------+-------------+-------------------------------------------------------------------------------------+-----------+
First few in MediaType:
+-------------+--------------+
| MediaTypeID | Name |
+-------------+--------------+
| 1 | DVD Disc |
| 2 | Blu-ray Disc |
| 3 | VHS |
| 4 | Audio CD |
+-------------+--------------+
The corresponding entries in vRelease SHOULD be this:
+-----------+----------------------------------------+-------------+-------+---------+---------------+--------------+----------------+
| ReleaseID | Name | Compilation | Owner | LentOut | NumberOfMedia | FirstMediaID | FirstMediaType |
+-----------+----------------------------------------+-------------+-------+---------+---------------+--------------+----------------+
| 2 | Alice in Wonderland | No | NULL | No | 2 | 2 | Blu-ray Disc |
| 6 | 4 Film Favorites - Family Comedies | Yes | NULL | No | 2 | 12 | DVD Disc |
| 8 | Aladdin | No | NULL | No | 1 | 17 | VHS |
| 463 | Harry Potter and the Half-Blood Prince | No | NULL | Yes | 1 | 620 | DVD Disc |
| 534 | Spirited Away | No | Ryan | No | 1 | 726 | DVD Disc |
| 571 | The Original Christmas Classics | Yes | NULL | No | 4 | 807 | DVD Disc |
+-----------+----------------------------------------+-------------+-------+---------+---------------+--------------+----------------+
But it's actually this:
+-----------+----------------------------------------+-------------+-------+---------+---------------+--------------+----------------+
| ReleaseID | Name | Compilation | Owner | LentOut | NumberOfMedia | FirstMediaID | FirstMediaType |
+-----------+----------------------------------------+-------------+-------+---------+---------------+--------------+----------------+
| 2 | Alice in Wonderland | No | NULL | No | 2 | 2 | Blu-ray Disc |
| 6 | 4 Film Favorites - Family Comedies | Yes | NULL | No | 2 | 12 | DVD Disc |
| 8 | Aladdin | No | NULL | No | 1 | 17 | VHS |
| 463 | Harry Potter and the Half-Blood Prince | No | NULL | Yes | 1 | 620 | DVD Disc |
| 534 | Spirited Away | No | Ryan | No | 1 | 726 | DVD Disc |
| 571 | The Original Christmas Classics | Yes | NULL | No | 4 | 807 | Audio CD |
+-----------+----------------------------------------+-------------+-------+---------+---------------+--------------+----------------+
It's that last one that's the problem.
I ended up finding a simple way to do what I wanted. It isn't as fancy as Used_By_Already's answer (which did end up working, as far as I could tell) and probably breaks a SQL Best Practices rule somewhere, but it's much easier to understand and maintain - at least for my newbie brain.
Since the problem was trying to get the view to use an aggregate column it calculated in a join, I just split the two-step action over two views. vReleasePre has all of the columns I outlined in my original query except for FirstMediaType. vRelease now simply takes all of the columns from vReleasePre and adds FirstMediaType, which takes its value from a join at the end: LEFT OUTER JOIN dbo.vMedia ON dbo.vReleasePre.FirstMediaID = dbo.vMedia.MediaID, where vMedia is a view with all the columns from Media, plus the MediaType column (I already had vMedia lying around).
Since this database is being used in an ASP.NET MVC web application via Entity Framework, and EF has been pretty strange about what it will and won't accept into the data model, I figure that a simple, if roundabout, solution is probably going to be my best option.
vReleasePre:
SELECT dbo.Release.ReleaseID
,dbo.Release.Name
,CASE WHEN Release.Compilation = 0 THEN 'No' WHEN Release.Compilation = 1 THEN 'Yes' END AS Compilation
,dbo.Release.Owner
,CASE WHEN Release.LentOut = 0 THEN 'No' WHEN Release.LentOut = 1 THEN 'Yes' END AS LentOut
,COUNT(dbo.Media.ReleaseID) AS NumberOfMedia
,MIN(dbo.Media.MediaID) AS FirstMediaID
FROM dbo.MediaType INNER JOIN
dbo.Media ON dbo.MediaType.MediaTypeID = dbo.Media.MediaTypeID RIGHT OUTER JOIN
dbo.Release ON dbo.Media.ReleaseID = dbo.Release.ReleaseID
GROUP BY dbo.Release.ReleaseID, dbo.Release.Name, dbo.Release.Compilation, dbo.Release.Owner, dbo.Release.LentOut
vRelease:
SELECT dbo.vReleasePre.ReleaseID
,dbo.vReleasePre.Name
,dbo.vReleasePre.Compilation
,dbo.vReleasePre.Owner
,dbo.vReleasePre.LentOut
,dbo.vReleasePre.NumberOfMedia
,dbo.vMedia.MediaType
FROM dbo.vReleasePre LEFT OUTER JOIN
dbo.vMedia ON dbo.vReleasePre.FirstMediaID = dbo.vMedia.MediaID
A very convenient technique that returns whole rows associated with needs such as "First", "Last", "Earliest", "Latest" is to use row_number() over(). Here you want the "first media type", so it is relevant here.
As you will see in the following query joining the [Media] table is replaced with a subquery that includes a row number calculation. Here we partition by ReleaseID and order by MediaID, so, for each ReleaseID the first row will be the one with the lowest MediaID value. Then in the join to this derived table an extra condition is added to only consider rows with a row number of 1.
Proposed Query
SELECT
r.ReleaseID
, m.MediaID
, mt.MediaTypeID
, mt.name MediaName
, r.Name
, CASE
WHEN r.Compilation = 0 THEN 'No'
WHEN r.Compilation = 1 THEN 'Yes'
END AS compilation
, r.Owner
, CASE
WHEN r.LentOut = 0 THEN 'No'
WHEN r.LentOut = 1 THEN 'Yes'
END AS lentout
FROM dbo.Release r
INNER JOIN (
SELECT
Media.*
, ROW_NUMBER() OVER(PARTITION BY ReleaseID
ORDER BY MediaID) AS rn
FROM dbo.Media
) m ON r.ReleaseID = m.ReleaseID and rn = 1
INNER JOIN dbo.MediaType mt ON m.MediaTypeID = mt.MediaTypeID
Result
| ReleaseID | MediaID | MediaTypeID | MediaName | Name | compilation | Owner | lentout |
|-----------|---------|-------------|--------------|----------------------------------------|-------------|--------|---------|
| 2 | 2 | 2 | Blu-ray Disc | Alice in Wonderland | No | (null) | No |
| 6 | 12 | 1 | DVD Disc | 4 Film Favorites - Family Comedies | Yes | (null) | No |
| 8 | 17 | 3 | VHS | Aladdin | No | (null) | No |
| 463 | 620 | 1 | DVD Disc | Harry Potter and the Half-Blood Prince | No | (null) | Yes |
| 534 | 726 | 1 | DVD Disc | Spirited Away | No | Ryan | No |
| 571 | 807 | 1 | DVD Disc | The Original Christmas Classics | Yes | (null) | No |
Demo available at SQLFiddle
The easiest way would be to join FirstMediaID back to the Media table (on FirstMediaID = Media.MediaID) and from there to your MediaType table:
;WITH data AS (
SELECT dbo.Release.ReleaseID
,dbo.Release.Name
,CASE WHEN Release.Compilation = 0 THEN 'No' WHEN Release.Compilation = 1 THEN 'Yes' END AS Compilation
,dbo.Release.Owner
,CASE WHEN Release.LentOut = 0 THEN 'No' WHEN Release.LentOut = 1 THEN 'Yes' END AS LentOut
,COUNT(dbo.Media.ReleaseID) AS NumberOfMedia
,MIN(dbo.Media.MediaID) AS FirstMediaID
FROM dbo.MediaType
INNER JOIN dbo.Media
ON dbo.MediaType.MediaTypeID = dbo.Media.MediaTypeID
RIGHT OUTER JOIN dbo.Release
ON dbo.Media.ReleaseID = dbo.Release.ReleaseID
GROUP BY dbo.Release.ReleaseID, dbo.Release.Name, dbo.Release.Compilation, dbo.Release.Owner, dbo.Release.LentOut
)
SELECT data.ReleaseId
,data.Name
,data.Compilation
,data.Owner
,data.LentOut
,data.NumberOfMedia
,data.FirstMediaId
,MediaType.Name as FirstMediaName
FROM data
-- FirstMediaId is a MediaID, so look up that Media row first ...
LEFT OUTER JOIN dbo.Media AS FirstMedia
ON data.FirstMediaId = FirstMedia.MediaID
-- ... and only then resolve its MediaTypeID to a name
LEFT OUTER JOIN dbo.MediaType
ON FirstMedia.MediaTypeID = MediaType.MediaTypeID
For the newbie brain, this is the subquery I used:
SELECT
ROW_NUMBER() OVER(PARTITION BY ReleaseID
ORDER BY MediaID) AS rn
, Media.*
FROM dbo.Media
and this is what it does (see the rn column)
| rn | MediaID | MediaTypeID | Name | ReleaseID |
|----|---------|-------------|-------------------------------------------------------------------------------------|-----------|
| 1 | 2 | 2 | Movie | 2 |
| 2 | 3 | 1 | Movie | 2 |
| 1 | 12 | 1 | Space Jam; Looney Tunes: Back in Action | 6 |
| 2 | 13 | 1 | Funky Monkey; Osmosis Jones | 6 |
| 1 | 17 | 3 | Movie | 8 |
| 1 | 620 | 1 | Movie | 463 |
| 1 | 726 | 1 | Movie | 534 |
| 1 | 807 | 1 | Rudolph the Red-Nosed Reindeer; Cricket on the Hearth | 571 |
| 2 | 808 | 1 | Frosty the Snowman; Frosty Returns | 571 |
| 3 | 809 | 1 | Santa Claus is Comin' to Town!; Mr. Magoo's Christmas Carol; The Little Drummer Boy | 571 |
| 4 | 810 | 4 | Tracks 1-7 | 571 |
Now keep only those rows with 1 in the rn column:
| rn | MediaID | MediaTypeID | Name | ReleaseID |
|----|---------|-------------|-------------------------------------------------------|-----------|
| 1 | 2 | 2 | Movie | 2 |
| 1 | 12 | 1 | Space Jam; Looney Tunes: Back in Action | 6 |
| 1 | 17 | 3 | Movie | 8 |
| 1 | 620 | 1 | Movie | 463 |
| 1 | 726 | 1 | Movie | 534 |
| 1 | 807 | 1 | Rudolph the Red-Nosed Reindeer; Cricket on the Hearth | 571 |
Then join just those rows to Release and MediaType.
Bingo: that is the wanted result.
Not hard, really not hard. You really will want to learn about window functions, because they can solve heaps of problems.
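As a rough sketch of how the row-number idea could also keep the NumberOfMedia column from the original view, the derived table can carry a windowed COUNT next to ROW_NUMBER. The example below is an assumption-laden illustration, not the view itself: it runs on an in-memory SQLite database (3.25 or later for window functions), uses a trimmed copy of the sample rows, and adds a made-up release 999 with no media to show what the LEFT JOINs do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Release" (ReleaseID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE MediaType (MediaTypeID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Media (MediaID INTEGER PRIMARY KEY, MediaTypeID INT, ReleaseID INT);
INSERT INTO "Release" VALUES (2,'Alice in Wonderland'),(8,'Aladdin'),
  (571,'The Original Christmas Classics'),
  (999,'No media yet');  -- hypothetical row, not in the question's data
INSERT INTO MediaType VALUES (1,'DVD Disc'),(2,'Blu-ray Disc'),(3,'VHS'),(4,'Audio CD');
INSERT INTO Media VALUES (2,2,2),(3,1,2),(17,3,8),
  (807,1,571),(808,1,571),(809,1,571),(810,4,571);
""")

# rn = 1 marks the lowest MediaID per release; the windowed COUNT gives the
# number of media for that release on every row, including the rn = 1 row.
QUERY = """
SELECT r.ReleaseID, r.Name,
       COALESCE(m.NumberOfMedia, 0) AS NumberOfMedia,
       m.MediaID AS FirstMediaID,
       mt.Name   AS FirstMediaType
  FROM "Release" AS r
  LEFT JOIN (
        SELECT MediaID, MediaTypeID, ReleaseID,
               ROW_NUMBER() OVER (PARTITION BY ReleaseID ORDER BY MediaID) AS rn,
               COUNT(*)     OVER (PARTITION BY ReleaseID) AS NumberOfMedia
          FROM Media
       ) AS m ON m.ReleaseID = r.ReleaseID AND m.rn = 1
  LEFT JOIN MediaType AS mt ON mt.MediaTypeID = m.MediaTypeID
 ORDER BY r.ReleaseID
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The LEFT JOINs are a design choice for the sketch: they keep releases without any media, as the RIGHT OUTER JOIN in the original view did, whereas an INNER JOIN would drop them.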

Join 2 tables by matching children

I'm trying to join 2 tables (in this case it's actually 1 table in a self join) by their matching children.
Let me preface this with the purpose, which might give a better understanding of what I need:
I'm trying to look up a new order that I just got, to see if we ever had the same order before, in order to find out which box type it would be packaged in.
So I'd need the matching order to contain the same items and the same qty for each item.
Look at the tables below and note that order 1300981 has the same items as order 1303097. How do I write this join?
Remember: I don't want the results to include any matches that do not match 100%.
SQL Fiddle
OrderMain:
| OrderID | BoxId |
|---------|--------|
| 1300981 | 34 |
| 1303096 | (null) |
| 1303097 | (null) |
| 1303098 | (null) |
| 1303099 | (null) |
| 1303100 | (null) |
| 1303101 | (null) |
| 1303102 | (null) |
| 1303103 | (null) |
| 1303104 | B1 |
| 1303105 | (null) |
| 1303106 | (null) |
| 1303107 | 48 |
| 1303108 | (null) |
| 1303109 | (null) |
| 1303110 | (null) |
| 1303111 | (null) |
| 1303112 | (null) |
| 1303113 | (null) |
| 1303114 | (null) |
| 1303115 | (null) |
| 1303116 | (null) |
| 1303117 | (null) |
Order Detail:
| id | OrderID | Item | Qty |
|----|---------|--------|-----|
| 1 | 1300981 | 172263 | 3 |
| 2 | 1300981 | 171345 | 3 |
| 3 | 1300981 | 138757 | 3 |
| 4 | 1303117 | 231711 | 1 |
| 5 | 1303116 | 227835 | 1 |
| 6 | 1303115 | 244798 | 1 |
| 7 | 1303114 | 121755 | 1 |
| 8 | 1303113 | 145275 | 2 |
| 9 | 1303112 | 219554 | 1 |
| 10 | 1303111 | 179385 | 1 |
| 11 | 1303110 | 6229 | 1 |
| 12 | 1303109 | 217330 | 1 |
| 13 | 1303108 | 243596 | 1 |
| 14 | 1303107 | 246758 | 1 |
| 15 | 1303106 | 193931 | 1 |
| 16 | 1303105 | 244659 | 1 |
| 17 | 1303104 | 192548 | 1 |
| 18 | 1303103 | 228410 | 1 |
| 19 | 1303102 | 147474 | 1 |
| 20 | 1303101 | 239191 | 1 |
| 21 | 1303100 | 243594 | 1 |
| 22 | 1303099 | 232301 | 1 |
| 23 | 1303098 | 201212 | 1 |
| 24 | 1303097 | 172263 | 3 |
| 25 | 1303097 | 171345 | 3 |
| 26 | 1303097 | 138757 | 3 |
| 27 | 1303096 | 172263 | 3 |
| 28 | 1303096 | 171345 | 1 |
| 29 | 1303096 | 138757 | 3 |
| 30 | 1303095 | 172263 | 3 |
Expected Results
| OrderID | BoxId |
|---------|--------|
| 1303097 | 34 |
This may be a weird way to do it, but if you convert each order's details to XML, you can compare that XML against the XML of other orders to find matches.
WITH BoxOrders AS
(
    -- orders that already have a box, with their details serialized to XML
    SELECT om.[OrderId],
           om.[BoxId],
           (SELECT Item, Qty
            FROM orderDetails od
            WHERE od.[OrderId] = om.[OrderId]
            ORDER BY Item
            FOR XML PATH('')) Details
    FROM orderMain om
    WHERE BoxID IS NOT NULL
)
SELECT mo.OrderId, bo.BoxId
FROM BoxOrders bo
JOIN (
    -- orders without a box, serialized the same way
    SELECT om.[OrderId],
           om.[BoxId],
           (SELECT Item, Qty
            FROM orderDetails od
            WHERE od.[OrderId] = om.[OrderId]
            ORDER BY Item
            FOR XML PATH('')) Details
    FROM orderMain om
    WHERE BoxID IS NULL
) mo ON bo.Details = mo.Details
SQL Fiddle
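The same idea, comparing a canonical serialization of each order's detail rows, can be sketched outside of T-SQL. The snippet below is only an illustration under assumptions: it uses Python's built-in sqlite3 module with a small subset of the sample data, and the helper name `find_box_by_signature` is hypothetical. Instead of XML it builds a sorted tuple of (Item, Qty) pairs per order and looks for boxed orders with an equal tuple.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderMain (OrderID INTEGER PRIMARY KEY, BoxId TEXT);
CREATE TABLE OrderDetails (id INTEGER PRIMARY KEY, OrderID INTEGER,
                           Item INTEGER, Qty INTEGER);
-- subset of the sample data from the question
INSERT INTO OrderMain VALUES
  (1300981, '34'), (1303096, NULL), (1303097, NULL), (1303104, 'B1');
INSERT INTO OrderDetails (OrderID, Item, Qty) VALUES
  (1300981, 172263, 3), (1300981, 171345, 3), (1300981, 138757, 3),
  (1303104, 192548, 1),
  (1303097, 172263, 3), (1303097, 171345, 3), (1303097, 138757, 3),
  (1303096, 172263, 3), (1303096, 171345, 1), (1303096, 138757, 3);
""")


def find_box_by_signature(conn):
    """Hypothetical helper: match unboxed orders to boxed ones by content."""
    # One signature per order: its (Item, Qty) pairs in a fixed order.
    signatures = defaultdict(list)
    for order_id, item, qty in conn.execute(
            "SELECT OrderID, Item, Qty FROM OrderDetails ORDER BY OrderID, Item"):
        signatures[order_id].append((item, qty))

    boxes = dict(conn.execute("SELECT OrderID, BoxId FROM OrderMain"))

    # Index the signatures of orders that already have a box.
    # If several boxed orders share a signature, the last one seen wins.
    boxed = {}
    for order_id, sig in signatures.items():
        if boxes.get(order_id) is not None:
            boxed[tuple(sig)] = boxes[order_id]

    # Report unboxed orders whose signature equals a boxed one.
    return sorted(
        (order_id, boxed[tuple(sig)])
        for order_id, sig in signatures.items()
        if boxes.get(order_id) is None and tuple(sig) in boxed
    )


matches = find_box_by_signature(conn)
print(matches)
```

On this subset the only match is order 1303097 with box 34, in line with the expected results; order 1303096 is rejected because one of its quantities differs.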
Here's a different approach using plain SQL and a few analytic (window) functions.
This joins the order details to themselves on item and qty, with order number < other order number, and checks that the count of items in each order matches. If the items match, the quantities match, and the counts match, the two orders contain the same items.
This returns both orders, but that is easy enough to adjust. I'm using a CTE so the counts materialize; I'm pretty sure you can't filter on an analytic like this in a HAVING clause.
The one major assumption I'm making is that order numbers are sequential, so when you ask whether an older order exists, I only need to look at earlier order numbers when evaluating whether a prior order had the same items and quantities.
I'm also assuming a 100% match means the exact same items, the same quantity of each item, and the same item count. If order 1 has 3 items, order 2 has 3 items, and the items and quantities match, that is 100%; if order 2 had 4 items and order 1 only 3, there is no match.
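with od as (
    -- count the items of each order before the self-join, so the join cannot distort the counts
    SELECT OrderID, Item, Qty,
           count(*) over (partition by OrderID) ItemCnt
    FROM OrderDetails),
cte as (
    -- assumes each item appears at most once per order
    SELECT distinct OD1.OrderID PriorOrder, OD2.OrderID NewOrder, OM.BoxId,
           OD1.ItemCnt OD1Cnt,
           OD2.ItemCnt OD2Cnt,
           -- rows that matched on item and qty for this particular pair of orders
           count(*) over (partition by OD1.OrderID, OD2.OrderID) MatchCnt
    FROM od OD1
    INNER JOIN od OD2
        on OD1.Item = OD2.Item
       and OD1.Qty = OD2.Qty
       and OD1.OrderID < OD2.OrderID
    LEFT JOIN OrderMain OM
        on OM.OrderID = OD1.OrderID)
Select PriorOrder, NewOrder, BoxId from cte where OD1Cnt = OD2Cnt and MatchCnt = OD1Cnt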
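The 100% match definition above (same items, same quantities, same item count) can also be checked per pair of orders with plain GROUP BY / HAVING. The sketch below is an illustration under assumptions, not the answer's exact query: it runs in SQLite through Python's sqlite3 module on a subset of the sample data, assumes an item appears at most once per order, and only considers prior orders that already have a BoxId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderMain (OrderID INTEGER PRIMARY KEY, BoxId TEXT);
CREATE TABLE OrderDetails (id INTEGER PRIMARY KEY, OrderID INTEGER,
                           Item INTEGER, Qty INTEGER);
-- subset of the sample data from the question
INSERT INTO OrderMain VALUES
  (1300981, '34'), (1303096, NULL), (1303097, NULL), (1303104, 'B1');
INSERT INTO OrderDetails (OrderID, Item, Qty) VALUES
  (1300981, 172263, 3), (1300981, 171345, 3), (1300981, 138757, 3),
  (1303104, 192548, 1),
  (1303097, 172263, 3), (1303097, 171345, 3), (1303097, 138757, 3),
  (1303096, 172263, 3), (1303096, 171345, 1), (1303096, 138757, 3),
  (1303095, 172263, 3);
""")

QUERY = """
WITH cnt AS (
    -- number of detail rows per order, computed before any join
    SELECT OrderID, COUNT(*) AS n FROM OrderDetails GROUP BY OrderID
)
SELECT a.OrderID AS PriorOrder, b.OrderID AS NewOrder, om.BoxId
FROM OrderDetails AS a
JOIN OrderDetails AS b
  ON a.Item = b.Item AND a.Qty = b.Qty AND a.OrderID < b.OrderID
JOIN cnt AS ca ON ca.OrderID = a.OrderID
JOIN cnt AS cb ON cb.OrderID = b.OrderID
JOIN OrderMain AS om ON om.OrderID = a.OrderID
WHERE om.BoxId IS NOT NULL
GROUP BY a.OrderID, b.OrderID, om.BoxId, ca.n, cb.n
-- every row of both orders must have found a partner
HAVING COUNT(*) = ca.n AND ca.n = cb.n
ORDER BY a.OrderID, b.OrderID
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Grouping by the pair of orders keeps the match count local to that pair, so partial overlaps with other orders (such as 1303095 or 1303096 here) do not affect the comparison.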

Joining SQL views with alternate values instead of NULLs

Hopefully this chart makes sense...
The problem is that I have many columns in the many-to-many table. How can I get all the column values in a view without a separate ISNULL lookup for every column?
(SQL Server 10.5)
ITEM
+------+
| ID |
|------|
| 1 |
| 2 |
| 3 |
+------+
LANGUAGE
+-------+---------+
| ID | Name |
|-------+---------|
| 1 | English |
| 2 | French |
+-------+---------+
Item Names
+----------+---------+------------+------------+
| ItemID | LangId | Name | Color |
|----------+---------+------------+------------|
| 1 | 1 | apple | red |
| 1 | 2 | pomme | rouge |
| 2 | 1 | orange | orange |
| 3 | 1 | bannana | yellow |
+----------+---------+------------+------------+
desired view
+----------+---------+------------+------------+
| ItemID | LangId | Name | Color |
|----------+---------+------------+------------|
| 1 | 1 | apple | red |
| 1 | 2 | pomme | rouge |
| 2 | 1 | orange | orange |
| 2 | 2 | orange | orange | <--- added automatically
| 3 | 1 | bannana | yellow |
| 3 | 2 | bannana | yellow | <--- added automatically
+----------+---------+------------+------------+
Because I'm trying to create a view, there are certain limitations:
The columns being modified in the view must directly reference the underlying data in the table columns. The columns cannot be derived in any other way, such as through the following:
An aggregate function: AVG, COUNT, SUM, MIN, MAX, GROUPING, STDEV, STDEVP, VAR, and VARP.
A computation. The column cannot be computed from an expression that uses other columns. Columns that are formed by using the set operators UNION, UNION ALL, CROSSJOIN, EXCEPT, and INTERSECT amount to a computation and are also not updatable.
I can, however, create multiple views, which is how I've gotten around some of these restrictions before. We can assume that I already have this table:
intermediary view:
+----------+---------+------------+------------+
| ItemID | LangId | Name | Color |
|----------+---------+------------+------------|
| 1 | 1 | apple | red |
| 1 | 2 | pomme | rouge |
| 2 | 1 | orange | orange |
| 3 | 1 | bannana | yellow |
+----------+---------+------------+------------+
as well as:
+----------+---------+------------+------------+
| ItemID | LangId | Name | Color |
|----------+---------+------------+------------|
| 1 | 1 | apple | red |
| 1 | 2 | pomme | rouge |
| 2 | 1 | orange | orange |
| 2 | 2 | - | - |
| 3 | 1 | bannana | yellow |
| 3 | 2 | - | - |
+----------+---------+------------+------------+
These are some of the views:
view1 - all combinations
view2 - all combinations with languages
The corresponding SQL:
SELECT dbo.view1.ItemID, dbo.view1.LanguageID, dbo.ItemLanguages.Name, dbo.ItemLanguages.Color
FROM dbo.ItemLanguages RIGHT OUTER JOIN
     dbo.view1 ON dbo.ItemLanguages.LanguageID = dbo.view1.LanguageID
              AND dbo.ItemLanguages.ItemID = dbo.view1.ItemID
result of view 2
here is the test database with the views and tables: http://pastebin.com/4BpBSmHY
One way I've been able to do it is using ISNULL:
SELECT dbo.view1.ItemID,
       dbo.view1.LanguageID,
       ISNULL(dbo.ItemLanguages.Name,
              (SELECT TOP (1) Name
               FROM dbo.ItemLanguages AS x
               WHERE (ItemID = dbo.view1.ItemID))) AS Name,
       ISNULL(dbo.ItemLanguages.Color,
              (SELECT TOP (1) Color
               FROM dbo.ItemLanguages AS x
               WHERE (ItemID = dbo.view1.ItemID))) AS Color,
       CASE
           WHEN dbo.ItemLanguages.ItemID is NULL THEN 1
           ELSE 0
       END as valid
FROM dbo.ItemLanguages RIGHT OUTER JOIN
     dbo.view1 ON dbo.ItemLanguages.ItemID = dbo.view1.ItemID
              AND dbo.ItemLanguages.LanguageID = dbo.view1.LanguageID
The reason I don't like this approach is that I'm doing this across many more columns, and I suspect that running a separate SELECT per column would slow the query down drastically.
I thought I'd be able to just check whether the row exists, as with the
CASE
WHEN dbo.ItemLanguages.ItemID is NULL THEN 1
ELSE 0
END as valid
expression, and then call the SELECT once and populate all the columns, all within a view.
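One possible way to "look up once and populate all columns" is to decide per (item, language) pair which translation row to use, and then join to that single row. The sketch below is only an illustration under assumptions: it runs in SQLite through Python's sqlite3 module rather than SQL Server, the names `ItemNames`, `ItemNamesFilled` and `IsFallback` are hypothetical, and the fallback is assumed to be the row with the lowest LangId for the item. Whether such a view meets the updatability restrictions quoted above is not addressed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ID INTEGER PRIMARY KEY);
CREATE TABLE Language (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ItemNames (ItemID INTEGER, LangId INTEGER, Name TEXT, Color TEXT,
                        PRIMARY KEY (ItemID, LangId));
-- sample data copied from the question
INSERT INTO Item VALUES (1), (2), (3);
INSERT INTO Language VALUES (1, 'English'), (2, 'French');
INSERT INTO ItemNames VALUES
  (1, 1, 'apple', 'red'), (1, 2, 'pomme', 'rouge'),
  (2, 1, 'orange', 'orange'), (3, 1, 'bannana', 'yellow');

CREATE VIEW ItemNamesFilled AS
SELECT i.ID AS ItemID, l.ID AS LangId, n.Name, n.Color,
       -- 1 when the row shown comes from another language
       CASE WHEN n.LangId = l.ID THEN 0 ELSE 1 END AS IsFallback
FROM Item AS i
CROSS JOIN Language AS l
JOIN ItemNames AS n
  ON n.ItemID = i.ID
 -- pick the exact translation if it exists, else the lowest LangId (assumption)
 AND n.LangId = COALESCE(
       (SELECT x.LangId FROM ItemNames AS x
         WHERE x.ItemID = i.ID AND x.LangId = l.ID),
       (SELECT MIN(x.LangId) FROM ItemNames AS x
         WHERE x.ItemID = i.ID));
""")

rows = conn.execute(
    "SELECT ItemID, LangId, Name, Color, IsFallback "
    "FROM ItemNamesFilled ORDER BY ItemID, LangId"
).fetchall()
for row in rows:
    print(row)
```

Because the choice of row happens in the join condition, adding more columns to the select list does not add more subqueries. Items with no translation in any language drop out of this view, since the join to `ItemNames` is an inner join.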

Resources