I am having trouble joining tables that look like the following:
The main table, PMEOBJECT, has a unique key named OBJECTID and 12768 rows in total.
I then join PMEOBJECTVALIDITY onto it, which has an n:1 relationship with PMEOBJECT. It records how each object changes over time (for example, when a certain object is no longer valid), so it has more rows, 12789 in total (meaning only 21 objects changed over time). However, I only want the row with the latest VALIDFROM date per object shown in the query. This all works fine.
The trouble starts when I join PMEOBJECTDIMENSION, which has an n:1 relationship with PMEOBJECTVALIDITY and 36737 rows in total.
SELECT
PMEOBJECT.OBJECTID
,PMEOBJECTVALIDITY.VALIDFROM
,PMEOBJECTDIMENSION.DIMENSION2_
FROM PMEOBJECT
LEFT JOIN PMEOBJECTVALIDITY
ON PMEOBJECTVALIDITY.OBJECTID = PMEOBJECT.OBJECTID
AND PMEOBJECTVALIDITY.DATAAREAID = PMEOBJECT.DATAAREAID
INNER JOIN(
SELECT
OBJECTID,
MAX(VALIDFROM) AS NEWFROMDATE,
MAX(VALIDTO) AS NEWTODATE
FROM PMEOBJECTVALIDITY B
GROUP BY OBJECTID
) B
ON PMEOBJECTVALIDITY.OBJECTID = B.OBJECTID
AND PMEOBJECTVALIDITY.VALIDFROM = B.NEWFROMDATE
LEFT JOIN PMEOBJECTDIMENSION
ON PMEOBJECTDIMENSION.OBJECTVALIDITYID = PMEOBJECTVALIDITY.RECID
AND PMEOBJECTDIMENSION.DATAAREAID = PMEOBJECTVALIDITY.DATAAREAID
INNER JOIN(
SELECT
OBJECTVALIDITYID,
MAX(VALIDFROM) AS NEWFROMDATE_2
FROM PMEOBJECTDIMENSION C
GROUP BY OBJECTVALIDITYID
) C
ON PMEOBJECTDIMENSION.OBJECTVALIDITYID = C.OBJECTVALIDITYID
AND PMEOBJECTDIMENSION.VALIDFROM = C.NEWFROMDATE_2
Row counts after each step of the query:
SELECT PMEOBJECT: 12768 rows
LEFT JOIN PMEOBJECTVALIDITY: 12789 rows
INNER JOIN PMEOBJECTVALIDITY: 12768 rows
LEFT JOIN PMEOBJECTDIMENSION: 36737 rows
INNER JOIN PMEOBJECTDIMENSION: 12729 rows
I want the end result to have the same 12768 rows again; I don't want any OBJECTID to be left out.
What am I missing here?
Kind regards,
Igor
The following might help. Replace everything from the LEFT JOIN on PMEOBJECTDIMENSION onwards with:
LEFT JOIN (SELECT PMEOBJECTDIMENSION.OBJECTVALIDITYID, PMEOBJECTDIMENSION.DATAAREAID, PMEOBJECTDIMENSION.DIMENSION2_
FROM PMEOBJECTDIMENSION
INNER JOIN(SELECT OBJECTVALIDITYID, MAX(VALIDFROM) AS NEWFROMDATE_2
FROM PMEOBJECTDIMENSION C
GROUP BY OBJECTVALIDITYID
) C
ON PMEOBJECTDIMENSION.OBJECTVALIDITYID = C.OBJECTVALIDITYID
AND PMEOBJECTDIMENSION.VALIDFROM = C.NEWFROMDATE_2
)X
ON X.OBJECTVALIDITYID = PMEOBJECTVALIDITY.RECID
AND X.DATAAREAID = PMEOBJECTVALIDITY.DATAAREAID
In the outer SELECT list, use X.DIMENSION2_ instead of PMEOBJECTDIMENSION.DIMENSION2_, and use SELECT DISTINCT if duplicates are present.
The INNER JOINs are filtering out records. What you want is for the LEFT JOINed tables (PMEOBJECTVALIDITY and PMEOBJECTDIMENSION) to include only records that have a match in the INNER JOIN subqueries (aliases B and C). You can accomplish this by nesting the INNER JOIN inside the LEFT JOIN, generally done as follows:
SELECT *
FROM A
LEFT JOIN B
INNER JOIN C
ON B.ID = C.BID
ON A.ID = B.AID
Now B is INNER JOINed to C and will only contain records that have a match in C, while the LEFT JOIN still preserves every record from A.
In your case, you can simply move the ON clause of each LEFT JOIN to the end of the INNER JOIN that follows it:
SELECT
PMEOBJECT.OBJECTID
,PMEOBJECTVALIDITY.VALIDFROM
,PMEOBJECTDIMENSION.DIMENSION2_
FROM PMEOBJECT
LEFT JOIN PMEOBJECTVALIDITY
INNER JOIN(
SELECT
OBJECTID,
MAX(VALIDFROM) AS NEWFROMDATE,
MAX(VALIDTO) AS NEWTODATE
FROM PMEOBJECTVALIDITY B
GROUP BY OBJECTID
) B
ON PMEOBJECTVALIDITY.OBJECTID = B.OBJECTID
AND PMEOBJECTVALIDITY.VALIDFROM = B.NEWFROMDATE
ON PMEOBJECTVALIDITY.OBJECTID = PMEOBJECT.OBJECTID
AND PMEOBJECTVALIDITY.DATAAREAID = PMEOBJECT.DATAAREAID -- ON clause of the first LEFT JOIN, moved here
LEFT JOIN PMEOBJECTDIMENSION
INNER JOIN(
SELECT
OBJECTVALIDITYID,
MAX(VALIDFROM) AS NEWFROMDATE_2
FROM PMEOBJECTDIMENSION C
GROUP BY OBJECTVALIDITYID
) C
ON PMEOBJECTDIMENSION.OBJECTVALIDITYID = C.OBJECTVALIDITYID
AND PMEOBJECTDIMENSION.VALIDFROM = C.NEWFROMDATE_2
ON PMEOBJECTDIMENSION.OBJECTVALIDITYID = PMEOBJECTVALIDITY.RECID
AND PMEOBJECTDIMENSION.DATAAREAID = PMEOBJECTVALIDITY.DATAAREAID -- ON clause of the second LEFT JOIN, moved here
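The effect of the two join shapes can be replayed on toy data. The following is a minimal sketch using Python's built-in sqlite3 module; the tables obj, validity and dim and their contents are made-up stand-ins for the PME tables, not the real schema. In SQLite the nested join is written inside parentheses.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE obj (id INTEGER PRIMARY KEY);
CREATE TABLE validity (recid INTEGER PRIMARY KEY, objectid INTEGER, validfrom TEXT);
CREATE TABLE dim (objectvalidityid INTEGER, validfrom TEXT, dimension2 TEXT);
INSERT INTO obj VALUES (1), (2), (3);
-- object 1 changed once, so it has two validity rows
INSERT INTO validity VALUES (10, 1, '2020-01-01'), (11, 1, '2021-01-01'),
                            (20, 2, '2020-01-01'), (30, 3, '2020-01-01');
-- the validity row of object 3 has no dimension rows at all
INSERT INTO dim VALUES (11, '2021-01-01', 'A'), (11, '2021-06-01', 'B'),
                       (20, '2020-01-01', 'C');
""")

# "Latest row per key" subqueries, playing the role of aliases B and C above.
LATEST_VALIDITY = "(SELECT objectid, MAX(validfrom) AS mx FROM validity GROUP BY objectid)"
LATEST_DIM = "(SELECT objectvalidityid, MAX(validfrom) AS mx FROM dim GROUP BY objectvalidityid)"

# Flat shape: each LEFT JOIN is followed by a top-level INNER JOIN,
# which removes the rows the LEFT JOIN had padded with NULLs.
flat = con.execute(f"""
SELECT o.id, v.validfrom, d.dimension2
FROM obj o
LEFT JOIN validity v ON v.objectid = o.id
INNER JOIN {LATEST_VALIDITY} b ON b.objectid = v.objectid AND b.mx = v.validfrom
LEFT JOIN dim d ON d.objectvalidityid = v.recid
INNER JOIN {LATEST_DIM} c ON c.objectvalidityid = d.objectvalidityid AND c.mx = d.validfrom
ORDER BY o.id
""").fetchall()

# Nested shape: the INNER JOIN only filters the right-hand side of the LEFT JOIN.
nested = con.execute(f"""
SELECT o.id, v.validfrom, d.dimension2
FROM obj o
LEFT JOIN (validity v
           INNER JOIN {LATEST_VALIDITY} b ON b.objectid = v.objectid AND b.mx = v.validfrom)
       ON v.objectid = o.id
LEFT JOIN (dim d
           INNER JOIN {LATEST_DIM} c ON c.objectvalidityid = d.objectvalidityid AND c.mx = d.validfrom)
       ON d.objectvalidityid = v.recid
ORDER BY o.id
""").fetchall()

print(len(flat), flat)      # object 3 is missing
print(len(nested), nested)  # one row per object, object 3 with a NULL dimension
```

In this toy data the flat query loses object 3 because it has no dimension rows, while the nested query keeps all three objects.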
I have a query where a diag_code is either in one table (UM_SERVICE) or the other (LOS), but I can't think of a way to join both tables and get the diag_code that isn't null. Does the query below look OK for finding whether a diag_code is in one of the tables and in the lookup table? LOS and UM_SERVICE can both have a diag code, on different rows, and the codes can differ; both, or only one, could be in the lookup table. I'm not finding anything in an internet search.
Here's a simplified stored procedure:
SELECT distinct
c.id
,uc.id
,c.person_id
FROM dbo.CASES c
INNER JOIN dbo.UM_CASE uc with (NOLOCK) ON uc.case_id = c.id
LEFT JOIN dbo.UM_SERVICE sv (NOLOCK) ON sv.case_id = uc.case_id
LEFT JOIN dbo.UM_SERVICE_CERT usc on usc.service_id = sv.id
LEFT JOIN dbo.LOS S WITH (NOLOCK) ON S.case_id = UC.case_id
LEFT JOIN dbo.LOS_EXTENSION SC WITH (NOLOCK) ON SC.los_id = S.id
INNER JOIN dbo.PERSON op with (NOLOCK) on op.id = c.Person_id
WHERE
(sv.diag_code is not null and c.id = sv.case_id
or
s.diag_code is not null and c.id = s.case_id)
and
(sv.diag_code is not null and sv.diag_code in (select diag_code from TABLE_LOOKUP)
or
s.diag_code is not null and s.diag_code in (select diag_code from TABLE_LOOKUP))
The tables are set up like this (table name, then its columns):
CASES: id, person_id
UM_CASE: case_id
LOS: case_id, id
LOS_EXTENSION: los_id
Person: id, cid
UM_SERVICE: case_id, diag_code
UM_SERVICE_CERT: service_id, id
TABLE_LOOKUP: diag_code
Since you are running two different searches, the query is much easier to write and read if you write each search individually and then combine the two result sets with the UNION operator. UNION eliminates duplicates across the two result sets, much as your SELECT DISTINCT does for a single result set.
Like so:
/*first part of union performs search using filter on dbo.UM_SERVICE*/
SELECT
c.id
,uc.id
,c.person_id
FROM
dbo.CASES AS c
INNER JOIN dbo.UM_CASE AS uc ON uc.case_id=c.id
LEFT JOIN dbo.UM_SERVICE AS sv ON sv.case_id = uc.case_id
LEFT JOIN dbo.UM_SERVICE_CERT AS usc on usc.service_id=sv.id
LEFT JOIN dbo.LOS AS S ON S.case_id = UC.case_id
LEFT JOIN dbo.LOS_EXTENSION AS SC ON SC.los_id= S.id
INNER JOIN dbo.PERSON AS op on op.id=c.Person_id
WHERE
sv.diag_code in (select diag_code from TABLE_LOOKUP) /*will eliminate null values in sv.diag_code*/
UNION /*deduplicate result sets*/
/*second part of union performs search using filter on dbo.LOS*/
SELECT
c.id
,uc.id
,c.person_id
FROM
dbo.CASES AS c
INNER JOIN dbo.UM_CASE AS uc ON uc.case_id=c.id
LEFT JOIN dbo.UM_SERVICE AS sv ON sv.case_id = uc.case_id
LEFT JOIN dbo.UM_SERVICE_CERT AS usc on usc.service_id=sv.id
LEFT JOIN dbo.LOS AS S ON S.case_id = UC.case_id
LEFT JOIN dbo.LOS_EXTENSION AS SC ON SC.los_id= S.id
INNER JOIN dbo.PERSON AS op on op.id=c.Person_id
WHERE
s.diag_code in (select diag_code from TABLE_LOOKUP); /*will eliminate null values in s.diag_code*/
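As a quick sanity check of the two points made in the comments (IN drops NULL codes, UNION removes duplicates), here is a minimal sketch with Python's sqlite3 module. The tables are cut down to the columns that matter and the data is invented for illustration. Plain inner joins are used because the WHERE filter on the joined table discards the NULL-padded rows of a LEFT JOIN anyway.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cases (id INTEGER PRIMARY KEY, person_id INTEGER);
CREATE TABLE um_service (case_id INTEGER, diag_code TEXT);
CREATE TABLE los (case_id INTEGER, diag_code TEXT);
CREATE TABLE table_lookup (diag_code TEXT);
INSERT INTO cases VALUES (1, 100), (2, 200), (3, 300), (4, 400);
-- case 1 has a code in both tables, case 2 only in LOS,
-- case 3 has no code at all, case 4 has a code that is not in the lookup
INSERT INTO um_service VALUES (1, 'A10'), (2, NULL), (4, 'Z99');
INSERT INTO los VALUES (1, 'B20'), (2, 'A10'), (3, NULL);
INSERT INTO table_lookup VALUES ('A10'), ('B20');
""")

rows = con.execute("""
SELECT c.id, c.person_id
FROM cases c
JOIN um_service sv ON sv.case_id = c.id
WHERE sv.diag_code IN (SELECT diag_code FROM table_lookup)
UNION
SELECT c.id, c.person_id
FROM cases c
JOIN los s ON s.case_id = c.id
WHERE s.diag_code IN (SELECT diag_code FROM table_lookup)
ORDER BY 1
""").fetchall()

# Case 1 matches in both branches but is returned once.
print(rows)
```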
In my current data structure, my OFFENSE table doesn't have all the columns that I need. For this reason, I inner join it with several other tables and insert the result into a new table that I created, called TexasCCHPublicRecords. This is my query:
INSERT INTO TexasCCHPublicRecords (OFF_IDN, TRS_IDN, AGY_TXT, DOO_DTE, AON_COD, AOL_TXT, LDA_CODE, GOC_COD, ADN_COD, ADD_TXT, ADA_DTE, REF_TXT,
IPN_NBR, ICA_NBR, DMV_COD, TRS_CODE, TRN_CODE, PERSON_ID, FIRST_NAME, LAST_NAME, DATE_OF_BIRTH, CDN, OffenseCode, OffenseName, CDNCode, ArrestingAgency,
ArrestingAgencyORI, ProsecutionAgency, ProsecutionAgencyORI)
SELECT o.* , trs.TRS_COD as 'TRS_CODE', trn.TRN_NBR as 'TRN_CODE', p.PER_IDN as 'PERSON_ID', nam.FNA_TXT as 'FIRST_NAME', nam.LNA_TXT as 'LAST_NAME',
birth.DOB_DTE as 'DATE_OF_BIRTH', cdnCode.CDN_VAL_TXT as 'CDN', offenseCode.OFF_COD as 'OffenseCode', offenseCode.LIT_TXT as 'OffenseName',
cdnCode.CDN_VAL_COD as 'CDNCode', arrestingAgency.ATR_TXT as 'ArrestingAgency', arrestingAgency.ORI_TXT as 'ArrestingAgencyORI',
prosecutionAgency.ATR_TXT as 'ProsecutionAgency',prosecutionAgency.ORI_TXT as 'ProsecutionAgencyORI'
FROM OFFENSE o
inner join CCH_PUBLIC.dbo.TRS trs on trs.TRS_IDN = o.TRS_IDN
inner join CCH_PUBLIC.dbo.TRN trn on trn.TRN_IDN = trs.TRN_IDN
inner join CCH_PUBLIC.dbo.PERSON p on p.IND_IDN = trn .IND_IDN
inner join CCH_PUBLIC.dbo.NAME nam on nam.PER_IDN = p.PER_IDN
inner join CCH_PUBLIC.dbo.BRTHDATE birth on birth.PER_IDN = p.PER_IDN
inner join CCH_PUBLIC.dbo.PROSECUTION prose on prose.TRS_IDN = o.TRS_IDN
inner join CCH_PUBLIC.dbo.AGENCY arrestingAgency on arrestingAgency.ORI_TXT = o.AGY_TXT
inner join CCH_PUBLIC.dbo.AGENCY prosecutionAgency on prosecutionAgency.ORI_TXT = o.REF_TXT
inner join CCH_PUBLIC.dbo.CRT_STAT crtStat on crtStat.TRS_IDN = o.TRS_IDN
inner join CCH_PUBLIC.dbo.CDN_COD cdnCode on cdnCode.CDN_VAL_COD = crtStat.CDN_COD
inner join CCH_PUBLIC.dbo.OFF_CODE offenseCode on offenseCode.OFF_COD = o.AON_COD
I would expect this query to select every offense in the OFFENSE table, inner join it with the other tables, and insert it into TexasCCHPublicRecords.
However, after running this simple count:
select count(*) as 'Offense Table Record Count' FROM OFFENSE
select count(*) as 'CCHPublicRecords Table Record Count' FROM TexasCCHPublicRecords
I end with these results:
Offense Table Record Count
11372377
CCHPublicRecords Table Record Count
49666836
There are 38 million more records in my new table. How did this happen? Is my query inserting repeated instances of the same offense row?
My goal is to SELECT each OFFENSE, inner join it with the tables that are associated with it, and INSERT the result into my empty table. What am I doing wrong?
UPDATE
After reading one of the comments, I realized that one of the tables was giving me 7 rows. Now I am trying to select ONLY the FIRST result. I am trying to do this with a CROSS APPLY, but I am not sure what to place outside the CROSS APPLY. This is my SELECT query:
SELECT o.*, trs.TRS_COD as 'TRS_CODE', trn.TRN_NBR as 'TRN_CODE', p.PER_IDN as 'PERSON_ID', nam.FNA_TXT as 'FIRST_NAME', nam.LNA_TXT as 'LAST_NAME',
birth.DOB_DTE as 'DATE_OF_BIRTH', cdnCode.CDN_VAL_TXT as 'CDN', offenseCode.OFF_COD as 'OffenseCode', offenseCode.LIT_TXT as 'OffenseName',
cdnCode.CDN_VAL_COD as 'CDNCode', arrestingAgency.ATR_TXT as 'ArrestingAgency', arrestingAgency.ORI_TXT as 'ArrestingAgencyORI',
prosecutionAgency.ATR_TXT as 'ProsecutionAgency',prosecutionAgency.ORI_TXT as 'ProsecutionAgencyORI'
FROM OFFENSE o
inner join CCH_PUBLIC.dbo.TRS trs on trs.TRS_IDN = o.TRS_IDN
inner join CCH_PUBLIC.dbo.TRN trn on trn.TRN_IDN = trs.TRN_IDN
inner join CCH_PUBLIC.dbo.PERSON p on p.IND_IDN = trn.IND_IDN
inner join CCH_PUBLIC.dbo.BRTHDATE birth on birth.PER_IDN = p.PER_IDN
inner join CCH_PUBLIC.dbo.PROSECUTION prose on prose.TRS_IDN = o.TRS_IDN
inner join CCH_PUBLIC.dbo.AGENCY arrestingAgency on arrestingAgency.ORI_TXT = o.AGY_TXT
inner join CCH_PUBLIC.dbo.AGENCY prosecutionAgency on prosecutionAgency.ORI_TXT = o.REF_TXT
inner join CCH_PUBLIC.dbo.CRT_STAT crtStat on crtStat.TRS_IDN = o.TRS_IDN
inner join CCH_PUBLIC.dbo.CDN_COD cdnCode on cdnCode.CDN_VAL_COD = crtStat.CDN_COD
inner join CCH_PUBLIC.dbo.OFF_CODE offenseCode on offenseCode.OFF_COD = o.AON_COD
CROSS APPLY (
SELECT TOP 1 *
FROM CCH_PUBLIC.dbo.NAME as nam
WHERE nam.PER_IDN = p.PER_IDN
) nam
Is the nam alias outside the CROSS APPLY correct?
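The row multiplication, and the effect of keeping only one matching row, can be reproduced on toy data. This is only a sketch using Python's sqlite3 module: the two tables are heavily simplified (here offense carries per_idn directly, which is an assumption made to keep the example short), and since SQLite has no CROSS APPLY, a correlated subquery with LIMIT 1 stands in for the TOP 1 lookup.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE offense (off_idn INTEGER PRIMARY KEY, per_idn INTEGER);
CREATE TABLE name (name_idn INTEGER PRIMARY KEY, per_idn INTEGER, fna_txt TEXT);
INSERT INTO offense VALUES (1, 10), (2, 20);
-- person 10 is stored under three names, person 20 under one
INSERT INTO name VALUES (1, 10, 'Bob'), (2, 10, 'Robert'), (3, 10, 'Rob'), (4, 20, 'Ann');
""")

offenses = con.execute("SELECT COUNT(*) FROM offense").fetchone()[0]

# Plain inner join: one output row per (offense, name) pair.
joined = con.execute("""
SELECT COUNT(*)
FROM offense o
JOIN name n ON n.per_idn = o.per_idn
""").fetchone()[0]

# One name per offense: the correlated subquery picks a single name row.
# The ORDER BY makes the choice of "first" row deterministic.
first_only = con.execute("""
SELECT o.off_idn, n.fna_txt
FROM offense o
JOIN name n ON n.name_idn = (
    SELECT n2.name_idn
    FROM name n2
    WHERE n2.per_idn = o.per_idn
    ORDER BY n2.name_idn
    LIMIT 1)
ORDER BY o.off_idn
""").fetchall()

print(offenses, joined, first_only)
```

Comparing COUNT(*) before and after adding each join, as done here for a single join, is one way to find which table multiplies the rows.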
CREATE PROCEDURE spJoin3Tables
AS
BEGIN
SELECT
tbl_Jobs.JobTitle, tbl_Company.CompName
FROM
tbl_Jobs
INNER JOIN
tbl_Company ON tbl_Jobs.CompID = tbl_Company.ID
SELECT
tbl_Cities.CityName
FROM
tbl_Cities
INNER JOIN
tbl_JobCities ON tbl_Cities.ID = tbl_JobCities.CityID
INNER JOIN
tbl_Jobs ON tbl_JobCities.JobID = tbl_Jobs.ID
END
The result is two tables. I want to get all three columns in one table - what will be the query?
You just need to add the company table and the columns from the first query to the second query and make sure to join on the company id.
SELECT
tbl_Cities.CityName, tbl_Jobs.JobTitle, tbl_Company.CompName
FROM
tbl_Cities
INNER JOIN
tbl_JobCities ON tbl_Cities.ID = tbl_JobCities.CityID
INNER JOIN
tbl_Jobs ON tbl_JobCities.JobID = tbl_Jobs.ID
INNER JOIN
tbl_Company ON tbl_Jobs.CompID = tbl_Company.ID
Using INNER JOIN you can get all the data. If any of the ID columns in these tables can contain NULL values, use LEFT JOIN instead.
SELECT tbl_Jobs.JobTitle, tbl_Company.CompName , tbl_Cities.CityName
FROM tbl_Jobs
INNER JOIN tbl_Company ON tbl_Jobs.CompID = tbl_Company.ID
INNER JOIN tbl_JobCities ON tbl_JobCities.JobID = tbl_Jobs.ID
INNER JOIN tbl_Cities ON tbl_Cities.ID = tbl_JobCities.CityID
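To see when the choice between INNER JOIN and LEFT JOIN matters here, consider a job that has no city assigned. The sketch below uses Python's sqlite3 module with the table names from the question; the rows are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tbl_Company (ID INTEGER PRIMARY KEY, CompName TEXT);
CREATE TABLE tbl_Jobs (ID INTEGER PRIMARY KEY, JobTitle TEXT, CompID INTEGER);
CREATE TABLE tbl_Cities (ID INTEGER PRIMARY KEY, CityName TEXT);
CREATE TABLE tbl_JobCities (JobID INTEGER, CityID INTEGER);
INSERT INTO tbl_Company VALUES (1, 'Acme');
INSERT INTO tbl_Jobs VALUES (1, 'Developer', 1), (2, 'Tester', 1);
INSERT INTO tbl_Cities VALUES (1, 'Oslo'), (2, 'Bergen');
-- only the Developer job is linked to cities
INSERT INTO tbl_JobCities VALUES (1, 1), (1, 2);
""")

# Same query twice; only the join type towards the city tables changes.
QUERY = """
SELECT j.JobTitle, co.CompName, ci.CityName
FROM tbl_Jobs j
INNER JOIN tbl_Company co ON j.CompID = co.ID
{kind} JOIN tbl_JobCities jc ON jc.JobID = j.ID
{kind} JOIN tbl_Cities ci ON ci.ID = jc.CityID
ORDER BY j.JobTitle, ci.CityName
"""

inner_rows = con.execute(QUERY.format(kind="INNER")).fetchall()
left_rows = con.execute(QUERY.format(kind="LEFT")).fetchall()

print(inner_rows)  # the Tester job is missing
print(left_rows)   # the Tester job appears with a NULL city
```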
I have a SQL query:
select distinct
Process.ReportLogProcessID as [Process.ReportLogProcessID],
Process.ProcessTitle as [Process.ProcessTitle],
CAST(User0.PrimaryEmail AS nvarchar(max)) as [Process_Contacts.IsPrimaryContact]
from
ReportProcess as Process inner join
ReportProcessContact as ReportProcessContact0 on
((ReportProcessContact0.SessionID = Process.SessionID)) left outer join
[User] as User0 on
((ReportProcessContact0.ReferenceID = User0.UserID)) left outer join
[Group] as Group0 on
((ReportProcessContact0.ReferenceID = Group0.GroupID and
ReportProcessContact0.ReferenceType = 2))
order by
[Process.ProcessTitle] asc
The result contains two rows with the same Process Title, 'Testing123'. Is there any way to return that title only once, whatever the Process Contact is?
Is there any way to make the result distinct based on one particular column?
In your query, the [Process_Contacts.IsPrimaryContact] column holds a different email in each of the two records, so DISTINCT keeps both. The records collapse into one only if you remove that column.
UPDATE: You can try something like this:
select
Process.ReportLogProcessID as [Process.ReportLogProcessID],
Process.ProcessTitle as [Process.ProcessTitle],
MAX(CAST(User0.PrimaryEmail AS nvarchar(max))) as [Process_Contacts.IsPrimaryContact]
from ReportProcess as Process
inner join ReportProcessContact as ReportProcessContact0 on
((ReportProcessContact0.SessionID = Process.SessionID))
left outer join [User] as User0 on
((ReportProcessContact0.ReferenceID = User0.UserID))
left outer join [Group] as Group0 on
((ReportProcessContact0.ReferenceID = Group0.GroupID and
ReportProcessContact0.ReferenceType = 2))
group by Process.ReportLogProcessID, Process.ProcessTitle
order by [Process.ProcessTitle] asc
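The difference between DISTINCT and GROUP BY with an aggregate can be checked on a reduced example. This is a sketch using Python's sqlite3 module; the process and contact tables and their rows are hypothetical, standing in for ReportProcess and its contacts.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE process (id INTEGER, title TEXT);
CREATE TABLE contact (process_id INTEGER, email TEXT);
INSERT INTO process VALUES (1, 'Testing123'), (2, 'Other');
-- 'Testing123' has two contacts with different emails
INSERT INTO contact VALUES (1, 'a@example.com'), (1, 'b@example.com'),
                           (2, 'c@example.com');
""")

# DISTINCT compares whole rows, so differing emails keep both rows.
distinct_rows = con.execute("""
SELECT DISTINCT p.id, p.title, c.email
FROM process p
JOIN contact c ON c.process_id = p.id
ORDER BY p.title, c.email
""").fetchall()

# GROUP BY returns one row per process; MAX picks one email per group.
grouped_rows = con.execute("""
SELECT p.id, p.title, MAX(c.email) AS email
FROM process p
JOIN contact c ON c.process_id = p.id
GROUP BY p.id, p.title
ORDER BY p.title
""").fetchall()

print(distinct_rows)
print(grouped_rows)
```

Note that MAX simply keeps the alphabetically last email of each group; if a specific contact is wanted, the choice has to be expressed some other way.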
As a schematic example, I have three tables that I want to join, A, B and C, where A is joined to B via an outer join and B could be joined to C via an inner join. In this constellation, I have to write two outer joins so that I still get data for a row when the first join finds no match between A and B:
SELECT [fields] FROM
A
LEFT OUTER JOIN
B ON [a.field]=[b.field]
LEFT OUTER JOIN
C ON [b.field]=[c.field]
It seems logical to me that I have to write the second join as an outer join. However, I'm curious whether there is a possibility to set brackets for the join scope, to signal that the second join should only be used if the first join has found matching data for A-B. Something like:
SELECT [fields] FROM
A
(LEFT OUTER JOIN
B ON [a.field]=[b.field]
INNER JOIN
C ON [b.field]=[c.field]
)
I have played around a little but have not found a way to set brackets. The only way I have found to make this work is with a subquery. Is this the only way to go?
Actually there is a syntax for that case.
SELECT fields
FROM A
LEFT OUTER JOIN (
B INNER JOIN C ON b.field = c.field
) ON a.field = b.field
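The parentheses are optional: without them, the order of the ON clauses determines the nesting. The result is not the same as that of two chained outer joins, though. The query below still returns B's columns for a row of B that has no match in C (with C's columns NULL), whereas the nested form returns NULL for both B and C in that case: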
SELECT fields
FROM A
LEFT OUTER JOIN B ON a.field = b.field
LEFT OUTER JOIN C ON b.field = c.field
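How the nested form and the chained form compare can be checked on a few rows. The following is a minimal sketch using Python's sqlite3 module; the columns bval and cval and all data are made up for illustration, and SQLite requires the parentheses around the nested join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE A (field INTEGER);
CREATE TABLE B (field INTEGER, bval TEXT);
CREATE TABLE C (field INTEGER, cval TEXT);
INSERT INTO A VALUES (1), (2), (3);
-- field 2 exists in B but not in C; field 3 exists only in A
INSERT INTO B VALUES (1, 'b1'), (2, 'b2');
INSERT INTO C VALUES (1, 'c1');
""")

# Nested: B is first reduced to rows that have a match in C.
nested = con.execute("""
SELECT A.field, B.bval, C.cval
FROM A
LEFT OUTER JOIN (B INNER JOIN C ON B.field = C.field) ON A.field = B.field
ORDER BY A.field
""").fetchall()

# Chained: B is joined regardless of C, then C is attached where it matches.
chained = con.execute("""
SELECT A.field, B.bval, C.cval
FROM A
LEFT OUTER JOIN B ON A.field = B.field
LEFT OUTER JOIN C ON B.field = C.field
ORDER BY A.field
""").fetchall()

print(nested)   # row 2 has NULL for both bval and cval
print(chained)  # row 2 keeps 'b2' and has NULL only for cval
```

Both queries return every row of A; they differ only in what is reported for B when C has no match.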
You could perform it as follows
SELECT Fields
FROM TableA a
LEFT OUTER JOIN (SELECT Fields
FROM TableB b
INNER JOIN TableC c ON b.Field = c.Field) x ON a.Field = x.Field
Not too sure if there would be any performance benefit to this without testing it out though.
The second way would be with a subquery, as in Jon Bridges' answer.
The two are semantically the same, however.
A CTE could be used if you have a complex subquery
;WITH BjoinC AS
(
SELECT Fields
FROM TableB b
INNER JOIN
TableC c ON b.Field = c.Field
)
SELECT [fields] FROM
A
LEFT OUTER JOIN
BjoinC ON ...
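A runnable version of the CTE idea, next to the equivalent derived table, might look like the sketch below. It uses Python's sqlite3 module; the columns bval and cval, the alias x and the data are assumptions made for illustration (and the leading semicolon before WITH, a T-SQL habit, is not needed here).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE A (field INTEGER);
CREATE TABLE B (field INTEGER, bval TEXT);
CREATE TABLE C (field INTEGER, cval TEXT);
INSERT INTO A VALUES (1), (2), (3);
INSERT INTO B VALUES (1, 'b1'), (2, 'b2');
INSERT INTO C VALUES (1, 'c1');
""")

# CTE: the inner join of B and C gets a name and is then outer joined to A.
cte_rows = con.execute("""
WITH BjoinC AS (
    SELECT b.field, b.bval, c.cval
    FROM B b
    INNER JOIN C c ON b.field = c.field
)
SELECT a.field, x.bval, x.cval
FROM A a
LEFT OUTER JOIN BjoinC x ON a.field = x.field
ORDER BY a.field
""").fetchall()

# Derived table: the same subquery written inline.
subquery_rows = con.execute("""
SELECT a.field, x.bval, x.cval
FROM A a
LEFT OUTER JOIN (SELECT b.field, b.bval, c.cval
                 FROM B b
                 INNER JOIN C c ON b.field = c.field) x ON a.field = x.field
ORDER BY a.field
""").fetchall()

print(cte_rows)
print(subquery_rows)
```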
What about something like:
SELECT [fields]
FROM A
LEFT JOIN ( SELECT DISTINCT [fields]
FROM B
LEFT JOIN C ON b.field = c.field
) x ON a.field = x.field