SQL Subquery Select Table based on Outer Query - sql-server

I have a general query that looks like this:
SELECT DISTINCT pb.id, pb.last, pb.first, pb.middle, pb.sex, pb.phone, pb.type,
specialties = substring((
SELECT ('|' + cs.specialty )
FROM CertSpecialty AS cs
INNER JOIN CertSpecialtyIndex AS csi on cs.specialty = csi.specialty
WHERE cs.id = pb.id
ORDER BY cs.sequence_no
FOR XML path('')),2,500)
FROM table AS pb
WHERE etc etc etc
The issue is this:
The "type" column that I'm selecting is an integer - types 1-4.
In the subquery, see where I am querying from the table CertSpecialty right now.
What I actually need to do is, if the type field comes back as a 1 or a 3, that's the table I need to query. But if the row's result is a type 2 or 4 (i.e., an ELSE), I need to be querying the same column in the table CertSpecialtyOther.
So it'd need to look something like this (though this obv doesn't work):
SELECT DISTINCT pb.id, pb.last, pb.first, pb.middle, pb.sex, pb.phone, pb.type,
specialties =
IF type in (1,3)
substring((SELECT ('|' + cs.specialty )
FROM CertSpecialty AS cs
INNER JOIN CertSpecialtyIndex AS csi on cs.specialty = csi.specialty
WHERE cs.id = pb.id
ORDER BY cs.sequence_no
FOR XML path(''),2,500)
ELSE
substring((SELECT ('|' + cs.specialty )
FROM CertSpecialtyOther AS cs
INNER JOIN CertSpecialtyIndex AS csi on cs.specialty = csi.specialty
WHERE cs.id = pb.id
ORDER BY cs.sequence_no
FOR XML path(''),2,500)
end
FROM table AS pb
WHERE etc etc etc
Is this possible? If so, what is the correct syntax? Is there a simpler way to write it where I'm switching which table I query without completely duplicating the subquery?
Also, does anyone have a good resource they could link me for this sort of thing to learn more besides?
Thanks in advance.

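Use a CTE that stacks both tables and tags each row with its source, then pick the tag with a CASE on the outer row's type: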
;WITH cs AS
(
SELECT 'A' SpecialtyCategory, phy_key, specialty
FROM CertSpecialty
UNION ALL
SELECT 'B' SpecialtyCategory, phy_key, specialty
FROM CertSpecialtyOther
)
SELECT csi.id, cs.specialty
FROM cs
INNER JOIN CertSpecialtyIndex AS csi on cs.specialty = csi.specialty
WHERE cs.phy_key = pb.phy_key -- pb is the outer query's table, so this SELECT goes where the correlated subquery was
AND cs.SpecialtyCategory = (CASE WHEN pb.type in (1,3) THEN 'A' ELSE 'B' END)
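The tagged UNION ALL idea can be tried end to end with a small sketch. This one runs on SQLite through Python's sqlite3 module rather than on SQL Server, so group_concat stands in for the FOR XML PATH concatenation, and the join column is assumed to be id as in the question (the answer calls it phy_key). The sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pb (id INTEGER, last TEXT, type INTEGER);
CREATE TABLE CertSpecialty (id INTEGER, specialty TEXT);
CREATE TABLE CertSpecialtyOther (id INTEGER, specialty TEXT);
-- Hypothetical sample data: person 1 is type 1, person 2 is type 2.
INSERT INTO pb VALUES (1, 'Adams', 1), (2, 'Baker', 2);
INSERT INTO CertSpecialty VALUES (1, 'Cardiology'), (2, 'Unused');
INSERT INTO CertSpecialtyOther VALUES (1, 'Ignored'), (2, 'Dermatology');
""")

rows = conn.execute("""
WITH cs AS (
    -- Tag every row with the table it came from.
    SELECT 'A' AS SpecialtyCategory, id, specialty FROM CertSpecialty
    UNION ALL
    SELECT 'B' AS SpecialtyCategory, id, specialty FROM CertSpecialtyOther
)
SELECT pb.id, pb.type,
       (SELECT group_concat(cs.specialty, '|')
        FROM cs
        WHERE cs.id = pb.id
          -- The outer row's type decides which tagged half is visible.
          AND cs.SpecialtyCategory =
              CASE WHEN pb.type IN (1, 3) THEN 'A' ELSE 'B' END) AS specialties
FROM pb
ORDER BY pb.id
""").fetchall()

for row in rows:
    print(row)
```

The subquery is written only once; the CASE on pb.type is the only place where the two source tables are told apart.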

Related

SQL combine two queries result into one dataset

I am trying to combine two SQL queries the first is
SELECT
EAC.Person.FirstName,
EAC.Person.Id,
EAC.Person.LastName,
EAC.Person.EmployeeId,
EAC.Person.IsDeleted,
Controller.Cards.SiteCode,
Controller.Cards.CardCode,
Controller.Cards.ActivationDate,
Controller.Cards.ExpirationDate,
Controller.Cards.Status,
EAC.[Group].Name
FROM
EAC.Person
INNER JOIN
Controller.Cards ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN
EAC.GroupPersonMap ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN
EAC.[Group] ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id
And the second one is
SELECT
IsActive, ActivationDateUTC, ExpirationDateUTC,
Sitecode + '-' + Cardcode AS Credential, 'Badge' AS Type,
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GetUTCDate()
THEN 'Pending'
WHEN ExpirationDAteUTC < GetUTCDate()
THEN 'Expired'
ELSE 'Active'
END AS Status
FROM
EAC.Credential
JOIN
EAC.WiegandCredential ON Credential.ID = WiegandCredential.CredentialId
WHERE
PersonID = '32'
Where I would like to run the second query for each user of the first query using EAC.Person.Id instead of the '32'.
I would like all the data to be returned in one Dataset so I can use it in Report Builder.
I have been fighting with this all day and am hoping one of you smart guys can give me a hand. Thanks in advance.
Based on your description in the comments, I understand that the connection between the two datasets is actually the PersonID field, which exists in both EAC.Credential and EAC.Person; however, in EAC.Credential, duplicate values exist for PersonID, and you want only the most recent one for each PersonID.
There are a few ways to do this, and it will depend on the number of rows returned, the indexes, etc., but I think maybe you're looking for something like this...?
SELECT
EAC.Person.FirstName
,EAC.Person.Id
,EAC.Person.LastName
,EAC.Person.EmployeeId
,EAC.Person.IsDeleted
,Controller.Cards.SiteCode
,Controller.Cards.CardCode
,Controller.Cards.ActivationDate
,Controller.Cards.ExpirationDate
,Controller.Cards.Status
,EAC.[Group].Name
,X.IsActive
,X.ActivationDateUTC
,X.ExpirationDateUTC
,X.Credential
,X.Type
,X.Status
FROM EAC.Person
INNER JOIN Controller.Cards
ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN EAC.GroupPersonMap
ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN EAC.[Group]
ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id
CROSS APPLY
(
SELECT TOP 1
IsActive
,ActivationDateUTC
,ExpirationDateUTC
,Sitecode + '-' + Cardcode AS Credential
,'Badge' AS Type
,'Status' =
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GETUTCDATE()
THEN 'Pending'
WHEN ExpirationDateUTC < GETUTCDATE()
THEN 'Expired'
ELSE 'Active'
END
FROM EAC.Credential
INNER JOIN EAC.WiegandCredential
ON EAC.Credential.ID = EAC.WiegandCredential.CredentialId
WHERE EAC.Credential.PersonID = EAC.Person.Id
ORDER BY EAC.Credential.ID DESC
) AS X
-- Optionally, you can also add conditions to return specific rows, i.e.:
-- WHERE EAC.Person.Id = 32
This option uses a CROSS APPLY, which means that every row of the first dataset will return additional values from the second dataset, based on the criteria that you described. In this CROSS APPLY, I'm joining the two datasets on the person's ID, which is stored in EAC.Person (the Id column used in your first query) as well as in EAC.Credential (as PersonID). I then specify that I want only the TOP 1 row for each person, with an ORDER BY specifying that we want the most recent (highest) value of ID for each PersonID.
The CROSS APPLY is aliased as "X", so in your original SELECT you now have several values prefixed with the X. alias, which just means that you're taking these fields from the second query and attaching them to your original results.
CROSS APPLY requires that a matching entry exists in both subsets of data, much like an INNER JOIN, so you'll want to check and make sure that the relevant values exist and are returned correctly.
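The "latest row per person, and only for people who have one" behaviour can be checked on a toy dataset. SQLite has no CROSS APPLY, so this sketch swaps in a plain inner join with a correlated MAX(ID) subquery, which picks the same row as TOP 1 ... ORDER BY ID DESC. The table and column names are simplified and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (Id INTEGER, FirstName TEXT);
CREATE TABLE Credential (ID INTEGER, PersonID INTEGER, IsActive INTEGER);
-- Ann has two credentials, Bob has one, Cy has none.
INSERT INTO Person VALUES (32, 'Ann'), (33, 'Bob'), (34, 'Cy');
INSERT INTO Credential VALUES (1, 32, 0), (2, 32, 1), (3, 33, 0);
""")

rows = conn.execute("""
SELECT p.Id, p.FirstName, c.ID AS CredentialId,
       CASE WHEN c.IsActive = 0 THEN 'InActive' ELSE 'Active' END AS Status
FROM Person p
INNER JOIN Credential c
        ON c.PersonID = p.Id
       -- Keep only the newest credential of each person.
       AND c.ID = (SELECT MAX(c2.ID) FROM Credential c2 WHERE c2.PersonID = p.Id)
ORDER BY p.Id
""").fetchall()

for row in rows:
    print(row)
```

Cy does not appear in the output because there is no matching credential, which mirrors how CROSS APPLY drops outer rows without a match; OUTER APPLY (or a LEFT JOIN in this sketch) would keep him with NULLs.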
I think this is pretty close to the direction you're trying to go. If not, let me know and I'll update the answer. Good luck!
Try it like this:
select Query1.*, Query2.* from (
SELECT
EAC.Person.FirstName,
EAC.Person.Id as PersonId,
EAC.Person.LastName,
EAC.Person.EmployeeId,
EAC.Person.IsDeleted,
Controller.Cards.SiteCode,
Controller.Cards.CardCode,
Controller.Cards.ActivationDate,
Controller.Cards.ExpirationDate,
Controller.Cards.Status,
EAC.[Group].Name
FROM
EAC.Person
INNER JOIN
Controller.Cards ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN
EAC.GroupPersonMap ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN
EAC.[Group] ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id)
Query1 inner join (SELECT top 100
PersonID, IsActive, ActivationDateUTC, ExpirationDateUTC,
Sitecode + '-' + Cardcode AS Credential, 'Badge' AS Type,
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GetUTCDate()
THEN 'Pending'
WHEN ExpirationDAteUTC < GetUTCDate()
THEN 'Expired'
ELSE 'Active'
END AS Status
FROM
EAC.Credential
JOIN
EAC.WiegandCredential ON Credential.ID = WiegandCredential.CredentialId
ORDER BY EAC.Credential.ID DESC) Query2 ON Query1.PersonId = Query2.PersonID
Wrap each query in a derived table (Query1 and Query2) and join the two on their PersonId values.

joining the similarities in a database

I've got 2 tables which have file paths in them.
The first table has 2 columns:
P_ID
Path_Snip
The second table also has 2 columns:
Path_def
Value
The Path_Snip is a reduced path which is related to an area of the program.
What I'd like to do is join the two tables to get a table with 4 columns:
P_ID
Path_Snip
Path_def
Value
I'd like to match the paths together so that the similar Path_Snips are joined with the similar Path_def:
Example of what I'd like the table to look like:
P_ID = 1
Path_Snip = branches/Projects/Enhancements2015Q1/Encryption
Path_def = branches/Projects/Enhancements2015Q1/Encryption/Encryption.csproj
Value = 12
As the 2 paths match I'd like to keep them together
I think you're looking for a JOIN with LIKE:
SELECT
t1.p_ID
, t1.Path_snip
, t2.Path_def
, t2.Value
FROM table1 t1
INNER JOIN table2 t2 ON t2.Path_def LIKE '%' + t1.Path_Snip + '%'
You need the LIKE operator in the join:
select * from t1
join t2 on t1.Path_Snip like t2.Path_def + '%' or t2.Path_def like t1.Path_Snip + '%'
Based on comment...
SELECT
s.p_ID
, s.Path_Snip
, d.Path_def
, d.Value
FROM tblDef d
LEFT JOIN tblSnip s ON d.Path_def LIKE s.Path_Snip + '%'
I have used LEFT join here in case you have entries in the Def table with nothing in the snip table.
Radu's question about 'best match' in the comments is a valid one. But assuming a simple scenario where the left portion of the paths matches, and assuming LEFT or RIGHT joins are not required and an inner join will work (both tables always have uniquely matching records), here is a different version using SUBSTRING:
Select T1.*, T2.* From Table1 T1, Table2 T2 Where
Substring(T2.Path_def, 1, Len(T1.Path_snip)) = T1.Path_snip
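The prefix-matching join from these answers can be tried on the example row from the question. This is a sketch on SQLite via Python's sqlite3, so string concatenation is written with || instead of the + used in T-SQL; the second row in each table is invented to show a non-match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (P_ID INTEGER, Path_Snip TEXT);
CREATE TABLE table2 (Path_def TEXT, Value INTEGER);
INSERT INTO table1 VALUES
    (1, 'branches/Projects/Enhancements2015Q1/Encryption'),
    (2, 'branches/Projects/Other');
INSERT INTO table2 VALUES
    ('branches/Projects/Enhancements2015Q1/Encryption/Encryption.csproj', 12),
    ('trunk/Unrelated/Thing.csproj', 7);
""")

rows = conn.execute("""
SELECT t1.P_ID, t1.Path_Snip, t2.Path_def, t2.Value
FROM table1 t1
-- A full path matches when it starts with the snippet.
INNER JOIN table2 t2 ON t2.Path_def LIKE t1.Path_Snip || '%'
""").fetchall()

for row in rows:
    print(row)
```

One thing to keep in mind with any LIKE-based join: % and _ inside the snippet itself are treated as wildcards, so paths containing underscores can match more loosely than intended.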

Conditional JOIN Statement SQL Server

Is it possible to do the following:
IF [a] = 1234 THEN JOIN ON TableA
ELSE JOIN ON TableB
If so, what is the correct syntax?
I think what you are asking for will work by joining the Initial table to both Option_A and Option_B using LEFT JOIN, which will produce something like this:
Initial LEFT JOIN Option_A LEFT JOIN NULL
OR
Initial LEFT JOIN NULL LEFT JOIN Option_B
Example code:
SELECT i.*, COALESCE(a.id, b.id) as Option_Id, COALESCE(a.name, b.name) as Option_Name
FROM Initial_Table i
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
Once you have done this, you 'ignore' the set of NULLS. The additional trick here is in the SELECT line, where you need to decide what to do with the NULL fields. If the Option_A and Option_B tables are similar, then you can use the COALESCE function to return the first NON NULL value (as per the example).
The other option is that you will simply have to list the Option_A fields and the Option_B fields, and let whatever is using the ResultSet to handle determining which fields to use.
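Here is a small runnable sketch of the two-LEFT-JOIN approach, using SQLite through Python's sqlite3 so it can be tried without a SQL Server instance. The table and column names follow the example code above; the rows are invented so that each option table also contains a row that the condition should filter out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Initial_Table (id INTEGER, special_value INTEGER);
CREATE TABLE Option_A_Table (id INTEGER, initial_id INTEGER, name TEXT);
CREATE TABLE Option_B_Table (id INTEGER, initial_id INTEGER, name TEXT);
INSERT INTO Initial_Table VALUES (1, 1234), (2, 9999);
INSERT INTO Option_A_Table VALUES (10, 1, 'from A'), (11, 2, 'A, filtered out');
INSERT INTO Option_B_Table VALUES (20, 1, 'B, filtered out'), (21, 2, 'from B');
""")

rows = conn.execute("""
SELECT i.id,
       COALESCE(a.id, b.id)     AS Option_Id,
       COALESCE(a.name, b.name) AS Option_Name
FROM Initial_Table i
-- Only one of the two joins can match for a given row of Initial_Table.
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
ORDER BY i.id
""").fetchall()

for row in rows:
    print(row)
```

Note that a row whose special_value is NULL satisfies neither condition, so both option sides come back NULL for it.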
This is just to add that the query can also be constructed dynamically, based on the condition.
An example is given below.
DECLARE @a INT = 1235
DECLARE @sql VARCHAR(MAX) = 'SELECT * FROM [sourceTable] S JOIN ' + IIF(@a = 1234,'[TableA] A ON A.col = S.col','[TableB] B ON B.col = S.col')
EXEC(@sql)
--Query will be
/*
SELECT * FROM [sourceTable] S JOIN [TableB] B ON B.col = S.col
*/
You can solve this with a UNION:
select tablea.a, b
from tablea
join tableb on tablea.a = tableb.a
where b = 1234
union
select tablea.a, b
from tablea
join tablec on tablec.a = tablea.a
where b <> 1234
I disagree with the solution suggesting 2 left joins. I think a table-valued function is more appropriate so you don't have all the coalescing and additional joins for each condition you would have.
CREATE FUNCTION f_GetData (
@Logic VARCHAR(50)
) RETURNS @Results TABLE (
Content VARCHAR(100)
) AS
BEGIN
IF @Logic = '1234'
INSERT @Results
SELECT Content
FROM Table_1
ELSE
INSERT @Results
SELECT Content
FROM Table_2
RETURN
END
GO
SELECT *
FROM InputTable
CROSS APPLY f_GetData(InputTable.Logic) T
I think it is better to look at your query in a different way and treat the two cases as sets.
I believe that if you write two separate queries and combine them with UNION, the result will perform better and be more readable.

TSQL optimizing code for NOT IN

I inherited an old SQL script that I want to optimize, but after several attempts I must admit that everything I tried only produces huge SQL with repetitive blocks. I would like to know if someone can propose better code for the following pattern (see code below). I don't want to use temporary tables (WITH). For simplicity, I only show 3 levels (TMP_C, TMP_D and TMP_E), but the original SQL has 8 levels.
WITH
TMP_A AS (
SELECT
ID,
Field_X
FROM A),
TMP_B AS(
SELECT DISTINCT
B.ID,
Field_Y,
CASE
WHEN Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM B
INNER JOIN TMP_A
ON TMP_A.ID=B.ID),
TMP_C AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_1'),
TMP_D AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_2' AND ID NOT IN (SELECT ID FROM TMP_C)),
TMP_E AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_3'
AND ID NOT IN (SELECT ID FROM TMP_C)
AND ID NOT IN (SELECT ID FROM TMP_D))
SELECT * FROM TMP_C
UNION
SELECT * FROM TMP_D
UNION
SELECT * FROM TMP_E
Many thanks in advance for your help.
First off, SELECT DISTINCT already removes duplicates from the result set, so you are overworking the condition. Adding the "WITH" definitions and nesting their use makes the query more confusing to follow. The data ultimately all comes from the "B" table, restricted to rows that also have a key match in "A". Let's start with just that... And since you are not using anything from (B)Field_Y or (A)Field_X in your result set, don't add them to the mix of confusion.
SELECT DISTINCT
B.ID,
CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6' )
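The WHERE clause keeps only the Field_Z values that map to the categories you want, and the result still carries the category for each ID. Note that this returns one row per distinct ID and category pair, so an ID whose rows fall into several categories appears once per category, whereas your NOT IN chain keeps only the first matching category for each ID.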
Now, if you actually needed other values from your "Field_Y" or "Field_X", then that would generate a different query. However, your Tmp_C, Tmp_D and Tmp_E are only asking for the ID and CATEG columns anyhow.
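If the priority behaviour of the original NOT IN chain matters (an ID that qualifies for CATEG_1 must not also show up under CATEG_2 or CATEG_3), one possible variation is to aggregate per ID and take the smallest category label. This is a sketch, not part of the answer above: it relies on the labels sorting in priority order as strings ('CATEG_1' < 'CATEG_2' < ...), runs on SQLite through Python's sqlite3, and uses invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (ID INTEGER, Field_X TEXT);
CREATE TABLE B (ID INTEGER, Field_Y TEXT, Field_Z TEXT);
INSERT INTO A VALUES (1, 'x'), (2, 'x'), (3, 'x'), (4, 'x');
INSERT INTO B VALUES
    (1, 'y', 'TEST_1'), (1, 'y', 'TEST_3'),   -- ID 1 hits categories 1 and 2
    (2, 'y', 'TEST_4'), (2, 'y', 'TEST_5'),   -- ID 2 hits categories 2 and 3
    (3, 'y', 'TEST_6'),                       -- ID 3 hits category 3 only
    (4, 'y', 'TEST_9');                       -- ID 4 hits no listed category
""")

rows = conn.execute("""
SELECT B.ID,
       -- MIN picks the highest-priority category reached by this ID.
       MIN(CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
                WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
                WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
           END) AS CATEG
FROM B JOIN A ON B.ID = A.ID
WHERE B.Field_Z IN ('TEST_1','TEST_2','TEST_3','TEST_4','TEST_5','TEST_6')
GROUP BY B.ID
ORDER BY B.ID
""").fetchall()

for row in rows:
    print(row)
```

Each ID comes back exactly once with its first matching category, and adding more levels only means adding WHEN branches rather than more NOT IN blocks.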
This may perform better
SELECT DISTINCT B.ID, 'CATEG_1'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2')
UNION
SELECT DISTINCT B.ID, 'CATEG_2'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_3', 'TEST_4')
...

Join subquery with min

I'm pulling my hair out over a subquery that I'm using to avoid about 100 duplicates (out of about 40k records). The records that are duplicated are showing up because they have 2 dates in h2.datecreated for a valid reason, so I can't just scrub the data.
I'm trying to get only the earliest date to return. The first subquery (that starts with "select distinct address_id", with the MIN) works fine on its own...no duplicates are returned. So it would seem that the left join (or just plain join...I've tried that too) couldn't possibly see the second h2.datecreated, since it doesn't even show up in the subquery. But when I run the whole query, it's returning 2 values for some ipc.mfgid's, one with the h2.datecreated that I want, and the other one that I don't want.
I know it's got to be something really simple, or something that just isn't possible. It really seems like it should work! This is MSSQL. Thanks!
select distinct ipc.mfgid as IPC, h2.datecreated,
case when ad.Address is null
then ad.buildingname end as Address, cast(trace.name as varchar)
+ '-' + cast(trace.Number as varchar) as ONT,
c.ACCOUNT_Id,
case when h.datecreated is not null then h.datecreated
else h2.datecreated end as Install
from equipmentjoin as ipc
left join historyjoin as h on ipc.id = h.EQUIPMENT_Id
and h.type like 'add'
left join circuitjoin as c on ipc.ADDRESS_Id = c.ADDRESS_Id
and c.GRADE_Code like '%hpna%'
join (select distinct address_id, equipment_id,
min(datecreated) as datecreated, comment
from history where comment like 'MAC: 5%' group by equipment_id, address_id, comment)
as h2 on c.address_id = h2.address_id
left join (select car.id, infport.name, carport.number, car.PCIRCUITGROUP_Id
from circuit as car (NOLOCK)
join port as carport (NOLOCK) on car.id = carport.CIRCUIT_Id
and carport.name like 'lead%'
and car.GRADE_Id = 29
join circuit as inf (NOLOCK) on car.CCIRCUITGROUP_Id = inf.PCIRCUITGROUP_Id
join port as infport (NOLOCK) on inf.id = infport.CIRCUIT_Id
and infport.name like '%olt%' )
as trace on c.ccircuitgroup_id = trace.pcircuitgroup_id
join addressjoin as ad (NOLOCK) on ipc.address_id = ad.id
The typical approach to only getting the lowest row is one of the following. You didn't bother to specify what version of SQL Server you're using, what you want to do with ties, and I have little interest to try to work this into your complex query, so I'll show you an abstract simplification for different versions.
SQL Server 2000
SELECT x.grouping_column, x.min_column, x.other_columns ...
FROM dbo.foo AS x
INNER JOIN
(
SELECT grouping_column, min_column = MIN(min_column)
FROM dbo.foo GROUP BY grouping_column
) AS y
ON x.grouping_column = y.grouping_column
AND x.min_column = y.min_column;
SQL Server 2005+
;WITH x AS
(
SELECT grouping_column, min_column, other_columns,
rn = ROW_NUMBER() OVER (PARTITION BY grouping_column ORDER BY min_column)
FROM dbo.foo
)
SELECT grouping_column, min_column, other_columns
FROM x
WHERE rn = 1;
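For the ROW_NUMBER variant, the numbering has to restart for every group (PARTITION BY grouping_column), otherwise rn = 1 matches a single row in the whole table. A minimal sketch on SQLite through Python's sqlite3 (window functions need SQLite 3.25 or later), with the column alias written as AS rn and with made-up rows in a table called foo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (grouping_column TEXT, min_column INTEGER, other_columns TEXT);
INSERT INTO foo VALUES
    ('a', 3, 'a-late'), ('a', 1, 'a-early'),
    ('b', 7, 'b-late'), ('b', 5, 'b-early');
""")

rows = conn.execute("""
WITH x AS (
    SELECT grouping_column, min_column, other_columns,
           -- Number the rows inside each group, lowest min_column first.
           ROW_NUMBER() OVER (PARTITION BY grouping_column
                              ORDER BY min_column) AS rn
    FROM foo
)
SELECT grouping_column, min_column, other_columns
FROM x
WHERE rn = 1
ORDER BY grouping_column
""").fetchall()

for row in rows:
    print(row)
```

Unlike the GROUP BY join, this returns exactly one row per group even when two rows tie on min_column; which of the tied rows wins is arbitrary unless the ORDER BY gets a tie-breaker.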
This subquery:
select distinct address_id, equipment_id,
min(datecreated) as datecreated, comment
from history where comment like 'MAC: 5%' group by equipment_id, address_id, comment
It will probably return multiple rows per address, because comment is part of the GROUP BY and is not guaranteed to be the same on every history row.
Try this instead:
CROSS APPLY (
SELECT TOP 1 H2.DateCreated, H2.Comment -- H2.Equipment_id wasn't used
FROM History H2
WHERE
H2.Comment LIKE 'MAC: 5%'
AND C.Address_ID = H2.Address_ID
ORDER BY DateCreated
) H2
Switch that to OUTER APPLY in case you want rows that don't have a matching desired history entry.
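The effect of grouping by comment is easy to reproduce. The sketch below uses SQLite through Python's sqlite3 with two invented history rows for the same address and equipment that differ only in date and comment. The first query is the grouped subquery from the question; the second takes the single earliest row for one address, with LIMIT 1 standing in for TOP 1 (SQLite has no APPLY, so the address is fixed to 1 here instead of being correlated).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE history (address_id INTEGER, equipment_id INTEGER,
                      datecreated TEXT, comment TEXT);
INSERT INTO history VALUES
    (1, 100, '2012-01-05', 'MAC: 5A-01'),
    (1, 100, '2012-03-09', 'MAC: 5A-02');
""")

# Grouping by comment keeps one row per distinct comment, so both dates survive.
grouped_with_comment = conn.execute("""
SELECT address_id, equipment_id, MIN(datecreated) AS datecreated, comment
FROM history
WHERE comment LIKE 'MAC: 5%'
GROUP BY equipment_id, address_id, comment
ORDER BY datecreated
""").fetchall()

# Ordering by date and taking the first row yields only the earliest entry.
earliest_only = conn.execute("""
SELECT address_id, datecreated, comment
FROM history
WHERE comment LIKE 'MAC: 5%' AND address_id = 1
ORDER BY datecreated
LIMIT 1
""").fetchall()

print(len(grouped_with_comment), grouped_with_comment)
print(earliest_only)
```

The grouped query returns two rows for the address, which is what produces the duplicate after the join, while the ordered query returns one.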

Resources