SQL Server: trying to insert a count of zero when a record doesn't exist

I am trying to modify the results of a query to populate a zero when a certain status doesn't exist.
In my base result I have something that looks like this:
But when a certain example doesn't appear in my table, I need a way to have a row show up with a zero for reporting needs, something like this:
I tried using a CTE to supply those values and left joining to it, but it doesn't seem to work the way I want.
WITH DummyValues AS
(
SELECT 'Yellow' AS Val
UNION ALL
SELECT 'Red'
UNION ALL
SELECT 'Gray'
)
SELECT D.Val, V.PlntCd, COUNT(UpgradeMeasure)
FROM reporting.vw_SOTAgingView V
LEFT OUTER JOIN DummyValues D ON D.Val = V.UpgradeMeasure
GROUP BY D.Val, V.PlntCd
Is there an easy fix, or am I just missing something simple?

You can use a LEFT OUTER JOIN like this to always include the statuses (I switched the order of the tables since that is usually easier to read for most people):
SELECT
D.Val,
V.PlntCd,
COALESCE(COUNT(UpgradeMeasure), 0) AS [Count]
FROM (SELECT 'Yellow' AS Val UNION ALL SELECT 'Red' UNION ALL SELECT 'Gray') D -- name the column so D.Val resolves
LEFT OUTER JOIN reporting.vw_SOTAgingView V
ON D.Val = V.UpgradeMeasure
GROUP BY D.Val, V.PlntCd
Just note that this won't return exactly the result set you want: "PlntCd" will be NULL whenever a status has no matching rows. To cover every plant, start from a complete list of plants and CROSS JOIN it to the statuses first. This might look like:
SELECT
D.Val, -- From cross-join
P.PlntCd, -- From source
COALESCE(COUNT(UpgradeMeasure), 0) AS [Count]
FROM (SELECT DISTINCT PlntCd FROM reporting.vw_SOTAgingView) P
CROSS JOIN (SELECT 'Yellow' AS Val UNION ALL SELECT 'Red' UNION ALL SELECT 'Gray') D -- name the column so D.Val resolves
LEFT OUTER JOIN reporting.vw_SOTAgingView V
ON D.Val = V.UpgradeMeasure
AND P.PlntCd = V.PlntCd -- Also join to source to prevent dupes
GROUP BY D.Val, P.PlntCd -- Use source plant code
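The plant-by-status approach above can be tried on a small sample. This is a minimal sketch, not the poster's actual view: it uses Python's built-in sqlite3 module instead of SQL Server so that it runs anywhere, and the `aging` table and its rows are hypothetical stand-ins for reporting.vw_SOTAgingView. COUNT over a column skips NULLs, so unmatched plant/status pairs come back as 0 without any COALESCE.

```python
import sqlite3

# Hypothetical stand-in for reporting.vw_SOTAgingView (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aging (PlntCd TEXT, UpgradeMeasure TEXT);
    INSERT INTO aging VALUES
        ('P1', 'Yellow'), ('P1', 'Yellow'), ('P1', 'Red'),
        ('P2', 'Gray');
""")

rows = conn.execute("""
    WITH statuses(Val) AS (
        SELECT 'Yellow' UNION ALL SELECT 'Red' UNION ALL SELECT 'Gray'
    ),
    plants AS (SELECT DISTINCT PlntCd FROM aging)
    SELECT p.PlntCd, s.Val, COUNT(v.UpgradeMeasure) AS Cnt
    FROM plants p
    CROSS JOIN statuses s               -- every plant/status pair
    LEFT JOIN aging v                   -- attach matching rows, if any
           ON v.UpgradeMeasure = s.Val
          AND v.PlntCd = p.PlntCd
    GROUP BY p.PlntCd, s.Val
    ORDER BY p.PlntCd, s.Val
""").fetchall()

for row in rows:
    print(row)
```

Every plant gets one row per status, including the combinations that never occur in the data.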

You have the join backwards.
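Start from the set that has every value and LEFT JOIN to the set that may be missing some. (Or keep your table order and use a RIGHT OUTER JOIN, although hardly anyone uses right joins.)
SELECT
*
FROM
TableWithAllData AS A -- ALL and SOME are reserved words in T-SQL, so use other aliases
LEFT JOIN TableWithSomeData AS S ON S.Id = A.Id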

Related

SQL combine two queries result into one dataset

I am trying to combine two SQL queries the first is
SELECT
EAC.Person.FirstName,
EAC.Person.Id,
EAC.Person.LastName,
EAC.Person.EmployeeId,
EAC.Person.IsDeleted,
Controller.Cards.SiteCode,
Controller.Cards.CardCode,
Controller.Cards.ActivationDate,
Controller.Cards.ExpirationDate,
Controller.Cards.Status,
EAC.[Group].Name
FROM
EAC.Person
INNER JOIN
Controller.Cards ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN
EAC.GroupPersonMap ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN
EAC.[Group] ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id
And the second one is
SELECT
IsActive, ActivationDateUTC, ExpirationDateUTC,
Sitecode + '-' + Cardcode AS Credential, 'Badge' AS Type,
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GetUTCDate()
THEN 'Pending'
WHEN ExpirationDAteUTC < GetUTCDate()
THEN 'Expired'
ELSE 'Active'
END AS Status
FROM
EAC.Credential
JOIN
EAC.WiegandCredential ON Credential.ID = WiegandCredential.CredentialId
WHERE
PersonID = '32'
Where I would like to run the second query for each user of the first query using EAC.Person.Id instead of the '32'.
I would like all the data to be returned in one Dataset so I can use it in Report Builder.
I have been fighting with this all day and am hoping one of you smart guys can give me a hand. Thanks in advance.
Based on your description in the comments, I understand that the connection between the two datasets is actually the PersonID field, which exists in both EAC.Credential and EAC.Person; however, in EAC.Credential, duplicate values exist for PersonID, and you want only the most recent one for each PersonID.
There are a few ways to do this, and it will depend on the number of rows returned, the indexes, etc., but I think maybe you're looking for something like this...?
SELECT
EAC.Person.FirstName
,EAC.Person.Id
,EAC.Person.LastName
,EAC.Person.EmployeeId
,EAC.Person.IsDeleted
,Controller.Cards.SiteCode
,Controller.Cards.CardCode
,Controller.Cards.ActivationDate
,Controller.Cards.ExpirationDate
,Controller.Cards.Status
,EAC.[Group].Name
,X.IsActive
,X.ActivationDateUTC
,X.ExpirationDateUTC
,X.Credential
,X.Type
,X.Status
FROM EAC.Person
INNER JOIN Controller.Cards
ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN EAC.GroupPersonMap
ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN EAC.[Group]
ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id
CROSS APPLY
(
SELECT TOP 1
IsActive
,ActivationDateUTC
,ExpirationDateUTC
,Sitecode + '-' + Cardcode AS Credential
,'Badge' AS Type
,'Status' =
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GETUTCDATE()
THEN 'Pending'
WHEN ExpirationDateUTC < GETUTCDATE()
THEN 'Expired'
ELSE 'Active'
END
FROM EAC.Credential
INNER JOIN EAC.WiegandCredential
ON EAC.Credential.ID = EAC.WiegandCredential.CredentialId
WHERE EAC.Credential.PersonID = EAC.Person.Id -- the Person key column is Id, as in the first query
ORDER BY EAC.Credential.ID DESC
) AS X
-- Optionally, you can also add conditions to return specific rows, i.e.:
-- WHERE EAC.Person.Id = 32
This option uses a CROSS APPLY, which evaluates the second query once for every row of the first dataset and attaches its columns to that row. Inside the CROSS APPLY, the two datasets are correlated on the person: PersonID in EAC.Credential is matched to the Id of EAC.Person from your first dataset. TOP 1 combined with the ORDER BY keeps only the most recent (highest) value of ID for each person.
The CROSS APPLY is aliased as "X", so in your original SELECT you now have several values prefixed with the X. alias, which just means that you're taking these fields from the second query and attaching them to your original results.
CROSS APPLY requires that a matching entry exists in both subsets of data, much like an INNER JOIN, so persons without a credential row are dropped. Check that the relevant values exist and are returned correctly, or use OUTER APPLY to keep those persons with NULLs in the X columns.
I think this is pretty close to the direction you're trying to go. If not, let me know and I'll update the answer. Good luck!
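To see how "latest row per person" behaves, here is a minimal sketch under stated assumptions. SQLite has no APPLY operator, so this swaps in a different technique: ROW_NUMBER() in a derived table joined on rn = 1, which returns the same rows as the TOP 1 inside CROSS APPLY for this data. The two-column tables and the `latest_credentials` helper are hypothetical simplifications, not the real EAC schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (Id INTEGER, FirstName TEXT);
    CREATE TABLE Credential (ID INTEGER, PersonID INTEGER, IsActive INTEGER);
    INSERT INTO Person VALUES (32, 'Ann'), (33, 'Bob');
    INSERT INTO Credential VALUES (1, 32, 0), (2, 32, 1);  -- Bob has no credential
""")

# Rank each person's credentials, newest (highest ID) first.
ranked = """
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY c.PersonID ORDER BY c.ID DESC) AS rn
    FROM Credential c
"""

def latest_credentials(join_kind):
    """Attach each person's newest credential using the given join type."""
    return conn.execute(f"""
        SELECT p.Id, p.FirstName, x.ID AS CredentialId, x.IsActive
        FROM Person p
        {join_kind} JOIN ({ranked}) x
               ON x.PersonID = p.Id AND x.rn = 1
        ORDER BY p.Id
    """).fetchall()

inner_rows = latest_credentials("INNER")  # drops Bob, like CROSS APPLY
outer_rows = latest_credentials("LEFT")   # keeps Bob with NULLs, like OUTER APPLY
print(inner_rows)
print(outer_rows)
```

The INNER variant loses the person without a credential, which is the behavior to watch for when choosing between CROSS APPLY and OUTER APPLY.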
Try it like this:
select Query1.*, Query2.* from (
SELECT
EAC.Person.FirstName,
EAC.Person.Id as PersonId,
EAC.Person.LastName,
EAC.Person.EmployeeId,
EAC.Person.IsDeleted,
Controller.Cards.SiteCode,
Controller.Cards.CardCode,
Controller.Cards.ActivationDate,
Controller.Cards.ExpirationDate,
Controller.Cards.Status,
EAC.[Group].Name
FROM
EAC.Person
INNER JOIN
Controller.Cards ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN
EAC.GroupPersonMap ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN
EAC.[Group] ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id)
Query1 inner join (SELECT top 100
PersonID, -- needed by the join condition below
IsActive, ActivationDateUTC, ExpirationDateUTC,
Sitecode + '-' + Cardcode AS Credential, 'Badge' AS Type,
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GetUTCDate()
THEN 'Pending'
WHEN ExpirationDAteUTC < GetUTCDate()
THEN 'Expired'
ELSE 'Active'
END AS Status
FROM
EAC.Credential
JOIN
EAC.WiegandCredential ON Credential.ID = WiegandCredential.CredentialId
ORDER BY EAC.Credential.ID DESC) Query2 ON Query1.PersonId = Query2.PersonID
Wrap each query as a derived table (Query1 and Query2) and join the two on their PersonId columns.

IN clause not working within subquery inner join

I am trying to pull a list of most recent lab values in 2015. All lab values are stored in one table, and I need both to limit the data to 2015 and to limit it to certain types of labs, so that the max date doesn't give me the most recent lab regardless of type. Although I use the IN clause, labs of other types are included. I need the last value regardless of what type of lab it is, as long as it's within the types identified in the IN clause (i.e. I don't need the last value of each type).
select distinct
t2.pat_id
,t2.pat_last_name "PatientLast"
,t2.pat_first_name "PatFirst"
,t2.birth_date
,t1.contact_date "ContactDate"
,t3.name "EncounterType"
,t4.ord_num_value "Numeric Value"
,t4.result_date
from table1 t1
inner join table2 t2 on t1.pat_id = t2.pat_id
inner join table3 t3 on t1.enc_type_c = t3.disp_enc_type_c
inner join table4 t4 on t1.pat_enc_csn_id = t4.pat_enc_csn_id
inner join
(
select
table1.pat_id
,max(table1.contact_date) as LastResult
,table4.component_id
from table1
inner join order_results on table1.pat_enc_csn_id = table4.pat_enc_csn_id
where table4.component_id in ('1526664','1558024','1004','2667', '1230000002','1564041')
and table1.contact_date between '2015-01-01' and '2015-12-31'
group by table1.pat_id, table4.component_id
) enc2 on table1.pat_id = enc2.pat_id
and table1.contact_date = enc2.LastResult
order by table2.pat_last_name, table2.pat_first_name
Your query is a bit hard to follow. But one method is to use row_number(). Something like this:
select t.*
from (select . . .,
row_number() over (partition by pat_id order by contact_date desc) as seqnum
from . . .
where . . .
) t
where seqnum = 1;
You have where conditions in the subquery that are not in the outer query, so it is hard to follow the intended logic. The use of row_number() is much simpler than a subquery, because you don't have to repeat any logic.
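As a concrete illustration of the row_number() pattern, here is a minimal sketch that runs on SQLite through Python's sqlite3 module. The single `labs` table and its rows are hypothetical; the real query joins several tables. Both filters (component type and the 2015 date range) sit inside the subquery, so the ranking only ever sees qualifying labs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labs (pat_id INTEGER, component_id TEXT,
                       contact_date TEXT, value REAL);
    INSERT INTO labs VALUES
        (1, '1004', '2015-03-01', 5.1),
        (1, '2667', '2015-06-15', 6.2),
        (1, '9999', '2015-11-30', 9.9),  -- type outside the IN list
        (2, '1004', '2014-12-31', 4.0),  -- outside 2015
        (2, '2667', '2015-02-10', 4.4);
""")

latest = conn.execute("""
    SELECT pat_id, component_id, contact_date, value
    FROM (
        SELECT l.*,
               ROW_NUMBER() OVER (PARTITION BY pat_id
                                  ORDER BY contact_date DESC) AS seqnum
        FROM labs l
        WHERE component_id IN ('1004', '2667')
          AND contact_date BETWEEN '2015-01-01' AND '2015-12-31'
    ) t
    WHERE seqnum = 1
    ORDER BY pat_id
""").fetchall()

for row in latest:
    print(row)
```

Each patient gets one row: the newest lab among the allowed types, whichever type that happens to be.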

How to join one select with another when the first one not always returns a value for specific row?

I have a complex query to retrieve some results:
EDITED QUERY (added the UNION ALL):
SELECT t.*
FROM (
SELECT
dbo.Intervencao.INT_Processo, analista,
ETS.ETS_Sigla, ATC.ATC_Sigla, PAT.PAT_Sigla, dbo.Assunto.SNT_Peso,
CASE
WHEN ETS.ETS_Sigla = 'PE' AND (PAT.PAT_Sigla = 'LIB' OR PAT.PAT_Sigla = 'LBR') THEN (0.3*SNT_Peso)
WHEN ETS.ETS_Sigla = 'CD' THEN (0.3*SNT_Peso)*0.3
ELSE SNT_Peso
END AS PESOAREA,
CASE
WHEN a.max_TEA_FimTarefa IS NULL THEN a.max_TEA_InicioTarefa
ELSE a.max_TEA_FimTarefa
END AS DATA_INICIO_TERMINO,
ROW_NUMBER() OVER (PARTITION BY ATC.ATC_Sigla, a.SRV_Id ORDER BY TEA_FimTarefa DESC) AS seqnum
FROM dbo.Tarefa AS t
INNER JOIN (
SELECT
MAX(dbo.TarefaEtapaAreaTecnica.TEA_InicioTarefa) AS max_TEA_InicioTarefa,
MAX (dbo.TarefaEtapaAreaTecnica.TEA_FimTarefa) AS max_TEA_FimTarefa,
dbo.Pessoa.PFJ_Descri AS analista, dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id
FROM dbo.TarefaEtapaAreaTecnica
LEFT JOIN dbo.Tarefa ON dbo.TarefaEtapaAreaTecnica.TRF_Id = dbo.Tarefa.TRF_Id
LEFT JOIN dbo.AreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.AreaTecnica.ATC_Id
LEFT JOIN dbo.ServicoAreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.ServicoAreaTecnica.ATC_Id
AND dbo.Tarefa.SRV_Id = dbo.ServicoAreaTecnica.SRV_Id
INNER JOIN dbo.Pessoa ON dbo.Pessoa.PFJ_Id = dbo.ServicoAreaTecnica.PFJ_Id_Analista
GROUP BY dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id, dbo.Pessoa.PFJ_Descri
) AS a ON t.SRV_Id = a.SRV_Id
INNER JOIN dbo.TarefaEtapaAreaTecnica AS TarefaEtapaAreaTecnica_1 ON
t.TRF_Id = TarefaEtapaAreaTecnica_1.TRF_Id
AND a.ATC_Id = TarefaEtapaAreaTecnica_1.ATC_Id
AND a.max_TEA_InicioTarefa = TarefaEtapaAreaTecnica_1.TEA_InicioTarefa
LEFT JOIN AreaTecnica ATC ON TarefaEtapaAreaTecnica_1.ATC_Id = ATC.ATC_Id
LEFT JOIN Etapa ETS ON TarefaEtapaAreaTecnica_1.ETS_Id = ETS.ETS_Id
LEFT JOIN ParecerTipo PAT ON TarefaEtapaAreaTecnica_1.PAT_Id = PAT.PAT_Id
LEFT OUTER JOIN dbo.Servico ON a.SRV_Id = dbo.Servico.SRV_Id
INNER JOIN dbo.Intervencao ON dbo.Servico.INT_Id = dbo.Intervencao.INT_Id
LEFT JOIN dbo.Assunto ON dbo.Servico.SNT_Id = dbo.Assunto.SNT_Id
) t
The result is following:
It works well. The problem is that when a row is not present in this query's results, I have been asked to show values from another table (ServicoAreaTecnica) instead, so I wrote a query against that table based on the key information from the first query. If I UNION ALL the two, I get this:
Query1 +
UNION ALL
SELECT INN.INT_Processo,
PES.PFJ_Descri,
NULL, --ETS.ETS_Sigla,
ART.ATC_Sigla,
NULL ,--PAT.PAT_Sigla,
ASS.SNT_Peso,
NULL, --PESOAREA
NULL, --DATA_INICIO_TERMINO
NULL --seqnum
FROM dbo.ServicoAreaTecnica AS SAT
INNER JOIN dbo.AreaTecnica AS ART ON ART.ATC_Id = SAT.ATC_Id
INNER JOIN dbo.Servico AS SER ON SER.SRV_Id = SAT.SRV_Id
INNER JOIN dbo.Assunto AS ASS ON ASS.SNT_Id = SER.SNT_Id
INNER JOIN dbo.Intervencao AS INN ON INN.INT_Id = SER.INT_Id
INNER JOIN dbo.Pessoa AS PES ON PES.PFJ_Id = SAT.PFJ_Id_Analista
The result is following:
So what I want to do is to remove row number 1 because row number 2 exists on the first query, I think I got it explained better this time. The result should be only row number 1, row number 2 would appear only if query 1 doesn't retrieve a row for that particular INN.INT_Processo.
Thanks!
Ok, there are two ways to reduce your record set. Given that you've already written the code to produce the table with the extra rows, it might be easiest to just add code to reduce that:
Select * from
(Select *
, Row_Number() over
(partition by IntProcesso, Analista order by ISNULL(seqnum, 0) desc) as RN
from MyResults) a
where RN = 1
This will assign row_number 1 to any rows that came from your first query, or to any rows from the second query that do not have matches in the first query, then filter out extra rows.
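The filtering step can be sketched on a tiny combined result. This is only an illustration: it runs on SQLite via Python's sqlite3 module, the `combined` table stands in for the UNION ALL output, and the `origin` column is a hypothetical marker added to show which query each surviving row came from. COALESCE plays the role of ISNULL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE combined (IntProcesso TEXT, Analista TEXT,
                           seqnum INTEGER, origin TEXT);
    INSERT INTO combined VALUES
        ('A-1', 'Ana', 1,    'query1'),
        ('A-1', 'Ana', NULL, 'fallback'),  -- duplicate of a query1 row
        ('B-2', 'Rui', NULL, 'fallback');  -- no query1 row exists
""")

kept = conn.execute("""
    SELECT IntProcesso, Analista, origin
    FROM (
        SELECT c.*,
               ROW_NUMBER() OVER (
                   PARTITION BY IntProcesso, Analista
                   ORDER BY COALESCE(seqnum, 0) DESC  -- query1 rows sort first
               ) AS rn
        FROM combined c
    ) x
    WHERE rn = 1
    ORDER BY IntProcesso
""").fetchall()

for row in kept:
    print(row)
```

Note that this keeps exactly one row per IntProcesso/Analista pair, so it fits only when the first query returns at most one row you want per pair.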
You could also use outer joins with isnull or coalesce, as others have suggested. Something like this:
Select ISNULL(a.IntProcesso, b.IntProcesso) as IntProcesso
, ISNULL(a.Analista, b.Analista) as Analista
, ISNULL(a.ETSsigla, b.ETSsigla) as ETSsigla
-- ...repeat for the rest of your columns
from Table1 a
full outer join Table2 b
on a.IntProcesso = b.IntProcesso and a.Analista = b.Analista
Your code is hard to read, because of the lengthy names of everything (and to be honest, the fact that they're in a language I don't speak also makes it a lot harder).
But how about: replacing your INNER JOINs with LEFT JOINs, adding more LEFT JOINs to draw in the alternative tables, and introducing ISNULL clauses for each variable you want in the results?
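If you do something like ... Query1 RIGHT JOIN Query2 ON ... you get every row of Query2, with NULLs in the Query1 columns wherever there is no match. Adding a WHERE clause that checks a Query1 key column for NULL then leaves only the rows in Query2 that don't appear in Query1.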
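The "rows in Query2 that are missing from Query1" idea is an anti-join. A minimal sketch, assuming two hypothetical tables `query1` and `query2` that hold the results of the two queries; it is written as a LEFT JOIN from query2, which is the mirror image of the RIGHT JOIN form, and runs on SQLite via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE query1 (IntProcesso TEXT, Analista TEXT);
    CREATE TABLE query2 (IntProcesso TEXT, Analista TEXT);
    INSERT INTO query1 VALUES ('A-1', 'Ana');
    INSERT INTO query2 VALUES ('A-1', 'Ana'), ('B-2', 'Rui');
""")

only_in_query2 = conn.execute("""
    SELECT q2.IntProcesso, q2.Analista
    FROM query2 q2
    LEFT JOIN query1 q1
           ON q1.IntProcesso = q2.IntProcesso
          AND q1.Analista = q2.Analista
    WHERE q1.IntProcesso IS NULL   -- keep only the unmatched query2 rows
""").fetchall()

print(only_in_query2)
```

Appending these rows to the first query's output with UNION ALL would give the fallback behavior without a later de-duplication pass.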

TSQL optimizing code for NOT IN

I inherited an old SQL script that I want to optimize, but after several attempts I must admit that all my tests only create huge SQL with repetitive blocks. I would like to know if someone can propose better code for the following pattern (see code below). I don't want to use temporary tables (WITH). For simplicity, I only show 3 levels (tables TMP_C, TMP_D and TMP_E), but the original SQL has 8 levels.
WITH
TMP_A AS (
SELECT
ID,
Field_X
FROM A),
TMP_B AS(
SELECT DISTINCT
B.ID, -- qualified, since both B and TMP_A expose an ID column
Field_Y,
CASE
WHEN Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM B
INNER JOIN TMP_A
ON TMP_A.ID=B.ID), -- a CTE cannot reference itself here; join on B
TMP_C AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_1'),
TMP_D AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_2' AND ID NOT IN (SELECT ID FROM TMP_C)),
TMP_E AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_3'
AND ID NOT IN (SELECT ID FROM TMP_C)
AND ID NOT IN (SELECT ID FROM TMP_D))
SELECT * FROM TMP_C
UNION
SELECT * FROM TMP_D
UNION
SELECT * FROM TMP_E
Many thanks in advance for your help.
First off, SELECT DISTINCT already removes duplicates from the result set, so you are overworking the conditions. Adding the "WITH" definitions and nesting their use makes the query more confusing to follow. The data ultimately all comes from the "B" table, restricted to rows that also have a key match in "A". Let's start with just that. And since you are not using (B)Field_Y or (A)Field_X in your result set, leave them out of the mix.
SELECT DISTINCT
B.ID,
CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6' )
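The WHERE clause includes only the Field_Z values that map to the categories you want, and you still get the category for each row. Note that an ID with values in more than one category comes back once per category here, whereas the original NOT IN chain keeps only the highest-priority category for each ID.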
Now, if you actually needed other values from your "Field_Y" or "Field_X", then that would generate a different query. However, your Tmp_C, Tmp_D and Tmp_E are only asking for the ID and CATEG columns anyhow.
This may perform better
SELECT DISTINCT B.ID, 'CATEG_1'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2')
UNION
SELECT DISTINCT B.ID, 'CATEG_2'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_3', 'TEST_4')
...
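If the NOT IN chain is there to give each ID only its highest-priority category, one possible way to get that without repeating blocks is to aggregate per ID. The sketch below is an assumption-laden illustration, not the original script: it runs on SQLite via Python's sqlite3 module, the sample rows are made up, and it relies on the category names sorting in priority order (CATEG_1 before CATEG_2 and so on) so that MIN picks the winner. With names that do not sort that way, a numeric rank would be needed instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER);
    CREATE TABLE B (ID INTEGER, Field_Z TEXT);
    INSERT INTO A VALUES (1), (2), (3), (4);
    INSERT INTO B VALUES
        (1, 'TEST_1'), (1, 'TEST_3'),  -- CATEG_1 and CATEG_2: keep CATEG_1
        (2, 'TEST_4'), (2, 'TEST_5'),  -- CATEG_2 and CATEG_3: keep CATEG_2
        (3, 'TEST_6'),                 -- CATEG_3 only
        (4, 'OTHER');                  -- CATEG_4: never selected originally
""")

rows = conn.execute("""
    SELECT B.ID,
           MIN(CASE WHEN B.Field_Z IN ('TEST_1', 'TEST_2') THEN 'CATEG_1'
                    WHEN B.Field_Z IN ('TEST_3', 'TEST_4') THEN 'CATEG_2'
                    ELSE 'CATEG_3'
               END) AS CATEG           -- lowest name = highest priority
    FROM B
    JOIN A ON A.ID = B.ID
    WHERE B.Field_Z IN ('TEST_1', 'TEST_2', 'TEST_3',
                        'TEST_4', 'TEST_5', 'TEST_6')
    GROUP BY B.ID
    ORDER BY B.ID
""").fetchall()

for row in rows:
    print(row)
```

Adding a level then means one more WHEN branch and a longer IN list rather than another CTE with a growing stack of NOT IN conditions.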

Join subquery with min

I'm pulling my hair out over a subquery that I'm using to avoid about 100 duplicates (out of about 40k records). The records that are duplicated are showing up because they have 2 dates in h2.datecreated for a valid reason, so I can't just scrub the data.
I'm trying to get only the earliest date to return. The first subquery (that starts with "select distinct address_id", with the MIN) works fine on its own...no duplicates are returned. So it would seem that the left join (or just plain join...I've tried that too) couldn't possibly see the second h2.datecreated, since it doesn't even show up in the subquery. But when I run the whole query, it's returning 2 values for some ipc.mfgid's, one with the h2.datecreated that I want, and the other one that I don't want.
I know it's got to be something really simple, or something that just isn't possible. It really seems like it should work! This is MSSQL. Thanks!
select distinct ipc.mfgid as IPC, h2.datecreated,
case when ad.Address is null
then ad.buildingname end as Address, cast(trace.name as varchar)
+ '-' + cast(trace.Number as varchar) as ONT,
c.ACCOUNT_Id,
case when h.datecreated is not null then h.datecreated
else h2.datecreated end as Install
from equipmentjoin as ipc
left join historyjoin as h on ipc.id = h.EQUIPMENT_Id
and h.type like 'add'
left join circuitjoin as c on ipc.ADDRESS_Id = c.ADDRESS_Id
and c.GRADE_Code like '%hpna%'
join (select distinct address_id, equipment_id,
min(datecreated) as datecreated, comment
from history where comment like 'MAC: 5%' group by equipment_id, address_id, comment)
as h2 on c.address_id = h2.address_id
left join (select car.id, infport.name, carport.number, car.PCIRCUITGROUP_Id
from circuit as car (NOLOCK)
join port as carport (NOLOCK) on car.id = carport.CIRCUIT_Id
and carport.name like 'lead%'
and car.GRADE_Id = 29
join circuit as inf (NOLOCK) on car.CCIRCUITGROUP_Id = inf.PCIRCUITGROUP_Id
join port as infport (NOLOCK) on inf.id = infport.CIRCUIT_Id
and infport.name like '%olt%' )
as trace on c.ccircuitgroup_id = trace.pcircuitgroup_id
join addressjoin as ad (NOLOCK) on ipc.address_id = ad.id
The typical approach to only getting the lowest row is one of the following. You didn't bother to specify what version of SQL Server you're using, what you want to do with ties, and I have little interest to try to work this into your complex query, so I'll show you an abstract simplification for different versions.
SQL Server 2000
SELECT x.grouping_column, x.min_column, x.other_columns ...
FROM dbo.foo AS x
INNER JOIN
(
SELECT grouping_column, min_column = MIN(min_column)
FROM dbo.foo GROUP BY grouping_column
) AS y
ON x.grouping_column = y.grouping_column
AND x.min_column = y.min_column;
SQL Server 2005+
;WITH x AS
(
SELECT grouping_column, min_column, other_columns,
rn = ROW_NUMBER() OVER (PARTITION BY grouping_column ORDER BY min_column) -- restart numbering for each group
FROM dbo.foo
)
SELECT grouping_column, min_column, other_columns
FROM x
WHERE rn = 1;
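Both patterns can be compared side by side on sample data. This is a minimal sketch on SQLite via Python's sqlite3 module, with a hypothetical `foo` table and made-up rows. The ROW_NUMBER version includes PARTITION BY grouping_column, which is what makes it return one row per group rather than one row overall.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (grouping_column TEXT, min_column TEXT,
                      other_columns TEXT);
    INSERT INTO foo VALUES
        ('addr-1', '2012-01-05', 'first'),
        ('addr-1', '2012-03-09', 'second'),
        ('addr-2', '2012-02-01', 'only');
""")

# Pattern 1: join back to the per-group minimum (ties return several rows).
via_join = conn.execute("""
    SELECT x.grouping_column, x.min_column, x.other_columns
    FROM foo AS x
    JOIN (SELECT grouping_column, MIN(min_column) AS min_column
          FROM foo
          GROUP BY grouping_column) AS y
      ON x.grouping_column = y.grouping_column
     AND x.min_column = y.min_column
    ORDER BY x.grouping_column
""").fetchall()

# Pattern 2: number the rows within each group and keep the first one.
via_row_number = conn.execute("""
    WITH x AS (
        SELECT grouping_column, min_column, other_columns,
               ROW_NUMBER() OVER (PARTITION BY grouping_column
                                  ORDER BY min_column) AS rn
        FROM foo
    )
    SELECT grouping_column, min_column, other_columns
    FROM x
    WHERE rn = 1
    ORDER BY grouping_column
""").fetchall()

print(via_join)
print(via_row_number)
```

The two differ only on ties: the join returns every row that shares the minimum, while ROW_NUMBER picks a single one of them.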
This subquery:
select distinct address_id, equipment_id,
min(datecreated) as datecreated, comment
from history where comment like 'MAC: 5%' group by equipment_id, address_id, comment
will probably return multiple rows per address, because it groups by comment (and equipment_id) and the comment is not guaranteed to be the same.
Try this instead:
CROSS APPLY (
SELECT TOP 1 H2.DateCreated, H2.Comment -- H2.Equipment_id wasn't used
FROM History H2
WHERE
H2.Comment LIKE 'MAC: 5%'
AND C.Address_ID = H2.Address_ID
ORDER BY DateCreated
) H2
Switch that to OUTER APPLY in case you want rows that don't have a matching desired history entry.
