Get specific columns from join without join syntax? - sql-server

Is there another way to write this?
SELECT src.ID, factDeviceBuild.ID
FROM #factDeviceBuild as src
INNER JOIN AppsFlyer.FactDeviceBuild AS factDeviceBuild
ON src.[DimDevice_Id] = factDeviceBuild.[DimDevice_Id] AND
src.[DimDeviceModel_Id] = factDeviceBuild.[DimDeviceModel_Id] AND
src.[DimPlatform_Id] = factDeviceBuild.[DimPlatform_Id] AND
src.[DimOSVersion_Id] = factDeviceBuild.[DimOSVersion_Id] AND
src.[DimSDKVersion_Id] = factDeviceBuild.[DimSDKVersion_Id] AND
src.[DimCarrier_Id] = factDeviceBuild.[DimCarrier_Id] AND
src.[DimOperator_Id] = factDeviceBuild.[DimOperator_Id]
I've been trying some different things that don't work, like this:
SELECT *, factDeviceBuild.ID
FROM #factDeviceBuild
WHERE EXISTS (
SELECT [DimDevice_Id], [DimDeviceModel_Id], [DimPlatform_Id],
[DimOSVersion_Id], [DimSDKVersion_Id], [DimCarrier_Id],
[DimOperator_Id]
FROM AppsFlyer.FactDeviceBuild AS factDeviceBuild
)
or like this (also doesn't work):
SELECT factDeviceBuild.ID,
factDeviceBuild.[ID]
FROM (
SELECT [DimDevice_Id], [DimDeviceModel_Id], [DimPlatform_Id],
[DimOSVersion_Id], [DimSDKVersion_Id], [DimCarrier_Id],
[DimOperator_Id]
FROM AppsFlyer.FactDeviceBuild AS factDeviceBuild
INTERSECT
SELECT [DimDevice_Id], [DimDeviceModel_Id], [DimPlatform_Id],
[DimOSVersion_Id], [DimSDKVersion_Id], [DimCarrier_Id],
[DimOperator_Id]
FROM AppsFlyer.#factDeviceBuild AS factDeviceBuild
) AS A
I'm just playing around with some query tuning. EXCEPT and INTERSECT are particularly interesting because of the way they treat NULLs.
Obviously I could use a CROSS JOIN or OUTER JOIN to construct my INNER JOIN from scratch, but I don't see a particular gain there.

I believe you are looking for something like this:
SELECT src.ID, fact.ID
FROM #factDeviceBuild as src
INNER JOIN AppsFlyer.FactDeviceBuild AS fact
ON EXISTS (
SELECT src.DimDevice_Id, src.DimDeviceModel_Id, src.DimPlatform_Id,
src.DimOSVersion_Id, src.DimSDKVersion_Id, src.DimCarrier_Id,
src.DimOperator_Id
INTERSECT
SELECT fact.DimDevice_Id, fact.DimDeviceModel_Id, fact.DimPlatform_Id,
fact.DimOSVersion_Id, fact.DimSDKVersion_Id, fact.DimCarrier_Id,
fact.DimOperator_Id
)
Using this INTERSECT syntax (instead of the usual equality conditions) has the advantage of treating NULLs as equal to each other. For example, if only the DimCarrier_Id and DimOperator_Id columns allowed NULLs, the equivalent condition would need to be:
SELECT src.ID, fact.ID
FROM #factDeviceBuild as src
INNER JOIN AppsFlyer.FactDeviceBuild AS fact
ON src.DimDevice_Id = fact.DimDevice_Id AND
src.DimDeviceModel_Id = fact.DimDeviceModel_Id AND
src.DimPlatform_Id = fact.DimPlatform_Id AND
src.DimOSVersion_Id = fact.DimOSVersion_Id AND
src.DimSDKVersion_Id = fact.DimSDKVersion_Id AND
(src.DimCarrier_Id = fact.DimCarrier_Id OR src.DimCarrier_Id IS NULL AND fact.DimCarrier_Id IS NULL) AND
(src.DimOperator_Id = fact.DimOperator_Id OR src.DimOperator_Id IS NULL AND fact.DimOperator_Id IS NULL)
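A quick way to see the difference is the sketch below. It is written in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption, made only so the snippet runs anywhere), and the tables are cut-down, hypothetical versions of the ones above with just two key columns. SQLite's INTERSECT also treats NULLs as equal, so the row with a NULL carrier only matches in the INTERSECT form:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; src/fact are hypothetical,
# reduced versions of #factDeviceBuild and AppsFlyer.FactDeviceBuild.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src  (id INTEGER, device_id INTEGER, carrier_id INTEGER);
CREATE TABLE fact (id INTEGER, device_id INTEGER, carrier_id INTEGER);
INSERT INTO src  VALUES (1, 10, 100), (2, 20, NULL);
INSERT INTO fact VALUES (11, 10, 100), (12, 20, NULL);
""")

# Plain equality: NULL = NULL is not true, so the second pair is dropped.
equality_join = conn.execute("""
SELECT src.id, fact.id
FROM src
JOIN fact ON src.device_id = fact.device_id
         AND src.carrier_id = fact.carrier_id
ORDER BY src.id
""").fetchall()

# INTERSECT inside EXISTS: NULLs compare as equal, so both pairs match.
intersect_join = conn.execute("""
SELECT src.id, fact.id
FROM src
JOIN fact ON EXISTS (
    SELECT src.device_id, src.carrier_id
    INTERSECT
    SELECT fact.device_id, fact.carrier_id
)
ORDER BY src.id
""").fetchall()

print(equality_join)
print(intersect_join)
```

The equality join returns one pair and the INTERSECT join returns two. Whether the optimizer handles the INTERSECT form as well as plain equality on your data is something to check in the execution plan.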

The following is equivalent (an old-style comma join with the conditions moved to the WHERE clause):
SELECT src.ID, factDeviceBuild.ID
FROM #factDeviceBuild as src, AppsFlyer.FactDeviceBuild AS factDeviceBuild
WHERE
src.[DimDevice_Id] = factDeviceBuild.[DimDevice_Id] AND
src.[DimDeviceModel_Id] = factDeviceBuild.[DimDeviceModel_Id] AND
src.[DimPlatform_Id] = factDeviceBuild.[DimPlatform_Id] AND
src.[DimOSVersion_Id] = factDeviceBuild.[DimOSVersion_Id] AND
src.[DimSDKVersion_Id] = factDeviceBuild.[DimSDKVersion_Id] AND
src.[DimCarrier_Id] = factDeviceBuild.[DimCarrier_Id] AND
src.[DimOperator_Id] = factDeviceBuild.[DimOperator_Id]

Without either data or a visualization of the expected result, my guess is you need to "unpivot" the 7 id types into fewer columns, which reduces the complexity of the join syntax, e.g.:
select
src.id, f.fact_id, ca.id_type, ca.id_value
from #factDeviceBuild as src
cross apply (
values
('DimDevice_Id',src.[DimDevice_Id])
,('DimDeviceModel_Id',src.[DimDeviceModel_Id])
,('DimPlatform_Id',src.[DimPlatform_Id])
,('DimOSVersion_Id',src.[DimOSVersion_Id])
,('DimSDKVersion_Id',src.[DimSDKVersion_Id])
,('DimCarrier_Id',src.[DimCarrier_Id])
,('DimOperator_Id',src.[DimOperator_Id])
) ca (id_type, id_value)
inner join (
select
fact.id fact_id, ca.id_type, ca.id_value
from AppsFlyer.FactDeviceBuild AS fact
cross apply (
values
('DimDevice_Id',fact.[DimDevice_Id])
,('DimDeviceModel_Id',fact.[DimDeviceModel_Id])
,('DimPlatform_Id',fact.[DimPlatform_Id])
,('DimOSVersion_Id',fact.[DimOSVersion_Id])
,('DimSDKVersion_Id',fact.[DimSDKVersion_Id])
,('DimCarrier_Id',fact.[DimCarrier_Id])
,('DimOperator_Id',fact.[DimOperator_Id])
) ca (id_type, id_value)
where ca.id_value IS NOT NULL
) as f on ca.id_type = f.id_type and ca.id_value = f.id_value
Note I have not used the "unpivot" feature of TSQL as I prefer the syntax you see above. There is NO additional performance disadvantage when using this apply/values syntax.
NB: all 7 of those id type columns must be "compatible" data types for the "unpivot" to work without error. All 7 as integer for example, which would make the id_value a column of integers.

Related

How to use this without a subquery? I need a join that returns the same result set

How can I rewrite the query below as a join instead of a subquery? It is performing poorly.
SELECT EBIJ.* FROM BUDLINEITEMS EBIJ
WHERE ReferenceId NOT IN (SELECT ImportKeyId FROM External_Blk_Itm_JounalEntries)
SELECT EBIJ.*
FROM BUDLINEITEMS EBIJ
LEFT JOIN External_Blk_Itm_JounalEntries E
ON EBIJ.ReferenceId = E.ImportKeyId
WHERE E.ImportKeyId IS NULL
or
SELECT EBIJ.* FROM BUDLINEITEMS EBIJ
WHERE NOT EXISTS (SELECT 1
FROM External_Blk_Itm_JounalEntries E
WHERE EBIJ.ReferenceId = E.ImportKeyId )
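One caveat when swapping between these forms: they only return the same rows if ImportKeyId never contains NULL. A minimal sketch of the difference, in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption), using hypothetical, shortened table names:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; items/entries are hypothetical
# stand-ins for BUDLINEITEMS and External_Blk_Itm_JounalEntries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items   (ReferenceId INTEGER);
CREATE TABLE entries (ImportKeyId INTEGER);
INSERT INTO items   VALUES (1), (2), (3);
INSERT INTO entries VALUES (1), (NULL);
""")

# NOT IN: one NULL in the subquery makes the test unknown for every
# non-matching row, so nothing is returned.
not_in = conn.execute("""
SELECT ReferenceId FROM items
WHERE ReferenceId NOT IN (SELECT ImportKeyId FROM entries)
ORDER BY ReferenceId
""").fetchall()

# LEFT JOIN ... IS NULL keeps the rows that found no partner.
left_join = conn.execute("""
SELECT i.ReferenceId
FROM items i
LEFT JOIN entries e ON i.ReferenceId = e.ImportKeyId
WHERE e.ImportKeyId IS NULL
ORDER BY i.ReferenceId
""").fetchall()

# NOT EXISTS behaves like the LEFT JOIN form.
not_exists = conn.execute("""
SELECT i.ReferenceId FROM items i
WHERE NOT EXISTS (SELECT 1 FROM entries e
                  WHERE i.ReferenceId = e.ImportKeyId)
ORDER BY i.ReferenceId
""").fetchall()

print(not_in, left_join, not_exists)
```

If ImportKeyId is declared NOT NULL, all three forms agree and the choice comes down to the execution plan.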

How to join one select with another when the first one not always returns a value for specific row?

I have a complex query to retrieve some results:
EDITED QUERY (added the UNION ALL):
SELECT t.*
FROM (
SELECT
dbo.Intervencao.INT_Processo, analista,
ETS.ETS_Sigla, ATC.ATC_Sigla, PAT.PAT_Sigla, dbo.Assunto.SNT_Peso,
CASE
WHEN ETS.ETS_Sigla = 'PE' AND (PAT.PAT_Sigla = 'LIB' OR PAT.PAT_Sigla = 'LBR') THEN (0.3*SNT_Peso)
WHEN ETS.ETS_Sigla = 'CD' THEN (0.3*SNT_Peso)*0.3
ELSE SNT_Peso
END AS PESOAREA,
CASE
WHEN a.max_TEA_FimTarefa IS NULL THEN a.max_TEA_InicioTarefa
ELSE a.max_TEA_FimTarefa
END AS DATA_INICIO_TERMINO,
ROW_NUMBER() OVER (PARTITION BY ATC.ATC_Sigla, a.SRV_Id ORDER BY TEA_FimTarefa DESC) AS seqnum
FROM dbo.Tarefa AS t
INNER JOIN (
SELECT
MAX(dbo.TarefaEtapaAreaTecnica.TEA_InicioTarefa) AS max_TEA_InicioTarefa,
MAX (dbo.TarefaEtapaAreaTecnica.TEA_FimTarefa) AS max_TEA_FimTarefa,
dbo.Pessoa.PFJ_Descri AS analista, dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id
FROM dbo.TarefaEtapaAreaTecnica
LEFT JOIN dbo.Tarefa ON dbo.TarefaEtapaAreaTecnica.TRF_Id = dbo.Tarefa.TRF_Id
LEFT JOIN dbo.AreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.AreaTecnica.ATC_Id
LEFT JOIN dbo.ServicoAreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.ServicoAreaTecnica.ATC_Id
AND dbo.Tarefa.SRV_Id = dbo.ServicoAreaTecnica.SRV_Id
INNER JOIN dbo.Pessoa ON dbo.Pessoa.PFJ_Id = dbo.ServicoAreaTecnica.PFJ_Id_Analista
GROUP BY dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id, dbo.Pessoa.PFJ_Descri
) AS a ON t.SRV_Id = a.SRV_Id
INNER JOIN dbo.TarefaEtapaAreaTecnica AS TarefaEtapaAreaTecnica_1 ON
t.TRF_Id = TarefaEtapaAreaTecnica_1.TRF_Id
AND a.ATC_Id = TarefaEtapaAreaTecnica_1.ATC_Id
AND a.max_TEA_InicioTarefa = TarefaEtapaAreaTecnica_1.TEA_InicioTarefa
LEFT JOIN AreaTecnica ATC ON TarefaEtapaAreaTecnica_1.ATC_Id = ATC.ATC_Id
LEFT JOIN Etapa ETS ON TarefaEtapaAreaTecnica_1.ETS_Id = ETS.ETS_Id
LEFT JOIN ParecerTipo PAT ON TarefaEtapaAreaTecnica_1.PAT_Id = PAT.PAT_Id
LEFT OUTER JOIN dbo.Servico ON a.SRV_Id = dbo.Servico.SRV_Id
INNER JOIN dbo.Intervencao ON dbo.Servico.INT_Id = dbo.Intervencao.INT_Id
LEFT JOIN dbo.Assunto ON dbo.Servico.SNT_Id = dbo.Assunto.SNT_Id
) t
The result is as follows:
It works well. The problem is that I was asked to make sure that when a row is not present in this query's result, it is filled with values from another table (ServicoAreaTecnica). So I wrote a query against that other table, based on the key information from the first query. If I UNION ALL the two, I get this:
Query1 +
UNION ALL
SELECT INN.INT_Processo,
PES.PFJ_Descri,
NULL, --ETS.ETS_Sigla,
ART.ATC_Sigla,
NULL ,--PAT.PAT_Sigla,
ASS.SNT_Peso,
NULL, --PESOAREA
NULL, --DATA_INICIO_TERMINO
NULL --seqnum
FROM dbo.ServicoAreaTecnica AS SAT
INNER JOIN dbo.AreaTecnica AS ART ON ART.ATC_Id = SAT.ATC_Id
INNER JOIN dbo.Servico AS SER ON SER.SRV_Id = SAT.SRV_Id
INNER JOIN dbo.Assunto AS ASS ON ASS.SNT_Id = SER.SNT_Id
INNER JOIN dbo.Intervencao AS INN ON INN.INT_Id = SER.INT_Id
INNER JOIN dbo.Pessoa AS PES ON PES.PFJ_Id = SAT.PFJ_Id_Analista
The result is as follows:
So what I want to do is remove row number 2, because row number 1 already comes from the first query. I hope I have explained it better this time. The result should be only row number 1; row number 2 should appear only if query 1 doesn't retrieve a row for that particular INN.INT_Processo.
Thanks!
Ok, there are two ways to reduce your record set. Given that you've already written the code to produce the table with the extra rows, it might be easiest to just add code to reduce that:
Select * from
(Select *
, Row_Number() over
(partition by IntProcesso, Analista order by ISNULL(seqnum, 0) desc) as RN
from MyResults) a
where RN = 1
This will assign row_number 1 to any rows that came from your first query, or to any rows from the second query that do not have matches in the first query, then filter out extra rows.
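To make the ranking idea concrete, here is a small sketch in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption; it needs an SQLite build with window functions, 3.25 or later). The my_results table and its three columns are hypothetical stand-ins for the UNION ALL output, and COALESCE replaces ISNULL because SQLite has no ISNULL function:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; my_results is a hypothetical
# table holding the UNION ALL output (seqnum is NULL for second-query rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_results (processo TEXT, analista TEXT, seqnum INTEGER);
INSERT INTO my_results VALUES
    ('P1', 'Ana', 1),     -- from the first query
    ('P1', 'Ana', NULL),  -- duplicate coming from the second query
    ('P2', 'Bia', NULL);  -- only present in the second query
""")

# Rank rows per (processo, analista); rows with a seqnum sort first,
# so a second-query row survives only when it has no first-query partner.
deduped = conn.execute("""
SELECT processo, analista, seqnum
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY processo, analista
               ORDER BY COALESCE(seqnum, 0) DESC
           ) AS rn
    FROM my_results
) AS ranked
WHERE rn = 1
ORDER BY processo
""").fetchall()

print(deduped)
```

P1 keeps its first-query row and P2 keeps its only row, which came from the second query.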
You could also use outer joins with isnull or coalesce, as others have suggested. Something like this:
Select ISNULL(a.IntProcesso, b.IntProcesso) as IntProcesso
, ISNULL(a.Analista, b.Analista) as Analista
, ISNULL(a.ETSsigla, b.ETSsigla) as ETSsigla
[repeat for the rest of your columns]
from Table1 a
full outer join Table2 b
on a.IntProcesso = b.IntProcesso and a.Analista = b.Analista
Your code is hard to read, because of the lengthy names of everything (and to be honest, the fact that they're in a language I don't speak also makes it a lot harder).
But how about: replacing your INNER JOINs with LEFT JOINs, adding more LEFT JOINs to draw in the alternative tables, and introducing ISNULL clauses for each variable you want in the results?
If you do something like ... Query1 RIGHT JOIN Query2 ON ... and then filter with WHERE Query1.<key> IS NULL, that should return only the rows in Query2 that don't appear in Query1.

Conditional JOIN Statement SQL Server

Is it possible to do the following:
IF [a] = 1234 THEN JOIN ON TableA
ELSE JOIN ON TableB
If so, what is the correct syntax?
I think what you are asking for will work by joining the Initial table to both Option_A and Option_B using LEFT JOIN, which will produce something like this:
Initial LEFT JOIN Option_A LEFT JOIN NULL
OR
Initial LEFT JOIN NULL LEFT JOIN Option_B
Example code:
SELECT i.*, COALESCE(a.id, b.id) as Option_Id, COALESCE(a.name, b.name) as Option_Name
FROM Initial_Table i
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
Once you have done this, you 'ignore' the set of NULLS. The additional trick here is in the SELECT line, where you need to decide what to do with the NULL fields. If the Option_A and Option_B tables are similar, then you can use the COALESCE function to return the first NON NULL value (as per the example).
The other option is to simply list the Option_A fields and the Option_B fields separately, and let whatever consumes the result set decide which fields to use.
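A runnable sketch of the two-LEFT-JOIN pattern follows, in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption). The table names and sample rows are hypothetical and only mirror the shape of the example code:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; all tables here are
# hypothetical and mirror Initial_Table / Option_A_Table / Option_B_Table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE initial_table (id INTEGER, special_value INTEGER);
CREATE TABLE option_a (initial_id INTEGER, name TEXT);
CREATE TABLE option_b (initial_id INTEGER, name TEXT);
INSERT INTO initial_table VALUES (1, 1234), (2, 99);
INSERT INTO option_a VALUES (1, 'a-one'), (2, 'a-two');
INSERT INTO option_b VALUES (1, 'b-one'), (2, 'b-two');
""")

# Each LEFT JOIN is gated by the condition on special_value, so at most one
# side is non-NULL per row; COALESCE picks whichever side matched.
conditional_rows = conn.execute("""
SELECT i.id, COALESCE(a.name, b.name) AS option_name
FROM initial_table i
LEFT JOIN option_a a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN option_b b ON b.initial_id = i.id AND i.special_value <> 1234
ORDER BY i.id
""").fetchall()

print(conditional_rows)
```

Note that a row whose special_value is NULL satisfies neither condition, so both option columns come back NULL for it.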
This is just to add the point that query can be constructed dynamically based on conditions.
An example is given below.
DECLARE @a INT = 1235
DECLARE @sql VARCHAR(MAX) = 'SELECT * FROM [sourceTable] S JOIN ' + IIF(@a = 1234,'[TableA] A ON A.col = S.col','[TableB] B ON B.col = S.col')
EXEC(@sql)
--Query will be
/*
SELECT * FROM [sourceTable] S JOIN [TableB] B ON B.col = S.col
*/
You can solve this with union
select tablea.a, b
from tablea
join tableb on tablea.a = tableb.a
where b = 1234
union
select tablea.a, b
from tablea
join tablec on tablea.a = tablec.a
where b <> 1234
I disagree with the solution suggesting 2 left joins. I think a table-valued function is more appropriate so you don't have all the coalescing and additional joins for each condition you would have.
CREATE FUNCTION f_GetData (
@Logic VARCHAR(50)
) RETURNS @Results TABLE (
Content VARCHAR(100)
) AS
BEGIN
IF @Logic = '1234'
INSERT @Results
SELECT Content
FROM Table_1
ELSE
INSERT @Results
SELECT Content
FROM Table_2
RETURN
END
GO
SELECT *
FROM InputTable
CROSS APPLY f_GetData(InputTable.Logic) T
I think it would be better to think about your query in a different way and treat the two cases more like sets.
I believe that if you write two separate queries and then combine them using UNION, the result will perform better and be more readable.

verify that xml node has a child node with a given value tsql

I have the following tables
A (ID, relatedID, typeId )
B (ID, leftID, leftTypeId)
I want to join the two tables like this
select * from A
inner join B on A.TypeId=B.LeftTypeId and {condition}
where the condition should verify that leftID matches a value from relatedID, and relatedID is an XML column, e.g. relatedID=<Id>1</Id>
Is there an optimal way to do this?
UPDATE
relatedID can contain several Ids, e.g. relatedID=<Id>1</Id><Id>2</Id>
You may use
... and A.relatedID.value('(/Id[1]/text())[1]', 'int') = B.leftID
or
... and A.relatedID.exist('/Id[1][text()[1] = sql:column("B.leftID")]') = 1
Though exist is recommended over value for predicates, depending on whether the XML column is xml-indexed or not and what type of indexes it has, one of the two above may perform better.
Update: for the case where relatedID can contain a set of Ids, you may try
select ...
from A
cross apply A.relatedID.nodes('/Id') r(id)
inner join B on A.TypeId=B.LeftTypeId
and r.id.value('text()[1]', 'int') = B.leftID
or
select ...
from A
cross apply A.relatedID.nodes('/Id') r(id)
inner join B on A.TypeId=B.LeftTypeId
and r.id.exist('text()[1][. = sql:column("B.leftID")]') = 1
or even
select ...
from A
inner join B on A.TypeId=B.LeftTypeId
and A.relatedID.exist('/Id[text()[1]=sql:column("B.leftID")]') = 1

How do I sum the results of a select that returns multiple rows

I have a SQL variable @SumScore dec(9,4)
I am trying to assign the variable as follows:
SET @SumScore =
(
SELECT Sum(
(
SELECT SUM(etjs.CalculatedScore * sc.PercentOfTotal) as CategoryScore
FROM tblEventTurnJudgeScores etjs
INNER JOIN tblJudgingCriteria jc ON jc.JudgingCriteriaID = etjs.JudgingCriteriaID
INNER JOIN tblScoringCategories sc ON jc.ScoringCategoryID = sc.ScoringCategoryID
GROUP BY jc.JudgingCriteriaID
)
As ComputedScore) AS SumTotalScore
)
In other words, the inner select returns one column. I want the variable to be assigned the SUM of all of the rows that are returned there.
I realize that this could be done with a temp table pretty easily. But is that the only way?
SELECT Sum(CategoryScore)
FROM ( subquery ) AS s
Use:
SELECT @SumScore = SUM(etjs.CalculatedScore * sc.PercentOfTotal)
FROM tblEventTurnJudgeScores etjs
JOIN tblJudgingCriteria jc ON jc.JudgingCriteriaID = etjs.JudgingCriteriaID
JOIN tblScoringCategories sc ON jc.ScoringCategoryID = sc.ScoringCategoryID
There's no point to using GROUP BY jc.JudgingCriteriaID if the outer query is going to sum up everything anyway.
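As a sanity check on that claim, the sketch below compares the two totals. It is in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption), and the single scores table is a hypothetical flattening of the three joined tables:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; scores is a hypothetical
# flattened version of the three joined tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (criteria_id INTEGER,
                     calculated_score REAL,
                     percent_of_total REAL);
INSERT INTO scores VALUES (1, 8, 0.5), (1, 6, 0.25), (2, 4, 0.25);
""")

# Sum of per-criteria sums, using a derived table (note the alias).
nested_total = conn.execute("""
SELECT SUM(category_score)
FROM (
    SELECT SUM(calculated_score * percent_of_total) AS category_score
    FROM scores
    GROUP BY criteria_id
) AS per_criteria
""").fetchone()[0]

# Single SUM with no GROUP BY.
flat_total = conn.execute("""
SELECT SUM(calculated_score * percent_of_total) FROM scores
""").fetchone()[0]

print(nested_total, flat_total)
```

Both totals come out the same, so the grouped inner query only matters if the per-criteria values are needed for something else.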
This worked for me like this:
select sum(myColumn) from MyTable where MyTableID = 'some value'
you could also do this (to make it more robust):
select sum(isnull(myColumn,0)) from MyTable where MyTableID = 'some value'
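One thing to be aware of with the "more robust" form: wrapping the column does not help when the WHERE clause matches no rows at all, because SUM over an empty set is still NULL. A minimal sketch in Python with the standard-library sqlite3 module standing in for SQL Server (an assumption), with COALESCE in place of ISNULL and a hypothetical my_table:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; my_table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (my_table_id TEXT, my_column INTEGER);
INSERT INTO my_table VALUES ('a', 5), ('a', NULL);
""")

def scalar(sql, key):
    """Run a single-value query with one bound parameter."""
    return conn.execute(sql, (key,)).fetchone()[0]

# SUM already skips NULL values within matching rows.
plain = scalar(
    "SELECT SUM(my_column) FROM my_table WHERE my_table_id = ?", "a")

# No matching rows: the inner COALESCE never runs, so the result is NULL.
inner_wrap = scalar(
    "SELECT SUM(COALESCE(my_column, 0)) FROM my_table WHERE my_table_id = ?",
    "missing")

# Wrapping the aggregate itself gives 0 for the empty case.
outer_wrap = scalar(
    "SELECT COALESCE(SUM(my_column), 0) FROM my_table WHERE my_table_id = ?",
    "missing")

print(plain, inner_wrap, outer_wrap)
```

In T-SQL the outer form would be ISNULL(SUM(myColumn), 0) or the equivalent COALESCE.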
