I am new to query optimization, and I don't fully understand how to use a semi-join when implementing decorrelation.
Consider the query
SELECT A, B
FROM r
WHERE r.B < SOME (
SELECT B
FROM s
WHERE s.A = r.A
)
Show how to decorrelate the above query using the multiset version of the semi-join operation.
You may write your query using an inner join as follows:
SELECT DISTINCT r.A, r.B
FROM r
INNER JOIN s
ON r.A = s.A
WHERE r.B < s.B;
The DISTINCT clause is necessary in this version of your query, because a given record in the r table could join to more than one match in the s table. In your original version, the subquery cannot multiply rows of r this way, because the SOME predicate takes a set of records and always returns a single yes/no answer for each row of r.
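In relational-algebra terms, the decorrelated form asked for is a projection on A, B of r semi-joined with s under the condition r.A = s.A AND r.B < s.B. The multiset semi-join keeps each row of r as many times as it occurs in r, as long as at least one s row satisfies the condition. The DISTINCT join only agrees with that when r holds no duplicate (A, B) pairs. Below is a minimal sketch of the difference, using Python's sqlite3 module and made-up data; SQLite has no SOME, so the semi-join is written with EXISTS, which is an assumption of this sketch rather than part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE r (A INTEGER, B INTEGER);
    CREATE TABLE s (A INTEGER, B INTEGER);
    -- Hypothetical data: r deliberately holds the row (1, 10) twice.
    INSERT INTO r VALUES (1, 10), (1, 10), (2, 50);
    INSERT INTO s VALUES (1, 20), (1, 30), (2, 40);
""")

# Semi-join: keep a row of r when at least one s row has the same A
# and a larger B. Each r row is tested once, so duplicates in r survive.
semi_join = conn.execute("""
    SELECT r.A, r.B
    FROM r
    WHERE EXISTS (SELECT 1 FROM s WHERE s.A = r.A AND r.B < s.B)
    ORDER BY r.A, r.B
""").fetchall()

# Inner join plus DISTINCT: removes the extra rows created by multiple
# matches in s, but also collapses the duplicates that were already in r.
distinct_join = conn.execute("""
    SELECT DISTINCT r.A, r.B
    FROM r
    INNER JOIN s ON r.A = s.A
    WHERE r.B < s.B
    ORDER BY r.A, r.B
""").fetchall()

print(semi_join)      # [(1, 10), (1, 10)]
print(distinct_join)  # [(1, 10)]
```

If r is known to have a key, or duplicates in the output do not matter, the two forms return the same rows.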
Related
I know I am missing something. I have two temp tables with identical values except for a filter, and I am trying to join them in a stored procedure, but I am getting duplicate values.
Below is the sample code:
SELECT DISTINCT
B.SUBSCRIBER_TAX_ID, B.MEMBER_FIRST_NAME, B.MEMBER_LAST_NAME,
B.BENEFIT_PLAN_NAME AS MEDICAL_PLAN, B.MEMBER_EFF_DATE AS MED_EFF_DATE, B.MEMBER_TERMINATION_DATE AS MED_END_DATE,
P.BENEFIT_PLAN_NAME AS PHARM_PLAN_NAME, P.MEMBER_EFF_DATE AS PHARM_EFF_DATE, P.MEMBER_TERMINATION_DATE AS PHARM_ENDdATE
FROM #BH_MED B
INNER JOIN #BH_PHARM P ON B.MEMBER_HCC_ID = P.MEMBER_HCC_ID
order by b.BENEFIT_PLAN_NAME,P.BENEFIT_PLAN_NAME
I want results with distinct abc, def in column 3 and column 6.
Use GROUP BY:
SELECT DISTINCT
B.SUBSCRIBER_TAX_ID, B.MEMBER_FIRST_NAME, B.MEMBER_LAST_NAME,
B.BENEFIT_PLAN_NAME AS MEDICAL_PLAN, B.MEMBER_EFF_DATE AS MED_EFF_DATE, B.MEMBER_TERMINATION_DATE AS MED_END_DATE,
P.BENEFIT_PLAN_NAME AS PHARM_PLAN_NAME, P.MEMBER_EFF_DATE AS PHARM_EFF_DATE, P.MEMBER_TERMINATION_DATE AS PHARM_ENDdATE
FROM #BH_MED B
INNER JOIN #BH_PHARM P ON B.MEMBER_HCC_ID = P.MEMBER_HCC_ID
GROUP BY B.SUBSCRIBER_TAX_ID, B.MEMBER_FIRST_NAME, B.MEMBER_LAST_NAME,B.BENEFIT_PLAN_NAME,B.MEMBER_EFF_DATE,B.MEMBER_TERMINATION_DATE,P.BENEFIT_PLAN_NAME,P.MEMBER_EFF_DATE,P.MEMBER_TERMINATION_DATE
order by b.BENEFIT_PLAN_NAME,P.BENEFIT_PLAN_NAME
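One thing worth checking before relying on this: grouping by every selected column returns the same rows as SELECT DISTINCT over those columns, so neither collapses different plan combinations for the same member. A small sketch with Python's sqlite3 module follows; the table names bh_med and bh_pharm and the data are hypothetical stand-ins for the temp tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for #BH_MED and #BH_PHARM.
    CREATE TABLE bh_med (member_hcc_id INTEGER, benefit_plan_name TEXT);
    CREATE TABLE bh_pharm (member_hcc_id INTEGER, benefit_plan_name TEXT);
    INSERT INTO bh_med VALUES (1, 'abc'), (1, 'def');
    INSERT INTO bh_pharm VALUES (1, 'abc'), (1, 'def');
""")

# DISTINCT over the joined columns.
with_distinct = conn.execute("""
    SELECT DISTINCT b.benefit_plan_name, p.benefit_plan_name
    FROM bh_med b
    INNER JOIN bh_pharm p ON b.member_hcc_id = p.member_hcc_id
    ORDER BY 1, 2
""").fetchall()

# GROUP BY over exactly the same columns.
with_group_by = conn.execute("""
    SELECT b.benefit_plan_name, p.benefit_plan_name
    FROM bh_med b
    INNER JOIN bh_pharm p ON b.member_hcc_id = p.member_hcc_id
    GROUP BY b.benefit_plan_name, p.benefit_plan_name
    ORDER BY 1, 2
""").fetchall()

print(with_distinct == with_group_by)  # True
print(len(with_group_by))              # 4: every medical/pharmacy pairing
```

Pairing each medical plan only with its own pharmacy plan would need an extra join condition, and which column that should be depends on the data.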
In my SSIS project I have a Merge Join transformation that uses an inner join. It is supposed to join the files lookup with the rest of the data flow. However, the join seems to drop some rows that have files, even though it should include them. I'm trying to simulate the join in T-SQL, but I seem to be doing it wrong, because my query does return the missing rows.
Here are the outputs I'm trying to join:
Input A:
SELECT *
FROM
tblExpense expense
OUTER APPLY(
SELECT TOP 1 *
FROM tblExpenseDtl Details
WHERE expense.intExpenseID = Details.intExpenseID
ORDER BY Details.sintLineNo
) details
WHERE
expense.dtUpdateDateTime > '2017-06-01'
ORDER BY expense.intExpenseID desc
Input B:
SELECT f.*
FROM dbo.tblExpense e
JOIN tblExpenseDtl d ON d.intExpenseID = e.intExpenseID
JOIN tblExpReceiptFile f ON f.intExpenseDtlID = d.intExpenseDtlID
WHERE
e.dtUpdateDateTime > '2017-06-01'
ORDER BY e.intExpenseID desc
And here is the SQL query that I thought would produce the same result as my SSIS inner join:
SELECT *
FROM
tblExpense expense
OUTER APPLY(
SELECT TOP 1 *
FROM tblExpenseDtl Details
WHERE expense.intExpenseID = Details.intExpenseID
ORDER BY Details.sintLineNo
) details
inner join ( SELECT f.*
FROM dbo.tblExpense e
JOIN tblExpenseDtl d ON d.intExpenseID = e.intExpenseID
JOIN tblExpReceiptFile f ON f.intExpenseDtlID = d.intExpenseDtlID
WHERE
e.dtUpdateDateTime > '2017-06-01'
ORDER BY e.intExpenseID desc
) innerJ
WHERE
expense.dtUpdateDateTime > '2017-06-01'
ORDER BY expense.intExpenseID desc
The join key in SSIS is expense.intExpenseID = e.intExpenseID.
Input A gives 1 row with expenseID=X, and input B gives 2 rows with expenseID=X.
How are you sorting the data before merging it? According to this, SSIS sorts differently from SQL Server in most cases. That may be the problem.
Edit: What type is intExpenseID?
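To see why the sort order matters, here is a generic merge-join sketch in Python. It is not SSIS's implementation, only the textbook algorithm, and the sample rows are invented. The function assumes both inputs are sorted ascending by key; when the inputs arrive in another order (the queries above use ORDER BY ... desc), matching rows are silently skipped instead of raising an error.

```python
def merge_inner_join(left, right):
    """Inner merge join over lists of (key, payload) tuples.

    Both inputs are ASSUMED to be sorted ascending by key; the function
    does not verify this.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Pair this left row with every right row sharing the key.
            k = j
            while k < len(right) and right[k][0] == left_key:
                out.append((left_key, left[i][1], right[k][1]))
                k += 1
            i += 1
    return out


# Hypothetical rows: (intExpenseID, description).
input_a = [(1, "expense 1"), (2, "expense 2"), (3, "expense 3")]
input_b = [(2, "file a"), (2, "file b"), (3, "file c")]

# Inputs sorted the way the algorithm expects: all three matches found.
asc_result = merge_inner_join(sorted(input_a), sorted(input_b))

# Same rows sorted descending: only the first key matches, the rest are lost.
desc_result = merge_inner_join(sorted(input_a, reverse=True),
                               sorted(input_b, reverse=True))

print(len(asc_result))  # 3
print(desc_result)      # [(3, 'expense 3', 'file c')]
```

A T-SQL join has no such dependency on input order, which would be consistent with the hand-written query returning rows that the Merge Join drops.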
I have two tables: one is the lookup table and the other is the data table. The lookup table has columns named cycleid and cycle. The data table has SID, cycleid, and cycle. Below is the structure of the tables.
A given SID in the data table may or may not have all the cycles. I want to output the completed as well as the missed cycles for each SID.
I right joined the lookup table to retrieve the missing as well as the completed cycles. Below is the query I used.
SELECT TOP 1000 [SID]
,s4.[CYCLE]
,s4.[CYCLEID]
FROM [dbo].[data] s3 RIGHT JOIN
[dbo].[lookup_data] s4 ON s3.CYCLEID = s4.CYCLEID
The query does not display the missed values when I query for all the SIDs. When I query for a specific SID with the query below, I get the correct result, including the missed ones.
SELECT TOP 1000 [SID]
,s4.[CYCLE]
,s4.[CYCLEID]
FROM [dbo].[data] s3 RIGHT JOIN [dbo].[lookup_data] s4
ON s3.CYCLEID = s4.CYCLEID
AND s3.SID = 101002
ORDER BY [SID], s4.[CYCLEID]
As I am supplying this query to Tableau, I cannot provide the SID value in the query. I want to return all the SIDs, and I will do the rest in Tableau.
The expected output that I need is shown below.
I wrote a cross join query like the one below to achieve my expected output:
SELECT DISTINCT
tab.CYCLEID
,tab.SID
,d.CYCLE
FROM ( SELECT d.SID
,d.[CYCLE]
,e.CYCLEID
FROM ( SELECT e.sid
,e.CYCLE
FROM [db_temp].[dbo].[Sheet3$] e
) d
CROSS JOIN [db_temp].[dbo].[Sheet4$] e
) tab
LEFT OUTER JOIN [db_temp].[dbo].[Sheet3$] d
ON d.CYCLEID = tab.CYCLEID
AND d.SID = tab.SID
ORDER BY tab.SID
,tab.CYCLEID;
However, I am not able to use this query for more scenarios, as my data set has nearly 20 to 40 columns and I run into issues when I use the query above.
Is there any way to do this in a simpler manner with only a left or right join? I want the query to return all the missing and completed values for all the SIDs, instead of supplying a single SID in the query.
You can create a master table first (every combination of SID and CYCLEID), then left join it to the data table:
;with ctxMaster as (
select distinct d.SID, l.CYCLE, l.CYCLEID
from lookup_data l
cross join data d
)
select d.SID, m.CYCLE, m.CYCLEID
from ctxMaster m
left join data d on m.SID = d.SID and m.CYCLEID = d.CYCLEID
order by m.SID, m.CYCLEID
Or, if you don't want to use a common table expression, here is the subquery version:
select d.SID, m.CYCLE, m.CYCLEID
from (select distinct d.SID, l.CYCLE, l.CYCLEID
from lookup_data l
cross join data d) m
left join data d on m.SID = d.SID and m.CYCLEID = d.CYCLEID
order by m.SID, m.CYCLEID
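The reason a plain right join to the lookup table is not enough is that each lookup row already matches some SID, so no NULL-extended row is produced for the SIDs that lack that cycle. Building every (SID, CYCLEID) pair first gives the outer join something to preserve. Here is a minimal runnable sketch of that pattern using Python's sqlite3 module; the sample rows, the all_pairs name, and the status column are illustrative additions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lookup_data (CYCLEID INTEGER, CYCLE TEXT);
    CREATE TABLE data (SID INTEGER, CYCLEID INTEGER, CYCLE TEXT);
    INSERT INTO lookup_data VALUES (1, 'C1'), (2, 'C2'), (3, 'C3');
    -- Hypothetical data: SID 101 completed every cycle, SID 102 only the first.
    INSERT INTO data VALUES (101, 1, 'C1'), (101, 2, 'C2'), (101, 3, 'C3'),
                            (102, 1, 'C1');
""")

rows = conn.execute("""
    WITH all_pairs AS (
        -- Every SID paired with every cycle from the lookup table.
        SELECT DISTINCT d.SID, l.CYCLEID, l.CYCLE
        FROM lookup_data l
        CROSS JOIN data d
    )
    SELECT m.SID, m.CYCLEID, m.CYCLE,
           CASE WHEN d.SID IS NULL THEN 'missed' ELSE 'completed' END AS status
    FROM all_pairs m
    LEFT JOIN data d ON d.SID = m.SID AND d.CYCLEID = m.CYCLEID
    ORDER BY m.SID, m.CYCLEID
""").fetchall()

for row in rows:
    print(row)
```

Selecting m.SID rather than d.SID keeps the SID visible on the missed rows, which may be more convenient when the result is filtered later in Tableau.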
I have a complex query to retrieve some results:
EDITED QUERY (added the UNION ALL):
SELECT t.*
FROM (
SELECT
dbo.Intervencao.INT_Processo, analista,
ETS.ETS_Sigla, ATC.ATC_Sigla, PAT.PAT_Sigla, dbo.Assunto.SNT_Peso,
CASE
WHEN ETS.ETS_Sigla = 'PE' AND (PAT.PAT_Sigla = 'LIB' OR PAT.PAT_Sigla = 'LBR') THEN (0.3*SNT_Peso)
WHEN ETS.ETS_Sigla = 'CD' THEN (0.3*SNT_Peso)*0.3
ELSE SNT_Peso
END AS PESOAREA,
CASE
WHEN a.max_TEA_FimTarefa IS NULL THEN a.max_TEA_InicioTarefa
ELSE a.max_TEA_FimTarefa
END AS DATA_INICIO_TERMINO,
ROW_NUMBER() OVER (PARTITION BY ATC.ATC_Sigla, a.SRV_Id ORDER BY TEA_FimTarefa DESC) AS seqnum
FROM dbo.Tarefa AS t
INNER JOIN (
SELECT
MAX(dbo.TarefaEtapaAreaTecnica.TEA_InicioTarefa) AS max_TEA_InicioTarefa,
MAX (dbo.TarefaEtapaAreaTecnica.TEA_FimTarefa) AS max_TEA_FimTarefa,
dbo.Pessoa.PFJ_Descri AS analista, dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id
FROM dbo.TarefaEtapaAreaTecnica
LEFT JOIN dbo.Tarefa ON dbo.TarefaEtapaAreaTecnica.TRF_Id = dbo.Tarefa.TRF_Id
LEFT JOIN dbo.AreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.AreaTecnica.ATC_Id
LEFT JOIN dbo.ServicoAreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.ServicoAreaTecnica.ATC_Id
AND dbo.Tarefa.SRV_Id = dbo.ServicoAreaTecnica.SRV_Id
INNER JOIN dbo.Pessoa ON dbo.Pessoa.PFJ_Id = dbo.ServicoAreaTecnica.PFJ_Id_Analista
GROUP BY dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id, dbo.Pessoa.PFJ_Descri
) AS a ON t.SRV_Id = a.SRV_Id
INNER JOIN dbo.TarefaEtapaAreaTecnica AS TarefaEtapaAreaTecnica_1 ON
t.TRF_Id = TarefaEtapaAreaTecnica_1.TRF_Id
AND a.ATC_Id = TarefaEtapaAreaTecnica_1.ATC_Id
AND a.max_TEA_InicioTarefa = TarefaEtapaAreaTecnica_1.TEA_InicioTarefa
LEFT JOIN AreaTecnica ATC ON TarefaEtapaAreaTecnica_1.ATC_Id = ATC.ATC_Id
LEFT JOIN Etapa ETS ON TarefaEtapaAreaTecnica_1.ETS_Id = ETS.ETS_Id
LEFT JOIN ParecerTipo PAT ON TarefaEtapaAreaTecnica_1.PAT_Id = PAT.PAT_Id
LEFT OUTER JOIN dbo.Servico ON a.SRV_Id = dbo.Servico.SRV_Id
INNER JOIN dbo.Intervencao ON dbo.Servico.INT_Id = dbo.Intervencao.INT_Id
LEFT JOIN dbo.Assunto ON dbo.Servico.SNT_Id = dbo.Assunto.SNT_Id
) t
The result is following:
It works well. The problem is that I was asked that, when a row is not present in this query's result, the output must contain values from another table (ServicoAreaTecnica) instead. So I wrote a query for the other table based on the crucial information from the first query. If I UNION ALL them I get this:
Query1 +
UNION ALL
SELECT INN.INT_Processo,
PES.PFJ_Descri,
NULL, --ETS.ETS_Sigla,
ART.ATC_Sigla,
NULL ,--PAT.PAT_Sigla,
ASS.SNT_Peso,
NULL, --PESOAREA
NULL, --DATA_INICIO_TERMINO
NULL --seqnum
FROM dbo.ServicoAreaTecnica AS SAT
INNER JOIN dbo.AreaTecnica AS ART ON ART.ATC_Id = SAT.ATC_Id
INNER JOIN dbo.Servico AS SER ON SER.SRV_Id = SAT.SRV_Id
INNER JOIN dbo.Assunto AS ASS ON ASS.SNT_Id = SER.SNT_Id
INNER JOIN dbo.Intervencao AS INN ON INN.INT_Id = SER.INT_Id
INNER JOIN dbo.Pessoa AS PES ON PES.PFJ_Id = SAT.PFJ_Id_Analista
The result is following:
So what I want to do is remove row number 1, because row number 2 exists in the first query. I think I explained it better this time. The result should be only row number 1; row number 2 would appear only if query 1 doesn't retrieve a row for that particular INN.INT_Processo.
Thanks!
Ok, there are two ways to reduce your record set. Given that you've already written the code to produce the table with the extra rows, it might be easiest to just add code to reduce that:
Select * from
(Select *
, Row_Number() over
(partition by IntProcesso, Analista order by ISNULL(seqnum, 0) desc) as RN
from MyResults) a
where RN = 1
Within each (IntProcesso, Analista) group, this assigns row number 1 to the row from your first query with the highest seqnum, or to a row from the second query when the group has no match in the first query, and then filters out the extra rows.
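The ranking idea can be tried on a toy table. The sketch below uses Python's sqlite3 module (window functions need SQLite 3.25 or newer), with COALESCE standing in for ISNULL; the my_results table and its three rows are hypothetical and only mimic the shape of the UNION ALL output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE my_results (
        IntProcesso TEXT, Analista TEXT, ETS_Sigla TEXT, seqnum INTEGER)
""")
conn.executemany(
    "INSERT INTO my_results VALUES (?, ?, ?, ?)",
    [
        ("P1", "Ana", "PE", 1),      # came from the first query
        ("P1", "Ana", None, None),   # fallback row duplicating the one above
        ("P2", "Rui", None, None),   # fallback row with no first-query match
    ],
)

rows = conn.execute("""
    SELECT IntProcesso, Analista, ETS_Sigla, seqnum
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY IntProcesso, Analista
                   -- NULL seqnum (fallback rows) sorts after real rows.
                   ORDER BY COALESCE(seqnum, 0) DESC
               ) AS rn
        FROM my_results
    ) ranked
    WHERE rn = 1
    ORDER BY IntProcesso
""").fetchall()

print(rows)  # [('P1', 'Ana', 'PE', 1), ('P2', 'Rui', None, None)]
```

The fallback row for P1 is dropped because a first-query row exists for the same group, while P2 keeps its fallback row.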
You could also use outer joins with isnull or coalesce, as others have suggested. Something like this:
Select ISNULL(a.IntProcesso, b.IntProcesso) as IntProcesso
, ISNULL(a.Analista, b.Analista) as Analista
, ISNULL(a.ETSsigla, b.ETSsigla) as ETSsigla
[repeat for the rest of your columns]
from Table1 a
full outer join Table2 b
on a.IntProcesso = b.IntProcesso and a.Analista = b.Analista
Your code is hard to read, because of the lengthy names of everything (and to be honest, the fact that they're in a language I don't speak also makes it a lot harder).
But how about: replacing your INNER JOINs with LEFT JOINs, adding more LEFT JOINs to draw in the alternative tables, and introducing ISNULL clauses for each variable you want in the results?
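If you do something like ... Query1 Right Join Query2 On ... Where the Query1 key Is Null, that should get only the rows in Query2 that don't appear in Query1. Without the Is Null filter, the right join returns every row of Query2.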
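An outer join by itself keeps every row from the preserved side; isolating the rows that have no match takes an additional IS NULL test on the other side's key. The sketch below shows both results with Python's sqlite3 module. It is written as a LEFT JOIN from query2, which is the mirror image of Query1 RIGHT JOIN Query2, and the two small tables are hypothetical stand-ins for the two query results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the results of Query1 and Query2.
    CREATE TABLE query1 (INT_Processo TEXT, analista TEXT);
    CREATE TABLE query2 (INT_Processo TEXT, analista TEXT);
    INSERT INTO query1 VALUES ('P1', 'Ana');
    INSERT INTO query2 VALUES ('P1', 'Ana'), ('P2', 'Rui');
""")

# Outer join alone: every query2 row comes back, matched or not.
outer_only = conn.execute("""
    SELECT q2.INT_Processo, q2.analista, q1.INT_Processo
    FROM query2 q2
    LEFT JOIN query1 q1 ON q1.INT_Processo = q2.INT_Processo
    ORDER BY q2.INT_Processo
""").fetchall()

# Outer join plus IS NULL filter: only query2 rows missing from query1.
anti_join = conn.execute("""
    SELECT q2.INT_Processo, q2.analista
    FROM query2 q2
    LEFT JOIN query1 q1 ON q1.INT_Processo = q2.INT_Processo
    WHERE q1.INT_Processo IS NULL
    ORDER BY q2.INT_Processo
""").fetchall()

print(outer_only)  # [('P1', 'Ana', 'P1'), ('P2', 'Rui', None)]
print(anti_join)   # [('P2', 'Rui')]
```

The filtered rows are the ones that could then be appended to the first query's output with UNION ALL.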
Dear friends, below are my two SQL queries:
select distinct
a_bm.DestProvider_ID,
a_bm.DestCircel_ID,
convert(datetime,dbo.fnToDate(a_bm.BM_BillFrom),103) as fromdate,
convert(datetime,dbo.fnToDate(a_bm.BM_BillTo),103) as todate,
t_rec.TapInRec as BillRecevable,
t_rec.TapInRec as Billreceied
from Auditdata_BillingMaster a_bm
inner join TapInRecordMaster t_rec
on a_bm.DestProvider_ID = t_rec.DestProviderMaster_ID
and a_bm.DestCircel_ID = t_rec.DestCircelMaster_ID
and convert(datetime,dbo.fnToDate(a_bm.BM_BillFrom),103) >=
convert(datetime,t_rec.Months)
and convert(datetime,dbo.fnToDate(a_bm.BM_BillTo),103)<=
convert(datetime,t_rec.BillTo)
where a_bm.DestProvider_ID=4
and a_bm.DestCircel_ID=22
and a_bm.typeoffile=1
and convert(datetime,dbo.fnToDate(a_bm.BM_BillFrom),103)>=
convert(datetime,'6/1/2009')
and convert(datetime,dbo.fnToDate(a_bm.BM_BillFrom),103)<=
convert(datetime,'7/30/2009')
select Temp_tbl.fromdate from Temp_tbl Temp_tbl
inner join (
select
convert(datetime,dbo.fnToDate(BM_BillFrom),103) as a1,
convert(datetime,dbo.fnToDate(BM_BillTo),103) as b1,
count(*) as c1,
am_bm.DestProvider_ID,
am_bm.DestCircel_ID
from Auditdata_BillingMaster am_bm
inner join Temp_tbl tmp
on tmp.Provider_ID=am_bm.DestProvider_ID
and tmp.Circel_ID=am_bm.DestCircel_ID
where convert(datetime,tmp.fromdate)>=
convert(datetime,dbo.fnToDate(am_bm.BM_BillFrom),103)
and convert(datetime,tmp.todate) <=
convert(datetime,dbo.fnToDate(am_bm.BM_BillTo),103)
group by
convert(datetime,dbo.fnToDate(BM_BillFrom),103),
convert(datetime,dbo.fnToDate(BM_BillTo),103),
am_bm.DestProvider_ID,
am_bm.DestCircel_ID
) b
on Temp_tbl.Provider_ID = b.DestProvider_ID
and Temp_tbl.Circel_ID = b.DestCircel_ID
and convert(datetime,Temp_tbl.fromdate,101)>= convert(datetime,(b.a1),101)
and convert(datetime,Temp_tbl.todate) <= convert(datetime,(b.b1),101)
I want to merge the above two SQL queries in SQL Server 2000.
Please help me.
Thanks in advance.
Do you mean to JOIN or UNION both tables?
If you mean to JOIN both query results, simply use both results as inputs to a JOIN.
How you join them depends on your database design. Preferably the join follows a relationship enforced by referential integrity, so that the combined data stays consistent. But since you do not mention the join condition, let me assume you will join on DestProvider_ID and DestCircel_ID.
select
result1.DestProvider_ID,
result1.DestCircel_ID,
result1.fromdate,
result1.todate,
result1.BillRecevable,
result1.Billreceied,
result2.fromdate
from
( *your first query* ) as result1
inner join
(select
Temp_tbl.fromdate,
b.DestProvider_ID,
b.DestCircel_ID
from Temp_tbl Temp_tbl
*the rest of your second query*
) as result2 on result1.DestProvider_ID = result2.DestProvider_ID
and result1.DestCircel_ID = result2.DestCircel_ID
UNION:
If you want to take multiple select statements and combine them into one result set, UNION statement is the easiest way to go:
SELECT column1a, column2a, column3a FROM tableA
UNION
SELECT column1b, column2b, column3b FROM tableB
This is possible only if:
both queries have the same number of columns
corresponding columns in each query have compatible data types
data type of column1a == column1b
data type of column2a == column2b
data type of column3a == column3b
Since your two queries do not have the same number of columns, you can't merge them with UNION as they stand.
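The column-count rule, and one common way around it, can be checked quickly. The sketch below uses Python's sqlite3 module with the tableA/tableB names from the example above and invented rows; here tableB is assumed to lack the third column. Padding the shorter SELECT with NULL is a workaround assumed for illustration, and whether a padded UNION is meaningful for the two billing queries depends on what the combined rows should represent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (column1a INTEGER, column2a TEXT, column3a TEXT);
    CREATE TABLE tableB (column1b INTEGER, column2b TEXT);
    INSERT INTO tableA VALUES (1, 'x', 'extra'), (2, 'y', 'extra');
    INSERT INTO tableB VALUES (1, 'x'), (3, 'z');
""")

# Different column counts: the database rejects the statement.
try:
    conn.execute("""
        SELECT column1a, column2a, column3a FROM tableA
        UNION
        SELECT column1b, column2b FROM tableB
    """)
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

# Padding the shorter SELECT with NULL makes the column counts line up.
padded = conn.execute("""
    SELECT column1a, column2a, column3a FROM tableA
    UNION ALL
    SELECT column1b, column2b, NULL FROM tableB
""").fetchall()

# UNION removes duplicate rows; UNION ALL would keep (1, 'x') twice.
deduplicated = conn.execute("""
    SELECT column1a, column2a FROM tableA
    UNION
    SELECT column1b, column2b FROM tableB
    ORDER BY 1
""").fetchall()

print(mismatch_error is not None)  # True
print(len(padded))                 # 4
print(deduplicated)                # [(1, 'x'), (2, 'y'), (3, 'z')]
```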