How to write sql for this logic? - sql-server

I have the following tables (this is abbreviated)
tableAssignments
id
timestamp // updated when row is modified
tableSystems
tableUsers
...
tableAssignmentsSnapshot
id
timestamp // set when row created, snapshots never change
tableSystemsSnapshot
tableUsersSnapshot
tableAssignments_id // reference to "tableAssignments" row this was created from
syncKey // Changed when snapshots are created, max value is latest
Sometime after the tableAssignments are changed, snapshots of all the assignments are taken and the snapshots are created with the same syncKey. I am trying to write a query to check if the latest "tableAssignmentsSnapshot" are up to date or if new ones should be taken. The logic required is:
Determine the latest tableAssignmentSnapshots for each system, which is determine by maximum syncKey (see query below).
Join the result from #1 with tableAssignments.
If timestamp from tableAssignments > timestamp from tableAssignmentsSnapshot then new snapshot is required.
The following query accomplishes step #1 and runs fast. It returns the tableAssignmentSnapshots with the greatest value of "syncKey" for each system.
select * from
tableAssignmentsSnapshot d inner join tableSystemsSnapshot e
on d.systemId = e.id
where d.syncKey = (select MAX(syncKey) from tableAssignmentsSnapshot y inner join tableSystemsSnapshot z
on y.systemId = z.id
where tableSystems_id = e.tableSystems_id)
I'm struggling with what to do next. I need to be able to do something like the following but am having trouble with the syntax required:
select * from tableAssignments a right outer join
(
the result from above query
)
on a.id = ?.tableAssignments_id

You could wrap your first query on a CTE:
;WITH CTE AS
(
SELECT * -- Here you should list only the columns that you are gonna need
FROM tableAssignmentsSnapshot D
INNER JOIN tableSystemsSnapshot E
ON D.systemId = E.id
WHERE D.syncKey = ( select MAX(syncKey) from tableAssignmentsSnapshot y inner join tableSystemsSnapshot z
on y.systemId = z.id
where tableSystems_id = e.tableSystems_id)
)
SELECT *
FROM tableAssignments A
RIGHT JOIN CTE B
ON A.id = B.tableAssignments_id

Related

SSIS merge join lacks row (and also How to simulate SSIS join with tsql query)

In my project I have a merge join transformation, that uses inner join. It is supposed to join the files lookup with the rest of the data flow. However, the join seems to not include some rows, with files, even though it should? I'm trying to simulate the join in tsql, but I seem to be doing it wrong as it shows me the missing rows.
Here are the outputs I'm trying to join
Input A:
SELECT *
FROM
tblExpense expense
OUTER APPLY(
SELECT TOP 1 *
FROM tblExpenseDtl Details
WHERE expense.intExpenseID = Details.intExpenseID
ORDER BY Details.sintLineNo
) details
WHERE
expense.dtUpdateDateTime > '2017-06-01'
ORDER BY expense.intExpenseID desc
Input B:
SELECT f.*
FROM dbo.tblExpense e
JOIN tblExpenseDtl d ON d.intExpenseID = e.intExpenseID
JOIN tblExpReceiptFile f ON f.intExpenseDtlID = d.intExpenseDtlID
WHERE
e.dtUpdateDateTime > '2017-06-01'
ORDER BY e.intExpenseID desc
And the sql query that I thought would produce the same result as my SSIS inner join
SELECT *
FROM
tblExpense expense
OUTER APPLY(
SELECT TOP 1 *
FROM tblExpenseDtl Details
WHERE expense.intExpenseID = Details.intExpenseID
ORDER BY Details.sintLineNo
) details
inner join ( SELECT f.*
FROM dbo.tblExpense e
JOIN tblExpenseDtl d ON d.intExpenseID = e.intExpenseID
JOIN tblExpReceiptFile f ON f.intExpenseDtlID = d.intExpenseDtlID
WHERE
e.dtUpdateDateTime > '2017-06-01'
ORDER BY e.intExpenseID desc
) innerJ
WHERE
expense.dtUpdateDateTime > '2017-06-01'
ORDER BY expense.intExpenseID desc
The join key in the SSIS is the expense.intExpenseID = e.intExpenseID.
Input A gives 1 row, with an expenseID=X, and input B gives 2 rows with an expenseID=X
How are you sorting data before merging it? According to this SSIS is sorting in different way than SQL Server (in most cases). Maybe there is a problem.
Edit: What type is intExpenseID?

RIGHT\LEFT Join does not provide null values without condition

I have two tables one is the lookup table and the other is the data table. The lookup table has columns named cycleid, cycle. The data table has SID, cycleid, cycle. Below is the structure of the tables.
If you check the data table, the SID may have all the cycles and may not have all the cycles. I want to output the SID completed as well as missed cycles.
I right joined the lookup table and retrieved the missing as well as completed cycles. Below is the query I used.
SELECT TOP 1000 [SID]
,s4.[CYCLE]
,s4.[CYCLEID]
FROM [dbo].[data] s3 RIGHT JOIN
[dbo].[lookup_data] s4 ON s3.CYCLEID = s4.CYCLEID
The query is not displaying me the missed values when I query for all the SID's. When I specifically query for a SID with the below query i am getting the correct result including the missed ones.
SELECT TOP 1000 [SID]
,s4.[CYCLE]
,s4.[CYCLEID]
FROM [dbo].[data] s3 RIGHT JOIN [dbo].[lookup_data] s4
ON s3.CYCLEID = s4.CYCLEID
AND s3.SID = 101002
ORDER BY [SID], s4.[CYCLEID]
As I am supplying this query into tableau I cannot provide the sid value in the query. I want to return all the sid's and from tableau I will be do the rest of the things.
The expected output that i need is as shown below.
I wrote a cross join query like below to acheive my expected output
SELECT DISTINCT
tab.CYCLEID
,tab.SID
,d.CYCLE
FROM ( SELECT d.SID
,d.[CYCLE]
,e.CYCLEID
FROM ( SELECT e.sid
,e.CYCLE
FROM [db_temp].[dbo].[Sheet3$] e
) d
CROSS JOIN [db_temp].[dbo].[Sheet4$] e
) tab
LEFT OUTER JOIN [db_temp].[dbo].[Sheet3$] d
ON d.CYCLEID = tab.CYCLEID
AND d.SID = tab.SID
ORDER BY tab.SID
,tab.CYCLEID;
However I am not able to use this query for more scenarios as my data set have nearly 20 to 40 columns and i am having issues when i use the above one.
Is there any way to do this in a simpler manner with only left or right join itself? I want the query to return all the missing values and the completed values for the all the SID's instead of supplying a single sid in the query.
You can create a master table first (combine all SID and CYCLE ID), then right join with the data table
;with ctxMaster as (
select distinct d.SID, l.CYCLE, l.CYCLEID
from lookup_data l
cross join data d
)
select d.SID, m.CYCLE, m.CYCLEID
from ctxMaster m
left join data d on m.SID = d.SID and m.CYCLEID = d.CYCLEID
order by m.SID, m.CYCLEID
Fiddle
Or if you don't want to use common table expression, subquery version:
select d.SID, m.CYCLE, m.CYCLEID
from (select distinct d.SID, l.CYCLE, l.CYCLEID
from lookup_data l
cross join data d) m
left join data d on m.SID = d.SID and m.CYCLEID = d.CYCLEID
order by m.SID, m.CYCLEID

SQL Server 2008: Only one result, joining 3 tables

I have the following databases:
IMAGE UPDATED
SELECT
l.FECHA, nc.IdNumeroControlTrabajador AS ENLACE,
t.RFC, t.Homonimia AS HOM,
t.ApellidoPaterno, t.ApellidoMaterno,
t.Nombre, l.IMPORTE,
cp.NombreOrgano AS DEPENDENCIA, l.OBSERVACIONES,
l.FECHA_INICIAL, l.FECHA_FINAL,
MAX (cn.FechaIngresoGobierno)
FROM
CapturaFAIFAP.dbo.Laudos AS l
LEFT JOIN FAIFAPparalelo.dbo.NumeroControlTrabajadores AS nc ON l.ID = nc.idTrabajador
LEFT JOIN FAIFAPparalelo.dbo.ClaveNominalTrabajadores AS cn ON l.ID = cn.idTrabajador
LEFT JOIN FAIFAPparalelo.dbo.Trabajadores AS t ON l.ID = t.idTrabajador
LEFT JOIN FAIFAPparalelo.dbo.ClaveProgramatica AS cp ON cn.idClaveProgramatica = cp.idClaveProgramatica
LEFT JOIN (
SELECT
cp.NombreOrgano,
MAX (cnt.FechaIngresoGobierno) AS fecha
FROM
FAIFAPparalelo.dbo.ClaveNominalTrabajadores as cnt
GROUP BY
cp.NombreOrgano
) AS j ON (
cp.NombreOrgano = j.NombreOrgano
AND j.fecha = cn.FechaIngresoGobierno
)
GROUP BY
l.FECHA, nc.IdNumeroControlTrabajador, t.RFC,
t.Homonimia, t.ApellidoPaterno, t.ApellidoMaterno,
t.Nombre,l.IMPORTE,cp.NombreOrgano,l.OBSERVACIONES,
l.FECHA_INICIAL,l.FECHA_FINAL,cn.FechaIngresoGobierno
I need to unify the query. The huge query i put in code, and a similar query in the image. If i join the tables ClaveNominal and ClaveProgramatica i have duplicities, because sometimes the persons have 2 ClavesProgramaticas, due to the system I use, not delete or update the current registry, just puts in inactive and adds a new one active, and i can't make the condition to activo because, that represents if the person still works here.

How to join one select with another when the first one not always returns a value for specific row?

I have a complex query to retrieve some results:
EDITED QUERY (added the UNION ALL):
SELECT t.*
FROM (
SELECT
dbo.Intervencao.INT_Processo, analista,
ETS.ETS_Sigla, ATC.ATC_Sigla, PAT.PAT_Sigla, dbo.Assunto.SNT_Peso,
CASE
WHEN ETS.ETS_Sigla = 'PE' AND (PAT.PAT_Sigla = 'LIB' OR PAT.PAT_Sigla = 'LBR') THEN (0.3*SNT_Peso)
WHEN ETS.ETS_Sigla = 'CD' THEN (0.3*SNT_Peso)*0.3
ELSE SNT_Peso
END AS PESOAREA,
CASE
WHEN a.max_TEA_FimTarefa IS NULL THEN a.max_TEA_InicioTarefa
ELSE a.max_TEA_FimTarefa
END AS DATA_INICIO_TERMINO,
ROW_NUMBER() OVER (PARTITION BY ATC.ATC_Sigla, a.SRV_Id ORDER BY TEA_FimTarefa DESC) AS seqnum
FROM dbo.Tarefa AS t
INNER JOIN (
SELECT
MAX(dbo.TarefaEtapaAreaTecnica.TEA_InicioTarefa) AS max_TEA_InicioTarefa,
MAX (dbo.TarefaEtapaAreaTecnica.TEA_FimTarefa) AS max_TEA_FimTarefa,
dbo.Pessoa.PFJ_Descri AS analista, dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id
FROM dbo.TarefaEtapaAreaTecnica
LEFT JOIN dbo.Tarefa ON dbo.TarefaEtapaAreaTecnica.TRF_Id = dbo.Tarefa.TRF_Id
LEFT JOIN dbo.AreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.AreaTecnica.ATC_Id
LEFT JOIN dbo.ServicoAreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.ServicoAreaTecnica.ATC_Id
AND dbo.Tarefa.SRV_Id = dbo.ServicoAreaTecnica.SRV_Id
INNER JOIN dbo.Pessoa ON dbo.Pessoa.PFJ_Id = dbo.ServicoAreaTecnica.PFJ_Id_Analista
GROUP BY dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id, dbo.Pessoa.PFJ_Descri
) AS a ON t.SRV_Id = a.SRV_Id
INNER JOIN dbo.TarefaEtapaAreaTecnica AS TarefaEtapaAreaTecnica_1 ON
t.TRF_Id = TarefaEtapaAreaTecnica_1.TRF_Id
AND a.ATC_Id = TarefaEtapaAreaTecnica_1.ATC_Id
AND a.max_TEA_InicioTarefa = TarefaEtapaAreaTecnica_1.TEA_InicioTarefa
LEFT JOIN AreaTecnica ATC ON TarefaEtapaAreaTecnica_1.ATC_Id = ATC.ATC_Id
LEFT JOIN Etapa ETS ON TarefaEtapaAreaTecnica_1.ETS_Id = ETS.ETS_Id
LEFT JOIN ParecerTipo PAT ON TarefaEtapaAreaTecnica_1.PAT_Id = PAT.PAT_Id
LEFT OUTER JOIN dbo.Servico ON a.SRV_Id = dbo.Servico.SRV_Id
INNER JOIN dbo.Intervencao ON dbo.Servico.INT_Id = dbo.Intervencao.INT_Id
LEFT JOIN dbo.Assunto ON dbo.Servico.SNT_Id = dbo.Assunto.SNT_Id
) t
The result is following:
It works good, the problem is that I was asked that if when a row is not present on this query, it must contain values from another table (ServicoAreaTecnica), so I got this query for the other table based on crucial information of the first query. So if I UNION ALL I get this:
Query1 +
UNION ALL
SELECT INN.INT_Processo,
PES.PFJ_Descri,
NULL, --ETS.ETS_Sigla,
ART.ATC_Sigla,
NULL ,--PAT.PAT_Sigla,
ASS.SNT_Peso,
NULL, --PESOAREA
NULL, --DATA_INICIO_TERMINO
NULL --seqnum
FROM dbo.ServicoAreaTecnica AS SAT
INNER JOIN dbo.AreaTecnica AS ART ON ART.ATC_Id = SAT.ATC_Id
INNER JOIN dbo.Servico AS SER ON SER.SRV_Id = SAT.SRV_Id
INNER JOIN dbo.Assunto AS ASS ON ASS.SNT_Id = SER.SNT_Id
INNER JOIN dbo.Intervencao AS INN ON INN.INT_Id = SER.INT_Id
INNER JOIN dbo.Pessoa AS PES ON PES.PFJ_Id = SAT.PFJ_Id_Analista
The result is following:
So what I want to do is to remove row number 1 because row number 2 exists on the first query, I think I got it explained better this time. The result should be only row number 1, row number 2 would appear only if query 1 doesn't retrieve a row for that particular INN.INT_Processo.
Thanks!
Ok, there are two ways to reduce your record set. Given that you've already written the code to produce the table with the extra rows, it might be easiest to just add code to reduce that:
Select * from
(Select *
, Row_Number() over
(partition by IntProcesso, Analista order by ISNULL(seqnum, 0) desc) as RN
from MyResults) a
where RN = 1
This will assign row_number 1 to any rows that came from your first query, or to any rows from the second query that do not have matches in the first query, then filter out extra rows.
You could also use outer joins with isnull or coalesce, as others have suggested. Something like this:
Select ISNULL(a.IntProcesso, b.IntProcesso) as IntProcesso
, ISNULL(a.Analista, b.Analista) as Analista
, ISNULL(a.ETSsigla, b.ETSsigla) as ETSsigla
[repeat for the rest of your columns]
from Table1 a
full outer join Table2 b
on a.IntProcesso = b.IntProcesso and a.Analista = b.Analista
Your code is hard to read, because of the lengthy names of everything (and to be honest, the fact that they're in a language I don't speak also makes it a lot harder).
But how about: replacing your INNER JOINs with LEFT JOINs, adding more LEFT JOINs to draw in the alternative tables, and introducing ISNULL clauses for each variable you want in the results?
If you do something like ... Query1 Right Join Query2 On ... that should get only the rows in Query2 that don't appear in Query 1.

TSQL - Return recent date

Having issues getting a dataset to return with one date per client in the query.
Requirements:
Must have the recent date of transaction per client list for user
Will need have the capability to run through EXEC
Current Query:
SELECT
c.client_uno
, c.client_code
, c.client_name
, c.open_date
into #AttyClnt
from hbm_client c
join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = #login
and c.status_code = 'C'
select
ba.payr_client_uno as client_uno
, max(ba.tran_date) as tran_date
from blt_bill_amt ba
left outer join #AttyClnt ac on ba.payr_client_uno = ac.client_uno
where ba.tran_type IN ('RA', 'CR')
group by ba.payr_client_uno
Currently, this query will produce at least 1 row per client with a date, the problem is that there are some clients that will have between 2 and 10 dates associated with them bloating the return table to about 30,000 row instead of an idealistic 246 rows or less.
When i try doing max(tran_uno) to get the most recent transaction number, i get the same result, some have 1 value and others have multiple values.
The bigger picture has 4 other queries being performed doing other parts, i have only included the parts that pertain to the question.
Edit (2011-10-14 # 1:45PM):
select
ba.payr_client_uno as client_uno
, max(ba.row_uno) as row_uno
into #Bills
from blt_bill_amt ba
inner join hbm_matter m on ba.matter_uno = m.matter_uno
inner join hbm_client c on m.client_uno = c.client_uno
inner join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = #login
and c.status_code = 'C'
and ba.tran_type in ('CR', 'RA')
group by ba.payr_client_uno
order by ba.payr_client_uno
--Obtain list of Transaction Date and Amount for the Transaction
select
b.client_uno
, ba.tran_date
, ba.tc_total_amt
from blt_bill_amt ba
inner join #Bills b on ba.row_uno = b.row_uno
Not quite sure what was going on but seems the Temp Tables were not acting right at all. Ideally i would have 246 rows of data, but with the previous query syntax it would produce from 400-5000 rows of data, obviously duplications on data.
I think you can use ranking to achieve what you want:
WITH ranked AS (
SELECT
client_uno = ba.payr_client_uno,
ba.tran_date,
be.tc_total_amt,
rnk = ROW_NUMBER() OVER (
PARTITION BY ba.payr_client_uno
ORDER BY ba.tran_uno DESC
)
FROM blt_bill_amt ba
INNER JOIN hbm_matter m ON ba.matter_uno = m.matter_uno
INNER JOIN hbm_client c ON m.client_uno = c.client_uno
INNER JOIN hbm_persnl p ON c.resp_empl_uno = p.empl_uno
WHERE p.login = #login
AND c.status_code = 'C'
AND ba.tran_type IN ('CR', 'RA')
)
SELECT
client_uno,
tran_date,
tc_total_amt
FROM ranked
WHERE rnk = 1
ORDER BY client_uno
Useful reading:
Ranking Functions (Transact-SQL)
ROW_NUMBER (Transact-SQL)
WITH common_table_expression (Transact-SQL)
Using Common Table Expressions

Resources