Understand below SQL Query - sql-server

I need your help to understand below query can anyone help me to describe it, i wanted to know the role of b.id is null and r.id is null in below query if anyone can explain whole code then it would be great?
select l.id as start,
(
select min(a.id) as id
from sequence as a
left outer join sequence as b on a.id = b.id - 1
where b.id is null
and a.id >= l.id
) as end
from sequence as l
left outer join sequence as r on r.id = l.id - 1
where r.id is null;

This query returns your "islands"of your sequence, i.e. start and end of continuos id intervals.
You can read more on gaps and islands here: Special Islands and here The SQL of Gaps and Islands in Sequences

The query is finding islands of consecutive numbers, outputting the start and end of ranges where all consecutive numbers are there, so for the set if numbers
{1,3,4,5,6,9,10}
I would expect
1,1
2,6
9,10
to be selected
The outer query starts by finding number (N) that cannot join to a record holding N-1, detected by r.id is null
The sub query then finds the next highest number (M) that does not join to a record holding M+1 (detected using b.id is null)
so in my example 3 does not have a '2' to join to, meaning 3 begins a range. The first number >= to that with no subsequent record is 6, which has no '7' to make a join to

The query is joining two tables (in this case the same table but it does not matter) using an outer join.
if we have
SELECT t1.a, t1.b, t2.d
FROM table1 t1
LEFT OUTER JOIN table2 t2 ON t1.a = t2.a
WHERE t2.a is null
This means it returns a set of records with all those from table1, however its possible there will not be a joined record from table2 for every record in table1. If there isn't the table2 fields are returned as null so the the WHERE clause is effectively saying return me all records from table1 where we do not have a join record in table2.
In your example where it joins onto itself you are looking where there is not a next/previous record (depending upon which way you are looking at it) based upon the id e.g. if id = 5, then there is not a record with id = 4
Overall the sql as a whole looks like its returning the consecutive id ranges in the sequence table.

Related

How to Left Inner Join two queries in Sybase?

I have two queries that should be joined together. Here is my query 1:
SELECT
t1.rec_id,
t1.category,
t1.name,
t1.code,
CASE
WHEN t1.name= 'A' THEN SUM(t1.amount)
WHEN t1.name = 'D' THEN SUM(t1.amount)
WHEN t1.name = 'H' THEN SUM(t1.amount)
WHEN t1.name = 'J' THEN SUM(t1.amount)
END AS Amount
FROM Table1 t1
GROUP BY t1.name, t1.rec_id, t1.category, t1.code
Query 1 produce this set of results:
Rec ID Category Name Code Amount
1 1 A MIX 70927.00
1 3 D MIX 19922.00
1 2 H MIX 55104.00
1 4 J MIX 76938.00
Then I have query 2:
SELECT
CASE
WHEN t2.category_id = 1 THEN SUM(t2.sum)
WHEN t2.category_id = 2 THEN SUM(t2.sum)
WHEN t2.category_id = 3 THEN SUM(t2.sum)
WHEN t2.category_id = 4 THEN SUM(t2.sum)
END AS TotalSum
FROM Table2 t2
INNER JOIN Table1 t1
ON t1.amnt_id = t2.amnt_id
AND t2.unique_id = #unique_id
GROUP BY t2.category_id
The result set of query 2 is this:
TotalSum
186013.00
47875.00
12136.00
974602.00
All I need is this result set that combines query 1 and query 2:
Rec ID Category Name Code Amount TotalSum
1 1 A MIX 70927.00 186013.00
1 3 D MIX 19922.00 47875.00
1 2 H MIX 55104.00 12136.00
1 4 J MIX 76938.00 974602.00
As you can see there is connection between table 1 and table 2. That connection is amnt_id. However, I tried doing LEFT INNER JOIN on query 1 and then simply using same logic with case statement to get the total sum for table 2. Unfortunately Sybase version that I use does not support Left Inner Join. I'm wondering if there is other way to join these two queries? Thank you
I wondered if the CASE statement makes sense in the first query because it sums in every row. Are there other values for the name column except A, D, H, J? If not you can change the CASE statement to SUM(t1.amount) AS Amount. Also the GROUP BY in the first query seems dubious to me: you are grouping by the record id column - that means you are not grouping at all but instead return every row. If that is what you really want you can omit the SUM at all and just return the pure amount column.
As far as I understood your problem and your data structure: the values in Table2 are kind of category sums and the values in Table1 are subsets. You would like to see the category sum for every category in Table1 next to the single amounts?
You would typically use a CTE (common table expression, "WITH clause") but ASE doesn't support CTEs, so we have to work with joins. I recreated your tables in my SQL Anywhere database and put together this example. In a nutshell: both queries are subqueries in an outer query and are left joined on the category id:
SELECT *
FROM
(
SELECT
t1.rec_id,
t1.category,
t1.name,
t1.code,
CASE
WHEN t1.name= 'A' THEN SUM(t1.amount)
WHEN t1.name = 'D' THEN SUM(t1.amount)
WHEN t1.name = 'H' THEN SUM(t1.amount)
WHEN t1.name = 'J' THEN SUM(t1.amount)
END AS Amount
FROM Table1 t1
GROUP BY t1.rec_id, t1.name, t1.category, t1.code
) AS t1
LEFT JOIN
(
SELECT category_id, SUM(sum) FROM
table2
GROUP BY category_id
) AS totals(category_id, total_sum)
ON totals.category_id = t1.category;
This query gives me:
Rec ID Category Name Code Amount Category_id total_sum
2 3 D MIX 19922.00 3 47875.00
3 2 H MIX 55104.00 2 12136.00
1 1 A MIX 70927.00 1 186013.00
4 4 J MIX 76938.00 4 974602.00
You surely have to tweak it a bit including your t2.unique_id column (that I don't understand from your queries) but this is a practical way to work around ASE's missing CTE feature.
BTW: it's either an INNER JOIN (only the corresponding records from both tables) or a LEFT (OUTER) JOIN (all from the left, only the corresponding records from the right table) but a LEFT INNER JOIN makes no sense.

IS NULL being ignored

I am trying to run a query in T-SQL to pull back a data set based on a column being null.
This is a simplified version of the code:
SELECT
T1.Col1, T1.Col2,
T1.Col3, T1.Col4
FROM
table1 AS T1
INNER JOIN
table2 AS T2 ON T1.Col2 = T2.Col3
WHERE
T2.Col4 IS NULL
Problem is, the result includes rows where T2.Col4 are NULL and also not NULL, it's like the WHERE clause doesn't exist.
Any ideas would be greatly
UPDATE - full version of code:
SELECT
M.ref
,C.cname
,CL.clname
,C.ccity
,M.productLine
,M.code
,CL.date
,M.dept
,DPT.group
,TK2.tkname
,TK2.tkdept
FROM DB.dbo.manage AS M
OUTER JOIN DB.dbo.ClientManageRelationship AS CMR
ON CMR.RelatedEntityID = M.EntityID
OUTER JOIN DB.dbo.Client AS C
ON C.EntityID = CMR.EntityID
INNER JOIN DB.dbo.ManageCustomerRelationship AS MCR
ON MCR.EntityID = M.EntityID
INNER JOIN DB.dbo.Customer AS CL
ON CL.EntityID = MCR.RelatedID
INNER JOIN DB.dbo.timek AS TK
ON TK.tki = M.tkid
LEFT JOIN (SELECT Group = division, [Department] = newdesc, deptcode FROM DB.csrt.vw_rep_p_l_dept) AS DPT
ON tkdept = DPT.dept
LEFT JOIN (SELECT Name = TK2.tkfirst + ' ' + TK2.tklast, TK2.tki, TK2.dept, TK2.loc FROM DB.dbo.timek as TK2 WITH(NOLOCK)) AS TK2
ON TK2.tki = M.tkid
WHERE DPT.Department = 'Casualty'
AND UPPER (C.ClientName) LIKE '%LIMITED%'
AND CL.date > '31/12/2014'
AND CL.Date IS NULL
AND TK.tkloc = 'loc1' OR TK.tkloc = 'loc2'
ORDER BY M.ref
My first answer would be because you're using INNER JOIN. This only returns matches between the 2 tables. TRY FULL OUTER JOIN which will return all values regardless of matches and will include NULLS.
If you were looking to return all rows regardless of matches including NULLS from only one of the tables then use RIGHT or LEFT JOIN.
Say i had 2 tables ('Person' and 'Figure'). Not every person may have entered a figure on any one day. But an example may be i want to return all people regardless of whether they entered a figure or not on a certain day.
My initial approach to this would be a LEFT join because i want to return of all the people(left table) regardless of there being any matches in the figure table(right table)
FROM Person P
LEFT JOIN Figure F
ON P.ID = F.ID
This would produce a result such as
Name Figure
Sam 20
Ben 30
Matt NULL
Simon NULL
Whereas,
An inner join would produce only matching values not including nulls
Name Figure
Sam 20
Ben 30
Left join works the same way as right join but in the opposite direction. This is most likely the problem you were facing. But i hope this helped
I think the problem is in the last part of the where condition.
You should use brackets.
`WHERE DPT.Department = 'Casualty'
AND UPPER (C.ClientName) LIKE '%LIMITED%'
AND CL.date > '31/12/2014'
AND CL.Date IS NULL
AND (TK.tkloc = 'loc1' OR TK.tkloc = 'loc2')`
or
`WHERE DPT.Department = 'Casualty'
AND UPPER (C.ClientName) LIKE '%LIMITED%'
AND CL.date > '31/12/2014'
AND CL.Date IS NULL
AND TK.tkloc IN ('loc1', 'loc2')`

RIGHT\LEFT Join does not provide null values without condition

I have two tables one is the lookup table and the other is the data table. The lookup table has columns named cycleid, cycle. The data table has SID, cycleid, cycle. Below is the structure of the tables.
If you check the data table, the SID may have all the cycles and may not have all the cycles. I want to output the SID completed as well as missed cycles.
I right joined the lookup table and retrieved the missing as well as completed cycles. Below is the query I used.
SELECT TOP 1000 [SID]
,s4.[CYCLE]
,s4.[CYCLEID]
FROM [dbo].[data] s3 RIGHT JOIN
[dbo].[lookup_data] s4 ON s3.CYCLEID = s4.CYCLEID
The query is not displaying me the missed values when I query for all the SID's. When I specifically query for a SID with the below query i am getting the correct result including the missed ones.
SELECT TOP 1000 [SID]
,s4.[CYCLE]
,s4.[CYCLEID]
FROM [dbo].[data] s3 RIGHT JOIN [dbo].[lookup_data] s4
ON s3.CYCLEID = s4.CYCLEID
AND s3.SID = 101002
ORDER BY [SID], s4.[CYCLEID]
As I am supplying this query into tableau I cannot provide the sid value in the query. I want to return all the sid's and from tableau I will be do the rest of the things.
The expected output that i need is as shown below.
I wrote a cross join query like below to acheive my expected output
SELECT DISTINCT
tab.CYCLEID
,tab.SID
,d.CYCLE
FROM ( SELECT d.SID
,d.[CYCLE]
,e.CYCLEID
FROM ( SELECT e.sid
,e.CYCLE
FROM [db_temp].[dbo].[Sheet3$] e
) d
CROSS JOIN [db_temp].[dbo].[Sheet4$] e
) tab
LEFT OUTER JOIN [db_temp].[dbo].[Sheet3$] d
ON d.CYCLEID = tab.CYCLEID
AND d.SID = tab.SID
ORDER BY tab.SID
,tab.CYCLEID;
However I am not able to use this query for more scenarios as my data set have nearly 20 to 40 columns and i am having issues when i use the above one.
Is there any way to do this in a simpler manner with only left or right join itself? I want the query to return all the missing values and the completed values for the all the SID's instead of supplying a single sid in the query.
You can create a master table first (combine all SID and CYCLE ID), then right join with the data table
;with ctxMaster as (
select distinct d.SID, l.CYCLE, l.CYCLEID
from lookup_data l
cross join data d
)
select d.SID, m.CYCLE, m.CYCLEID
from ctxMaster m
left join data d on m.SID = d.SID and m.CYCLEID = d.CYCLEID
order by m.SID, m.CYCLEID
Fiddle
Or if you don't want to use common table expression, subquery version:
select d.SID, m.CYCLE, m.CYCLEID
from (select distinct d.SID, l.CYCLE, l.CYCLEID
from lookup_data l
cross join data d) m
left join data d on m.SID = d.SID and m.CYCLEID = d.CYCLEID
order by m.SID, m.CYCLEID

Why do I have duplicate records in my JOIN

I am retrieving data from table ProductionReportMetrics where I have column NetRate_QuoteID. Then to that result set I need to get Description column.
And in order to get a Description column, I need to join 3 tables:
NetRate_Quote_Insur_Quote
NetRate_Quote_Insur_Quote_Locat
NetRate_Quote_Insur_Quote_Locat_Liabi
But after that my premium is completely off.
What am I doing wrong here?
SELECT QLL.Description,
QLL.ClassCode,
prm.NetRate_QuoteID,
QL.LocationID,
ISNULL(SUM(premium),0) AS NetWrittenPremium,
MONTH(prm.EffectiveDate) AS EffMonth
FROM ProductionReportMetrics prm
LEFT JOIN NetRate_Quote_Insur_Quote Q
ON prm.NetRate_QuoteID = Q.QuoteID
INNER JOIN NetRate_Quote_Insur_Quote_Locat QL
ON Q.QuoteID = QL.QuoteID
INNER JOIN NetRate_Quote_Insur_Quote_Locat_Liabi QLL
ON QL.LocationID = QLL.LocationID
WHERE YEAR(prm.EffectiveDate) = 2016 AND
CompanyLine = 'Ironshore Insurance Company'
GROUP BY MONTH(prm.EffectiveDate),
QLL.Description,
QLL.ClassCode,
prm.NetRate_QuoteID,
QL.LocationID
I think the problem in this table:
What Am I missing in this Query?
select
ClassCode,
QLL.Description,
sum(Premium)
from ProductionReportMetrics prm
LEFT JOIN NetRate_Quote_Insur_Quote Q ON prm.NetRate_QuoteID = Q.QuoteID
LEFT JOIN NetRate_Quote_Insur_Quote_Locat QL ON Q.QuoteID = QL.QuoteID
LEFT JOIN
(SELECT * FROM NetRate_Quote_Insur_Quote_Locat_Liabi nqI
JOIN ( SELECT LocationID, MAX(ClassCode)
FROM NetRate_Quote_Insur_Quote_Locat_Liabi GROUP BY LocationID ) nqA
ON nqA.LocationID = nqI.LocationID ) QLL ON QLL.LocationID = QL.LocationID
where Year(prm.EffectiveDate) = 2016 AND CompanyLine = 'Ironshore Insurance Company'
GROUP BY Q.QuoteID,QL.QuoteID,QL.LocationID
Now it says
Msg 8156, Level 16, State 1, Line 14
The column 'LocationID' was specified multiple times for 'QLL'.
It looks like DVT basically hit on the answer. The only reason you would get different amounts(i.e. duplicated rows) as a result of a join is that one of the joined tables is not a 1:1 relationship with the primary table.
I would suggest you do a quick check against those tables, looking for table counts.
--this should be your baseline count
SELECT COUNT(*)
FROM ProductionReportMetrics
GROUP BY MONTH(prm.EffectiveDate),
prm.NetRate_QuoteID
--this will be a check against the first joined table.
SELECT COUNT(*)
FROM NetRate_Quote_Insur_Quote Q
WHERE QuoteID IN
(SELECT NetRate_QuoteID
FROM ProductionReportMetrics
GROUP BY MONTH(prm.EffectiveDate),
prm.NetRate_QuoteID)
Basically you will want to do a similar check against each of your joined tables. If any of the joined tables are part of the grouping statement, make sure they are also in the grouping of the count check statement. Also make sure to alter the WHERE clause of the check count statement to use the join clause columns you were using.
Once you find a table that returns the incorrect number of rows, you will have your answer as to what table is causing the problem. Then you will just have to decide how to limit that table down to distinct rows(some type of aggregation).
This advice is really just to show you how to QA this particular query. Break it up into the smallest possible parts. In this case, we know that it is a join that is causing the problem, so take it one join at a time until you find the offender.

How to join one select with another when the first one not always returns a value for specific row?

I have a complex query to retrieve some results:
EDITED QUERY (added the UNION ALL):
SELECT t.*
FROM (
SELECT
dbo.Intervencao.INT_Processo, analista,
ETS.ETS_Sigla, ATC.ATC_Sigla, PAT.PAT_Sigla, dbo.Assunto.SNT_Peso,
CASE
WHEN ETS.ETS_Sigla = 'PE' AND (PAT.PAT_Sigla = 'LIB' OR PAT.PAT_Sigla = 'LBR') THEN (0.3*SNT_Peso)
WHEN ETS.ETS_Sigla = 'CD' THEN (0.3*SNT_Peso)*0.3
ELSE SNT_Peso
END AS PESOAREA,
CASE
WHEN a.max_TEA_FimTarefa IS NULL THEN a.max_TEA_InicioTarefa
ELSE a.max_TEA_FimTarefa
END AS DATA_INICIO_TERMINO,
ROW_NUMBER() OVER (PARTITION BY ATC.ATC_Sigla, a.SRV_Id ORDER BY TEA_FimTarefa DESC) AS seqnum
FROM dbo.Tarefa AS t
INNER JOIN (
SELECT
MAX(dbo.TarefaEtapaAreaTecnica.TEA_InicioTarefa) AS max_TEA_InicioTarefa,
MAX (dbo.TarefaEtapaAreaTecnica.TEA_FimTarefa) AS max_TEA_FimTarefa,
dbo.Pessoa.PFJ_Descri AS analista, dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id
FROM dbo.TarefaEtapaAreaTecnica
LEFT JOIN dbo.Tarefa ON dbo.TarefaEtapaAreaTecnica.TRF_Id = dbo.Tarefa.TRF_Id
LEFT JOIN dbo.AreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.AreaTecnica.ATC_Id
LEFT JOIN dbo.ServicoAreaTecnica ON dbo.TarefaEtapaAreaTecnica.ATC_Id = dbo.ServicoAreaTecnica.ATC_Id
AND dbo.Tarefa.SRV_Id = dbo.ServicoAreaTecnica.SRV_Id
INNER JOIN dbo.Pessoa ON dbo.Pessoa.PFJ_Id = dbo.ServicoAreaTecnica.PFJ_Id_Analista
GROUP BY dbo.AreaTecnica.ATC_Id, dbo.Tarefa.SRV_Id, dbo.Pessoa.PFJ_Descri
) AS a ON t.SRV_Id = a.SRV_Id
INNER JOIN dbo.TarefaEtapaAreaTecnica AS TarefaEtapaAreaTecnica_1 ON
t.TRF_Id = TarefaEtapaAreaTecnica_1.TRF_Id
AND a.ATC_Id = TarefaEtapaAreaTecnica_1.ATC_Id
AND a.max_TEA_InicioTarefa = TarefaEtapaAreaTecnica_1.TEA_InicioTarefa
LEFT JOIN AreaTecnica ATC ON TarefaEtapaAreaTecnica_1.ATC_Id = ATC.ATC_Id
LEFT JOIN Etapa ETS ON TarefaEtapaAreaTecnica_1.ETS_Id = ETS.ETS_Id
LEFT JOIN ParecerTipo PAT ON TarefaEtapaAreaTecnica_1.PAT_Id = PAT.PAT_Id
LEFT OUTER JOIN dbo.Servico ON a.SRV_Id = dbo.Servico.SRV_Id
INNER JOIN dbo.Intervencao ON dbo.Servico.INT_Id = dbo.Intervencao.INT_Id
LEFT JOIN dbo.Assunto ON dbo.Servico.SNT_Id = dbo.Assunto.SNT_Id
) t
The result is following:
It works good, the problem is that I was asked that if when a row is not present on this query, it must contain values from another table (ServicoAreaTecnica), so I got this query for the other table based on crucial information of the first query. So if I UNION ALL I get this:
Query1 +
UNION ALL
SELECT INN.INT_Processo,
PES.PFJ_Descri,
NULL, --ETS.ETS_Sigla,
ART.ATC_Sigla,
NULL ,--PAT.PAT_Sigla,
ASS.SNT_Peso,
NULL, --PESOAREA
NULL, --DATA_INICIO_TERMINO
NULL --seqnum
FROM dbo.ServicoAreaTecnica AS SAT
INNER JOIN dbo.AreaTecnica AS ART ON ART.ATC_Id = SAT.ATC_Id
INNER JOIN dbo.Servico AS SER ON SER.SRV_Id = SAT.SRV_Id
INNER JOIN dbo.Assunto AS ASS ON ASS.SNT_Id = SER.SNT_Id
INNER JOIN dbo.Intervencao AS INN ON INN.INT_Id = SER.INT_Id
INNER JOIN dbo.Pessoa AS PES ON PES.PFJ_Id = SAT.PFJ_Id_Analista
The result is following:
So what I want to do is to remove row number 1 because row number 2 exists on the first query, I think I got it explained better this time. The result should be only row number 1, row number 2 would appear only if query 1 doesn't retrieve a row for that particular INN.INT_Processo.
Thanks!
Ok, there are two ways to reduce your record set. Given that you've already written the code to produce the table with the extra rows, it might be easiest to just add code to reduce that:
Select * from
(Select *
, Row_Number() over
(partition by IntProcesso, Analista order by ISNULL(seqnum, 0) desc) as RN
from MyResults) a
where RN = 1
This will assign row_number 1 to any rows that came from your first query, or to any rows from the second query that do not have matches in the first query, then filter out extra rows.
You could also use outer joins with isnull or coalesce, as others have suggested. Something like this:
Select ISNULL(a.IntProcesso, b.IntProcesso) as IntProcesso
, ISNULL(a.Analista, b.Analista) as Analista
, ISNULL(a.ETSsigla, b.ETSsigla) as ETSsigla
[repeat for the rest of your columns]
from Table1 a
full outer join Table2 b
on a.IntProcesso = b.IntProcesso and a.Analista = b.Analista
Your code is hard to read, because of the lengthy names of everything (and to be honest, the fact that they're in a language I don't speak also makes it a lot harder).
But how about: replacing your INNER JOINs with LEFT JOINs, adding more LEFT JOINs to draw in the alternative tables, and introducing ISNULL clauses for each variable you want in the results?
If you do something like ... Query1 Right Join Query2 On ... that should get only the rows in Query2 that don't appear in Query 1.

Resources