SQL Server joining multiple CTE - sql-server

I have 4 common table expressions (CTEs), each with 2 columns (RowNumber, AccountNumber), but each CTE returns a different number of records depending on the query parameters. The goal is to keep all non-null account numbers at the top of each column after joining.
I am joining the 4 CTEs using FULL JOIN on RowNumber. The problem is that the sequence of AccountNumber values is not continuous: in some cases NULL values appear between account numbers. I want all non-null values grouped together at the top, with the NULLs below them. The number of account numbers in each CTE is always different.
SELECT
ISNULL(Cte_FirstYear.AccountNumber,'') as FirstYear,
ISNULL(Cte_SecondYear.AccountNumber,'') as SecondYear,
ISNULL(cte_ThirdYear.AccountNumber,'') as ThirdYear,
ISNULL(cte_FourthYear.AccountNumber,'') as FourthYear
FROM cte_ThirdYear
FULL OUTER JOIN cte_FirstYear
ON cte_ThirdYear.RowNumber = cte_FirstYear.RowNumber
FULL OUTER JOIN Cte_SecondYear
ON cte_ThirdYear.RowNumber = Cte_SecondYear.RowNumber
FULL OUTER JOIN cte_FourthYear
ON cte_ThirdYear.RowNumber = cte_FourthYear.RowNumber
Here is the output I am getting:
FirstYear SecondYear ThirdYear FourthYear
1 2 3 4
5 6 7 1
9 NULL NULL
NULL
9 9
10 NULL
Here is the expected output:
FirstYear SecondYear ThirdYear FourthYear
1 2 3 4
5 6 7 1
9 9 9
10

Based on the explanation by @Donnie in the link:
A cross join produces a cartesian product between the two tables, returning all possible combinations of all rows. It has no on clause because you're just joining everything to everything.
A full outer join is a combination of a left outer and right outer join. It returns all rows in both tables that match the query's where clause, and in cases where the on condition can't be satisfied for those rows it puts null values in for the unpopulated fields.
You can add these conditions to the join to exclude null row numbers:
on cte_ThirdYear.RowNumber = cte_FirstYear.RowNumber and cte_ThirdYear.RowNumber is not null and cte_FirstYear.RowNumber is not null
See this PDF: http://stevestedman.com/wp-content/uploads/TSqlJoinTypePoster1.pdf

I created another CTE that takes the largest row count among the 4 CTEs and generates RowNumber values 1 to N (the maximum number of records in any of the 4 CTEs), then joined all 4 CTEs to it using LEFT JOIN.
Here is how I modified the query to achieve the result:
Cte_Max(RowNumber) AS (
SELECT TOP
(
select max(c) from
(
select count(*) c from cte_FirstYear
UNION
select count(*) c from Cte_SecondYear
UNION
select count(*) c from cte_ThirdYear
UNION
select count(*) c from cte_FourthYear
) x
)
ROW_NUMBER() OVER (ORDER BY c1.id asc) as RowNumber
FROM syscolumns AS c1
CROSS JOIN syscolumns AS c2
)
select
ISNULL(cte_FirstYear.AccountNumber,'') as FirstYear,
ISNULL(Cte_SecondYear.AccountNumber,'') as SecondYear,
ISNULL(cte_ThirdYear.AccountNumber,'') as ThirdYear,
ISNULL(cte_FourthYear.AccountNumber,'') as FourthYear
from Cte_Max
LEFT join cte_FirstYear
on
Cte_Max.RowNumber=cte_FirstYear.RowNumber
LEFT join
Cte_SecondYear
on
Cte_Max.RowNumber=Cte_SecondYear.RowNumber
LEFT join
cte_ThirdYear
on
Cte_Max.RowNumber =cte_ThirdYear.RowNumber
LEFT join
cte_FourthYear
on
Cte_Max.RowNumber =cte_FourthYear.RowNumber
ORDER BY Cte_Max.RowNumber
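The original FULL JOIN version leaves gaps because every ON clause compares against cte_ThirdYear.RowNumber, which is NULL on any row where the third CTE has no match, so rows that exist only in the other CTEs can never line up with each other. Joining everything to a complete list of row numbers avoids that. Below is a minimal sketch of the same idea, runnable with Python's built-in sqlite3 module rather than SQL Server. The table names first_year, second_year, third_year and the CTE names counts and spine are hypothetical stand-ins, COALESCE replaces ISNULL, and a recursive CTE replaces the syscolumns cross join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the question's CTEs, with differing row counts.
conn.executescript("""
    CREATE TABLE first_year  (RowNumber INTEGER, AccountNumber TEXT);
    CREATE TABLE second_year (RowNumber INTEGER, AccountNumber TEXT);
    CREATE TABLE third_year  (RowNumber INTEGER, AccountNumber TEXT);
    INSERT INTO first_year  VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
    INSERT INTO second_year VALUES (1, 'B1');
    INSERT INTO third_year  VALUES (1, 'C1'), (2, 'C2');
""")

query = """
WITH RECURSIVE
counts(c) AS (
    SELECT COUNT(*) FROM first_year
    UNION ALL SELECT COUNT(*) FROM second_year
    UNION ALL SELECT COUNT(*) FROM third_year
),
-- spine holds 1..N, where N is the largest row count among the sources
spine(RowNumber) AS (
    SELECT 1 WHERE (SELECT MAX(c) FROM counts) >= 1
    UNION ALL
    SELECT RowNumber + 1 FROM spine
    WHERE RowNumber < (SELECT MAX(c) FROM counts)
)
SELECT COALESCE(f.AccountNumber, ''),
       COALESCE(s.AccountNumber, ''),
       COALESCE(t.AccountNumber, '')
FROM spine
LEFT JOIN first_year  f ON f.RowNumber = spine.RowNumber
LEFT JOIN second_year s ON s.RowNumber = spine.RowNumber
LEFT JOIN third_year  t ON t.RowNumber = spine.RowNumber
ORDER BY spine.RowNumber
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because every source joins to the spine and not to one of its siblings, each column fills from the top and the blanks collect at the bottom.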

Related

Getting non-deterministic results from WITH RECURSIVE cte

I'm trying to create a recursive CTE that traverses all the records for a given ID, and does some operations between ordered records. Let's say I have customers at a bank who get charged a uniquely identifiable fee, and a customer can pay that fee in any number of installments:
WITH recursive payments (
id
, index
, fees_paid
, fees_owed
)
AS (
SELECT id
, index
, fees_paid
, fee_charged
FROM table
WHERE index = 1
UNION ALL
SELECT t.id
, t.index
, t.fees_paid
, p.fees_owed - p.fees_paid
FROM table t
JOIN payments p
ON t.id = p.id
AND t.index = p.index + 1
)
SELECT *
FROM payments
ORDER BY 1,2;
The join logic seems sound, but when I join the output of this query to the source table, I'm getting non-deterministic and incorrect results.
This is my first foray into Snowflake's recursive CTEs. What am I missing in the intermediate result logic that is leading to the non-determinism here?
I assume this is edited code, because in the anchor of your CTE you select a fourth column, fee_charged, which does not exist, and in the recursion you don't sum the fees paid. Basically, your logic seems rather strange.
So, creating some sample data that has two different id streams to recurse over:
create or replace table data (id number, index number, val text);
insert into data
select * from values (1,1,'a'),(2,1,'b')
,(1,2,'c'), (2,2,'d')
,(1,3,'e'), (2,3,'f')
v(id, index, val);
Now, altering your CTE just a little to concatenate the strings together:
WITH RECURSIVE payments AS
(
SELECT id
, index
, val
FROM data
WHERE index = 1
UNION ALL
SELECT t.id
, t.index
, p.val || t.val as val
FROM data t
JOIN payments p
ON t.id = p.id
AND t.index = p.index + 1
)
SELECT *
FROM payments
ORDER BY 1,2;
we get:
ID INDEX VAL
1 1 a
1 2 ac
1 3 ace
2 1 b
2 2 bd
2 3 bdf
Which is exactly what I would expect. So as for how this relates to your "it gets strange when I join to other stuff": either the output of your CTE is not what you expect it to be, or your join to the other table is not working as you expect, or there is a bug in Snowflake.
It all comes down to this: if the CTE results are exactly what you expect, write them to a table and join that table to your other table. That rules out any CTE-versus-JOIN bug and lets you debug why your join is not working.
But if your CTE output is not what you expect, then let's help debug that.
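The debugging approach above can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module, not Snowflake: the column is renamed idx because INDEX is a keyword in SQLite, and the table name payments_snapshot is a made-up name for the materialized result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER, idx INTEGER, val TEXT);
    INSERT INTO data VALUES
        (1, 1, 'a'), (2, 1, 'b'),
        (1, 2, 'c'), (2, 2, 'd'),
        (1, 3, 'e'), (2, 3, 'f');
""")

cte_query = """
WITH RECURSIVE payments(id, idx, val) AS (
    SELECT id, idx, val FROM data WHERE idx = 1      -- anchor: first row per id
    UNION ALL
    SELECT t.id, t.idx, p.val || t.val               -- carry the running value forward
    FROM data t
    JOIN payments p ON t.id = p.id AND t.idx = p.idx + 1
)
SELECT id, idx, val FROM payments ORDER BY id, idx
"""
rows = conn.execute(cte_query).fetchall()

# Step 1: materialize the CTE output so it can be inspected on its own.
conn.execute("CREATE TABLE payments_snapshot AS " + cte_query)

# Step 2: join the snapshot to the source table and check the row count.
joined = conn.execute("""
    SELECT d.id, d.idx, d.val, s.val
    FROM data d
    JOIN payments_snapshot s ON s.id = d.id AND s.idx = d.idx
    ORDER BY d.id, d.idx
""").fetchall()
print(rows)
print(len(joined))
```

If the join against the snapshot returns more rows than the source table, the join keys are not unique, which is one common reason a join appears to produce unstable results.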

Selecting Max with Lots of Other Items

Sorry for the poor title. I wasn't sure how to describe my problem. I've written a query that returns about 23,000 records. A lot of those records have similar information and I want to only select the records with the maximum of the field dbo.tblMsgsOnAir_Type8.fldBuddyLinkSigStrength. I've tried grouping by all of the other columns being selected, but it doesn't appear to work correctly. I don't fully understand SQL, especially the max and group functions. I can do simple max functions when I only want or need to select one thing. I don't understand how it works when I want to select a bunch of other data. Below is the query.
SELECT
dbo.tblmeterinfo.fldMeterSerialNumber AS "MOP_FNP_Meter",
dbo.tblMsgsOnAir_Type8.fldRBuddyId AS "MOP_FNP_FNID",
dbo.TBLMETERMAINT.fldmeterid AS "Meter_ID_Helped",
dbo.tblMsgsOnAir_Type8.fldCBuddyId AS "FNID_Helped",
dbo.fn_dt(dbo.tblMsgsOnAir_Type8.fldRBuddyToi) AS "TOI",
dbo.tblMsgsOnAir_Type8.fldBuddyLinkSigStrength AS "Sig_Str",
dbo.TBLSAWN_CIS_INFO.SML AS "Buddy_SML",
dbo.TBLMETERLIST.fldaddress AS "Buddy_Address",
dbo.TBLSAWNGISCOORD.X_COORD AS "X_Coord",
dbo.TBLSAWNGISCOORD.Y_COORD AS "Y_Coord"
FROM dbo.tblMsgsOnAir_Type8
LEFT OUTER JOIN dbo.TBLMETERLIST
ON (dbo.TBLMETERLIST.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyId)
LEFT OUTER JOIN dbo.TBLMETERMAINT
ON (dbo.TBLMETERMAINT.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyID)
LEFT OUTER JOIN dbo.TBLSAWN_CIS_INFO
ON (dbo.TBLSAWN_CIS_INFO.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyId)
LEFT OUTER JOIN dbo.TBLSAWNGISCOORD
ON (dbo.TBLSAWNGISCOORD.SRV_MAP_LOC = dbo.TBLSAWN_CIS_INFO.SML)
LEFT OUTER JOIN dbo.tblmeterinfo
ON (dbo.tblmeterinfo.fldRepId = dbo.tblMsgsOnAir_Type8.fldRBuddyId)
WHERE dbo.tblMsgsOnAir_Type8.fldRBuddyId IN (SELECT
dbo.tblSAWN_FNPmap.Repid
FROM dbo.tblSAWN_FNPmap)
AND dbo.TBLMETERMAINT.fldmeterid IS NOT NULL
The query below is simple and does what I want, but doesn't get all of the other fields. This query returns only 617 records. I would like the above query to return 617 records too, but include all of the other information I've selected.
SELECT
dbo.TBLMETERMAINT.fldmeterid AS "Meter_ID_Helped",
MAX(dbo.tblMsgsOnAir_Type8.fldBuddyLinkSigStrength) AS "Max_Sig"
FROM dbo.tblMsgsOnAir_Type8
LEFT OUTER JOIN dbo.TBLMETERMAINT
ON (dbo.TBLMETERMAINT.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyID)
WHERE dbo.tblMsgsOnAir_Type8.fldRBuddyId IN (SELECT
dbo.tblSAWN_FNPmap.Repid
FROM dbo.tblSAWN_FNPmap)
AND dbo.TBLMETERMAINT.fldmeterid IS NOT NULL
GROUP BY dbo.TBLMETERMAINT.fldmeterid
row_number() will probably come to the rescue. You can use it to find the best record within each group of a set. Because a window function cannot appear directly in a WHERE clause, compute it in a derived table and filter outside it. Something like
select *
from (
select t.*, row_number() over (partition by id order by fldBuddyLinkSigStrength desc) as rn
from .... t
) ranked
where rn = 1
SQL Server assigns a row number within each group. Records are partitioned by id, in this case, and given 1 if they have the best strength, 2 if they have the next best, and so on.
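A small runnable sketch of the row-number approach, using Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The readings table and its columns meter_id, helper, and sig_str are hypothetical, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: several signal readings per meter.
conn.executescript("""
    CREATE TABLE readings (meter_id INTEGER, helper TEXT, sig_str INTEGER);
    INSERT INTO readings VALUES
        (1, 'x', 10), (1, 'y', 7), (1, 'z', 9),
        (2, 'x', 6),  (2, 'y', 8);
""")

best = conn.execute("""
    SELECT meter_id, helper, sig_str
    FROM (
        SELECT r.*,
               ROW_NUMBER() OVER (
                   PARTITION BY meter_id
                   ORDER BY sig_str DESC      -- strongest reading gets rn = 1
               ) AS rn
        FROM readings r
    ) ranked
    WHERE rn = 1
    ORDER BY meter_id
""").fetchall()
print(best)
```

Unlike GROUP BY with MAX, this keeps every other column of the winning row. If two rows tie for the maximum, ROW_NUMBER picks one of them arbitrarily unless the ORDER BY includes a tie-breaker.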
If you are getting duplicates have you tried using SELECT DISTINCT?
Basically how Max works is that it will select the highest value in the group.
So if you have a table:
ID | VALUE
1 | 10
1 | 7
1 | 9
2 | 6
2 | 8
And do
SELECT ID, MAX(VALUE)
FROM TABLE
GROUP BY ID
You'll get the max value per ID
ID | VALUE
1 | 10
2 | 8
If you want to get the max without grouping the outer result, you can do the grouping in a subselect and join back to it:
SELECT t.ID, t.VALUE, m.MAX_VALUE -- plus any other columns you need
FROM TABLE t
JOIN (SELECT ID, MAX(VALUE) AS MAX_VALUE FROM TABLE GROUP BY ID) AS m ON m.ID = t.ID
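Here is a hedged, runnable sketch of the subselect approach using the sample ID/VALUE data from this answer, run through Python's built-in sqlite3 module. The table name items and the column name val are placeholders chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, val INTEGER);
    INSERT INTO items VALUES (1, 10), (1, 7), (1, 9), (2, 6), (2, 8);
""")

# Every row, annotated with the maximum of its group.
annotated = conn.execute("""
    SELECT i.id, i.val, m.max_val
    FROM items i
    JOIN (SELECT id, MAX(val) AS max_val FROM items GROUP BY id) m
      ON m.id = i.id
    ORDER BY i.id, i.val DESC
""").fetchall()

# Only the rows that actually hold the maximum of their group.
top_only = conn.execute("""
    SELECT i.id, i.val
    FROM items i
    JOIN (SELECT id, MAX(val) AS max_val FROM items GROUP BY id) m
      ON m.id = i.id AND m.max_val = i.val
    ORDER BY i.id
""").fetchall()
print(annotated)
print(top_only)
```

Note that the second form returns more than one row per group when several rows share the maximum value.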
Without knowing your table structures in more detail I can't be sure this is the best way, but here's something that should work. Use the 2nd query as the left side of a left join, to pick up the extra columns:
select a.*
from (<your 2nd query>) a
left join dbo.TBLMETERLIST
on (a.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyId)
left join <next table> ...
and so on. You'll also have to left join on dbo.tblMsgsOnAir_Type8 in order to pick up the columns in that table, so that's one additional left join beyond what your first query does. By the way, it's a good idea to post code here laid out so it's readable; it makes it a lot easier for others to understand.

SQL Server 2008 Is it Possible to Have Select Top Return Nulls

(Select top 1 pvd.Code from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder) as "DX1",
(Select top 1 a.code from (Select top 2 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX2",
(Select top 1 a.code from (Select top 3 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX3",
(Select top 1 a.code from (Select top 4 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX4",
(Select top 1 a.code from (Select top 5 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX5"
The above code is what I am using currently (It is not optimal but is only being used once for a one time Data Export).
In the database that we are currently exporting from, there is a table PatientVisitDiags that has columns "ListOrder" and "Code". There can be between 1 and 5 codes. The ListOrder holds the number of that code. For example:
ListOrder|Code |
1 |M51.27 |
2 |M54.17 |
3 |G83.4 |
I am trying to export each Code to its corresponding column in the new table (DX1, DX2, etc.). If I sort by ListOrder I can get them in the order I need (row 1 to DX1, row 2 to DX2, etc.). However, when I run the above SQL code and the source table only has 3 codes, DX4 and DX5 repeat DX3. For example:
DX1 |DX2 |DX3 |DX4 |DX5
M51.27 |M54.17 |G83.4 |G83.4 |G83.4
Is there a way to have TOP return NULL values if you select TOP more rows than exist? SQL Server 2008 does not support OFFSET/FETCH, which is what I would normally have used to select individual rows.
TL;DR
ID | Name
1 | Joe
2 | Eric
3 | Steve
4 | John
If I have a table like the one above and run
SELECT TOP 5 Name FROM Table
is there any way to return the following?
Joe
Eric
Steve
John
NULL
What you're really doing is pivoting. So pivot! Try this little query:
WITH Top5 AS (
SELECT TOP 5
Dx = 'DX' + Convert(varchar(11), Row_Number() OVER (ORDER BY pvd.Listorder)),
pvd.Code
FROM dbo.PatientVisitDiags pvd
WHERE pvd.PatientVisitId = @patientVisitId
)
SELECT *
FROM
Top5 t
PIVOT (Max(Code) FOR Dx IN (DX1, DX2, DX3, DX4, DX5)) p
;
To answer your second question about getting an unpivoted rowset, basically do the same thing but provide the 5 rows somehow and left join to the desired data.
WITH Data AS (
SELECT TOP 5
Seq = Row_Number() OVER(ORDER BY ID),
Name
FROM dbo.Table
ORDER BY ID
)
SELECT
n.Seq,
t.Name
FROM
(VALUES
(1), (2), (3), (4), (5) -- or a numbers-generating CTE perhaps
) n (Seq)
LEFT JOIN Data t
ON n.Seq = t.Seq
;
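Both ideas can be tried out with Python's built-in sqlite3 module, as a rough sketch only. SQLite has no PIVOT operator, so the pivot is emulated with MAX(CASE ...), and the diags table with its visit_id, list_order, and code columns is a simplified, hypothetical version of PatientVisitDiags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE diags (visit_id INTEGER, list_order INTEGER, code TEXT);
    INSERT INTO diags VALUES (7, 1, 'M51.27'), (7, 2, 'M54.17'), (7, 3, 'G83.4');
""")

# Shared CTE: number the codes of one visit in ListOrder sequence.
ranked_cte = """
WITH ranked AS (
    SELECT code, ROW_NUMBER() OVER (ORDER BY list_order) AS seq
    FROM diags
    WHERE visit_id = ?
)
"""

# Unpivoted: five fixed slots, left-joined so missing codes come back as NULL.
padded = conn.execute(ranked_cte + """
, slots(seq) AS (VALUES (1), (2), (3), (4), (5))
SELECT s.seq, r.code
FROM slots s
LEFT JOIN ranked r ON r.seq = s.seq
ORDER BY s.seq
""", (7,)).fetchall()

# Pivoted: one row with DX1..DX5; absent positions stay NULL and do not repeat.
dx_row = conn.execute(ranked_cte + """
SELECT MAX(CASE WHEN seq = 1 THEN code END),
       MAX(CASE WHEN seq = 2 THEN code END),
       MAX(CASE WHEN seq = 3 THEN code END),
       MAX(CASE WHEN seq = 4 THEN code END),
       MAX(CASE WHEN seq = 5 THEN code END)
FROM ranked
""", (7,)).fetchone()
print(padded)
print(dx_row)
```

The key point in both queries is that the list of positions is fixed in advance, so a missing code shows up as NULL and the last available code is not reused.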
Side note
The fact that you're doing this:
where pvd.PatientVisitId = pv.PatientVisitId
tells me you're not using ANSI joins. Stop. Don't do that any more. Put this join condition in the ON clause of a JOIN. It's the year 2016... why are you using join syntax from the last century?
Oh, and prefix the schema on the table names. Look it up--you'll find actual performance reasons why you should do that. It's not just about the time taken to find the correct schema, but also about the execution plan cache...
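Taking the questions one at a time, here is an answer to the last one.
Create a table holding a number of NULL rows and append it to your data. Use UNION ALL so the NULL rows are not collapsed into one, and sort the NULLs last, since SQL Server otherwise sorts them first:
select top (5) col
from
(
select col from table1
union all
select nulCol from nullTable
) tt
order by case when tt.col is null then 1 else 0 end, tt.col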

Left outer join funny results [duplicate]

This question already has answers here:
Why and when a LEFT JOIN with condition in WHERE clause is not equivalent to the same LEFT JOIN in ON? [duplicate]
(5 answers)
Closed 9 years ago.
Please take a look at the 2 queries below, which use a left outer join, and tell me why the results differ.
Query 1 returns 1489 rows:
SELECT distinct a.GMS_MATERIALNUMBER,a.MATERIAL_DESCRIPTION, b.LDMC
FROM [AP_GDC2_PREPARATION_TEST].[dbo].[GDM_AUTOPULL] a
left outer join [AP_GDC2_STAGING_TEST].[dbo].[CFS_DIS_LDMC] b on
a.GMS_MATERIALNUMBER = b. GMS_MATERIALNUMBER and b.SAP_COMPANY_CODE= '1715'
and a.CFS_ORGANIZATION_CODE like 'rd_kr'
Query 2 returns only 295 rows, the same number of rows I get from a simple select * from a where CFS_ORGANIZATION_CODE like 'rd_kr'
SELECT distinct a.GMS_MATERIALNUMBER,a.MATERIAL_DESCRIPTION, b.LDMC
FROM [AP_GDC2_PREPARATION_TEST].[dbo].[GDM_AUTOPULL] a
left outer join [AP_GDC2_STAGING_TEST].[dbo].[CFS_DIS_LDMC] b on
a.GMS_MATERIALNUMBER = b. GMS_MATERIALNUMBER and b.SAP_COMPANY_CODE= '1715'
where a.CFS_ORGANIZATION_CODE like 'rd_kr'
Basically, query 2 gives the result I wanted, but my question is why query 1 does not work. How exactly does SQL Server handle the ON clause of a left outer join in the background?
Cheers
The two queries are genuinely different.
In the first query, the extra conditions are part of the join, so they only decide which rows of the second table match; every row of the first table is still returned.
In the second query, the WHERE clause filters the complete result after the tables have been joined.
Here's an example
Table1
ID Name
1 Stack
2 Over
3 Flow
Table2
T1_ID Score
1 10
2 20
3 30
In your first query, it looks like this,
SELECT a.*, b.Score
FROM Table1 a
LEFT JOIN Table2 b
ON a.ID = b.T1_ID AND
b.Score >= 20
Before the tables are joined, the records of Table2 are first filtered by score. So the only records that will be joined to Table1 are
T1_ID Score
2 20
3 30
because the Score for T1_ID 1 is only 10. The result of the query is
ID Name Score
1 Stack NULL
2 Over 20
3 Flow 30
While the second query is different.
SELECT a.*, b.Score
FROM Table1 a
LEFT JOIN Table2 b
ON a.ID = b.T1_ID
WHERE b.Score >= 20
It joins the records first whether it has a matching record on the other table or not. So the result will be
ID Name Score
1 Stack 10
2 Over 20
3 Flow 30
and then the filter b.Score >= 20 is applied. So the final result will be
ID Name Score
2 Over 20
3 Flow 30
The difference arises because you used a LEFT JOIN.
In the first query you get all rows from your first table plus whatever matches from your second table.
In the second query you JOIN first, and then the WHERE clause reduces the result.
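The Table1/Table2 example from the first answer can be run as-is to see the difference. This sketch uses Python's built-in sqlite3 module, which treats LEFT JOIN the same way for this purpose. The third query is an extra illustration, not from the answers: it puts a condition on the left table inside ON, which is the shape of the asker's query 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, Name TEXT);
    CREATE TABLE Table2 (T1_ID INTEGER, Score INTEGER);
    INSERT INTO Table1 VALUES (1, 'Stack'), (2, 'Over'), (3, 'Flow');
    INSERT INTO Table2 VALUES (1, 10), (2, 20), (3, 30);
""")

# Condition in ON: limits which Table2 rows match; all Table1 rows survive.
filter_in_on = conn.execute("""
    SELECT a.ID, a.Name, b.Score
    FROM Table1 a
    LEFT JOIN Table2 b ON a.ID = b.T1_ID AND b.Score >= 20
    ORDER BY a.ID
""").fetchall()

# Condition in WHERE: applied after the join, so whole rows are removed.
filter_in_where = conn.execute("""
    SELECT a.ID, a.Name, b.Score
    FROM Table1 a
    LEFT JOIN Table2 b ON a.ID = b.T1_ID
    WHERE b.Score >= 20
    ORDER BY a.ID
""").fetchall()

# Condition on the LEFT table inside ON: it does not remove any left rows,
# it only stops the non-qualifying ones from finding a match.
left_filter_in_on = conn.execute("""
    SELECT a.ID, a.Name, b.Score
    FROM Table1 a
    LEFT JOIN Table2 b ON a.ID = b.T1_ID AND a.Name = 'Over'
    ORDER BY a.ID
""").fetchall()
print(filter_in_on, filter_in_where, left_filter_in_on)
```

This is why query 1 in the question still returns rows for every organization code: a predicate on the left table belongs in WHERE if it is meant to restrict the left table.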

SQL Server - ISNULL not working on Update Query

Rather than have a NULL in a column, I want a 0 to be present.
Given the following two tables:
TABLE1
ClientID OrderCount
1 NULL
2 NULL
3 NULL
4 NULL
Table2
ClientID OrderCount
1 2
3 4
4 6
NOTE: The OrderCount column in both tables is INT datatype.
UPDATE TABLE1
SET OrderCount = ISNULL(TABLE2.OrderCount,0)
FROM TABLE1
INNER JOIN TABLE2 ON TABLE2.ClientID = TABLE1.CLIENTID
When I look at table1, I see this:
ClientID OrderCount
1 2
2 NULL
3 4
4 6
So, I thought to myself - "Obviously, I should be using NULLIF and not ISNULL", so I reversed them. Same result.
What am I doing wrong here? How do I get a 0 rather than a NULL in the column?
You need a LEFT JOIN rather than an INNER JOIN. The records that don't have a matching ClientID are not even being touched by your query.
You are using INNER JOIN, but there is no ClientID 2 in Table2, so your result set won't include a row for client 2. Replace it with LEFT JOIN.
Your join is probably filtering out rows.
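To see why client 2 is never touched, and what an update that reaches every row looks like, here is a minimal sketch with Python's built-in sqlite3 module. It is an approximation of the T-SQL: SQLite has no ISNULL, so COALESCE is used, and a correlated subquery stands in for UPDATE ... FROM ... LEFT JOIN because older SQLite versions lack UPDATE ... FROM. For this data it produces the same result as the LEFT JOIN form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ClientID INTEGER, OrderCount INTEGER);
    CREATE TABLE table2 (ClientID INTEGER, OrderCount INTEGER);
    INSERT INTO table1 VALUES (1, NULL), (2, NULL), (3, NULL), (4, NULL);
    INSERT INTO table2 VALUES (1, 2), (3, 4), (4, 6);
""")

# The INNER JOIN only ever produces rows for clients present in both tables.
inner_ids = [row[0] for row in conn.execute("""
    SELECT t1.ClientID
    FROM table1 t1
    INNER JOIN table2 t2 ON t2.ClientID = t1.ClientID
    ORDER BY t1.ClientID
""")]

# Visit every row of table1; a missing match yields NULL, which COALESCE turns into 0.
conn.execute("""
    UPDATE table1
    SET OrderCount = COALESCE(
        (SELECT t2.OrderCount FROM table2 t2 WHERE t2.ClientID = table1.ClientID),
        0)
""")
after = conn.execute(
    "SELECT ClientID, OrderCount FROM table1 ORDER BY ClientID"
).fetchall()
print(inner_ids)
print(after)
```

The ISNULL call in the original query was never the problem; it simply never ran for client 2, because the INNER JOIN excluded that row before the SET clause was evaluated.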
