USING not equivalent to ON with an OUTER JOIN - snowflake-cloud-data-platform

Why don't the following queries produce identical results?
with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r using(id);
ID ID
1 1
2 2
3 3
4 4
with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r on r.id = l.id;
ID ID
1 1
2 (null)
3 (null)
(null) 4
The JOIN docs say: o1 join o2 using (key_column) is equivalent to o1 join o2 on o2.key_column = o1.key_column

I guess this falls under non-standard use, so don't do it. Specifically:
To use the USING clause properly, the projection list (the list of columns and other expressions after the SELECT keyword) should be “*”.
SparkSQL produces the same results as Snowflake, but psql produces the result I expect, so I guess it's inconsistent across engines.
scala> spark.sql("with l as (select col1 id from values(1), (2), (3)) , r as (select col1 id from values(1), (4)) select * from l full outer join r using(id)").show()
+---+
| id|
+---+
| 1|
| 3|
| 4|
| 2|
+---+
scala> spark.sql("with l as (select col1 id from values(1), (2), (3)) , r as (select col1 id from values(1), (4)) select * from l full outer join r on l.id = r.id").show()
+----+----+
| id| id|
+----+----+
| 1| 1|
| 3|null|
|null| 4|
| 2|null|
+----+----+
psql> with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r using(id);
id id
1 1
2 (null)
3 (null)
(null) 4

The behavior looks strange indeed. The recommended approach is to use ON, not USING.
From the Snowflake community discussions:
According to the ANSI standard,
FROM t1
FULL OUTER JOIN t2
USING (c)
produces the following expressions: coalesce(t1.c, t2.c) as c . So subsequent references to t1.c and t2.c are actually not defined in the standard. MySQL, Postgres, and Snowflake all support these references, but use different semantics. In Snowflake, t1.c and t2.c are just aliases for c.
https://community.snowflake.com/s/question/0D50Z00008WRZBBSA5/bug-with-join-using-
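To make the quoted rule concrete, here is a small sketch in Python using the standard sqlite3 module. SQLite is an assumption chosen only for illustration; it is not Snowflake, Postgres, or Spark. Because older SQLite builds lack FULL OUTER JOIN, the full join is emulated with a LEFT JOIN plus the unmatched right-hand rows. The l_id and r_id columns show what the ON form exposes, and the id column shows the merged coalesce(l.id, r.id) column that USING is defined to produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE l (id INTEGER);
CREATE TABLE r (id INTEGER);
INSERT INTO l VALUES (1), (2), (3);
INSERT INTO r VALUES (1), (4);
""")

# FULL OUTER JOIN emulated as: LEFT JOIN, plus right rows with no left match.
full_join_sql = """
SELECT l.id AS l_id, r.id AS r_id, COALESCE(l.id, r.id) AS id
FROM l LEFT JOIN r ON r.id = l.id
UNION ALL
SELECT NULL, r.id, r.id
FROM r
WHERE NOT EXISTS (SELECT 1 FROM l WHERE l.id = r.id)
ORDER BY id
"""
full_rows = conn.execute(full_join_sql).fetchall()

for l_id, r_id, merged_id in full_rows:
    # l_id / r_id are the per-table values; merged_id is the USING-style column.
    print(l_id, r_id, merged_id)
```

The per-table columns keep their NULLs, while the merged column never does. An engine that treats l.id and r.id as aliases of the merged column after USING would show the merged value in both positions, which matches the Snowflake output in the question.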

Related

How to map array of ints to a single concat string?

I have these tables:
names
id | name
7 | 'a'
8 | 'b'
9 | 'c'
group_names
id | group_of_names
1 | '7'
2 | '9,8'
3 | '7,8,9'
How can I build a SELECT that returns the names instead, separated by semicolons and preserving the original order, as in:
id | single_text
1 | 'a'
2 | 'c;b'
3 | 'a;b;c'
I managed to do
select g.id, string_to_array(g.group_of_names,',')::int[] from group_names g;
id | string_to_array
1 | {7}
2 | {9,8}
3 | {7,8,9}
but I don't know how to take each of those arrays and concatenate the names that match its ids.
If the order of resulting strings is irrelevant:
select g.id, string_agg(n.name, ';')
from group_names g
join names n
on n.id = any(string_to_array(g.group_of_names, ',')::int[])
group by g.id
otherwise:
select g.id, string_agg(n.name, ';' order by ord)
from names n
join (
select id, elem, ord
from group_names
cross join regexp_split_to_table(group_of_names, ',')
with ordinality as arr(elem, ord)
) g
on n.id = g.elem::int
group by g.id
Test it in db<>fiddle.
In Postgres 14+ you can use string_to_table() instead of regexp_split_to_table().
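As a cross-check of what the ordered query is meant to compute, here is a plain-Python sketch (not Postgres code) over the sample data from the question. The function name resolve_group is hypothetical. enumerate() plays the role of WITH ORDINALITY, and sorting by that position before joining mirrors string_agg(... order by ord).

```python
# Sample data copied from the question.
names = {7: "a", 8: "b", 9: "c"}
group_names = {1: "7", 2: "9,8", 3: "7,8,9"}


def resolve_group(id_list: str, separator: str = ";") -> str:
    """Map a comma-separated id list to names, keeping the list order."""
    # Pair each element with its 1-based position, like WITH ORDINALITY.
    positioned = [
        (position, names[int(elem)])
        for position, elem in enumerate(id_list.split(","), start=1)
        if int(elem) in names  # ids without a name are dropped, like an inner join
    ]
    # Sort by position, then concatenate, like string_agg(name, ';' ORDER BY ord).
    return separator.join(name for _, name in sorted(positioned))


single_text = {gid: resolve_group(ids) for gid, ids in group_names.items()}
print(single_text)
```

Without the ordering step, group 2 could just as well come back as 'b;c', which is the difference between the two queries above.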
You can try it this way, using the ANY operator to check whether the n.id value appears in the array built from group_of_names:
SELECT gn.id, string_agg(n.name, ';') AS single_text
FROM names n
INNER JOIN group_names gn ON n.id::text = ANY(string_to_array(gn.group_of_names, ','))
GROUP BY gn.id
ORDER BY gn.id;
Or this way: using your query and unnest() function to expand an array to a set of rows.
SELECT gn.id, string_agg(n.name, ';')
FROM names n
INNER JOIN (SELECT g.id, unnest(string_to_array(g.group_of_names, ',')::int[]) AS name_id
FROM group_names g) AS gn ON n.id = gn.name_id
GROUP BY gn.id
ORDER BY gn.id;
SELECT g.id, string_agg(n.name,';') AS single_text
FROM group_names AS g
CROSS JOIN LATERAL regexp_split_to_table (g.group_of_names, ',') AS gn(element)
INNER JOIN names AS n
ON n.id :: text = gn.element
GROUP BY g.id

Join 2 tables with string_id instead of jointable

I want to join 2 tables, but there is no join table between them. I usually use STRING_SPLIT, but in this case I can't figure it out. Maybe I'm just tired... Could you help me, please?
CREATE TABLE ##Provider
(
id INT,
p_name VARCHAR(50),
list_id_dep VARCHAR(250)
)
CREATE TABLE ##Department
(
id INT,
d_name VARCHAR(50)
)
INSERT INTO ##Provider (id, p_name, list_id_dep) VALUES
(1, 'toto', '/10/11/12/'),
(2, 'tata', '/09/');
INSERT INTO ##Department (id, d_name) VALUES
(9, 'dep9')
,(10, 'dep10')
,(11, 'dep11')
,(12, 'dep12');
What I want is :
id | p_name | d_name
--------------------------
1 | toto | dep10
1 | toto | dep11
1 | toto | dep12
2 | tata | dep9
I've tried :
select *
from ##Provider p
inner join ##Department d on STRING_SPLIT(p.list_id_dep, '/') = ???
select *
from ##Provider p
inner join STRING_SPLIT(p.list_id_dep, '/') dep ON dep.value = ???
select *
from ##Provider p, ##Department d
where (select value from STRING_SPLIT(p.list_id_dep, '/')) = d.id
select *
from ##Provider p, ##Department d
where d.id in (select value from STRING_SPLIT(p.list_id_dep, '/'))
Maybe STRING_SPLIT is not the right way to do it...
Thanks !
You need a lateral join to unnest the string; in SQL Server, this is implemented with CROSS APPLY. Then you can bring in the department table with a regular join:
select p.id, p.p_name, d.d_name
from ##provider p
cross apply string_split(p.list_id_dep, '/') x
inner join ##department d on d.id = x.value
Demo on DB Fiddle:
id | p_name | d_name
-: | :----- | :-----
1 | toto | dep10
1 | toto | dep11
1 | toto | dep12
2 | tata | dep9
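The row-by-row logic of that CROSS APPLY can be sketched in plain Python (an illustration of the semantics, not SQL Server code; the function name expand_provider_departments is hypothetical). Each provider row is expanded into one row per token of list_id_dep, and each token is then matched against the department ids. The sketch skips the empty tokens produced by the leading and trailing '/' explicitly; in the SQL version they simply match no department.

```python
# Sample data copied from the question.
providers = [(1, "toto", "/10/11/12/"), (2, "tata", "/09/")]
departments = {9: "dep9", 10: "dep10", 11: "dep11", 12: "dep12"}


def expand_provider_departments(providers, departments):
    """Return (provider id, p_name, d_name) rows, one per listed department."""
    rows = []
    for provider_id, p_name, list_id_dep in providers:
        # Like CROSS APPLY STRING_SPLIT: one candidate row per token.
        for token in list_id_dep.split("/"):
            if not token:
                continue  # empty token from a leading or trailing '/'
            dep_id = int(token)  # '09' becomes 9, so it matches department 9
            if dep_id in departments:  # inner join: unmatched ids are dropped
                rows.append((provider_id, p_name, departments[dep_id]))
    return rows


joined = expand_provider_departments(providers, departments)
for row in joined:
    print(row)
```

The integer conversion is what lets '09' match department 9, which is why the demo output shows dep9.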

loop in SQL server program with Table

My expected result is quite difficult to explain, so here I have shown sample data.
SourceTable: (I have alphabets in HeadNo column)
HeadNo | Start | End
---------+-----------+----------
AA | AA0000 | AA9999
AB | AB0000 | AB9999
AC | AC0000 | AC9999
AD | AD0000 | AD9999
--------------------
--------------------
------- so on ------
ZZ | ZZ0000 | ZZ9999
From this source table, I want to generate a loop-style result in which each HeadNo yields 10000 rows, numbered from 0000 to 9999.
Result should look like:
HeadNo | Actual Code
---------+---------------
AA | AA0000
AA | AA0001
AA | AA0002
AA | AA0003
--------------------
--------------------
------- so on ------
AA | AA9998
AA | AA9999
Likewise for each HeadNo:
ZZ | ZZ0000
ZZ | ZZ0001
ZZ | ZZ0002
ZZ | ZZ0003
--------------------
--------------------
------- so on ------
ZZ | ZZ9999
I want to merge and insert into one separate single table.
If every row requires the values 0-9999, then you simply need to CROSS JOIN to a tally table:
WITH N AS(
SELECT *
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1
CROSS JOIN N N2 --100
CROSS JOIN N N3 --1000
CROSS JOIN N N4 --10000
)
SELECT YT.HeadNo,
YT.HeadNo + RIGHT('0000' + CONVERT(varchar(4),T.I),4) AS ActualCode
FROM YourTable YT
CROSS JOIN Tally T;
If, however, you have actual start and end ranges per HeadNo (like the example below), you'll need to use a little more logic in the JOIN:
WITH VTE AS (
SELECT *
FROM (VALUES('AA','AA0000','AA9999'),
('AB','AB0000','AB5000'), --Guessing this is more realistic
('AC','AC1000','AC8000'),
('AD','AD0000','AD0100'),
('ZZ','ZZ0000','ZZ9999')) V(HeadNo, HeadStart, HeadEnd)),
N AS(
SELECT *
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1
CROSS JOIN N N2 --100
CROSS JOIN N N3 --1000
CROSS JOIN N N4 --10000
)
SELECT V.HeadNo,
V.HeadNo + RIGHT('0000' + CONVERT(varchar(4),T.I),4) AS ActualCode
FROM VTE V
JOIN Tally T ON T.I BETWEEN STUFF(V.HeadStart,1,2,'') AND STUFF(V.HeadEnd,1,2,'')
ORDER BY V.HeadNo,
ActualCode;
The second example assumes that the start and end codes always have the format AA0000 (the two-character HeadNo followed by four digits); if they don't, then we're missing important information that should be included in your question.
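For readers without SQL Server at hand, the same tally idea can be tried in SQLite through Python's sqlite3 module. This is a sketch under assumptions, not a port of the T-SQL above: the table name ranges and its two sample rows are made up for the demo, the tally number is built arithmetically from four digit columns instead of ROW_NUMBER(), and substr()/printf() stand in for STUFF() and RIGHT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ranges (HeadNo TEXT, HeadStart TEXT, HeadEnd TEXT);
INSERT INTO ranges VALUES ('AA', 'AA0000', 'AA0003'),
                          ('AD', 'AD0098', 'AD0100');
""")

tally_sql = """
WITH n(d) AS (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)),
tally(i) AS (
    -- four digit positions cross joined: 10 * 10 * 10 * 10 = 10000 numbers
    SELECT n1.d * 1000 + n2.d * 100 + n3.d * 10 + n4.d
    FROM n AS n1 CROSS JOIN n AS n2 CROSS JOIN n AS n3 CROSS JOIN n AS n4
)
SELECT r.HeadNo, r.HeadNo || printf('%04d', t.i) AS ActualCode
FROM ranges AS r
JOIN tally AS t
  ON t.i BETWEEN CAST(substr(r.HeadStart, 3) AS INTEGER)
             AND CAST(substr(r.HeadEnd, 3) AS INTEGER)
ORDER BY r.HeadNo, ActualCode
"""
codes = conn.execute(tally_sql).fetchall()

for head_no, actual_code in codes:
    print(head_no, actual_code)
```

Like the T-SQL version, this relies on the numeric part starting at the third character of the start and end codes.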
Try the code below. I used a recursive CTE to obtain the numbers from 0 to 9999 and then cross joined it to your HeadNo column:
;with cte as (
select 0 n
union all
select n + 1 from cte
where n < 9999
)
select HeadNo, HeadNo + right('0000' + cast(n as varchar(4)), 4) from MyTable
cross join cte option (maxrecursion 0)

How to Join to "Other" row

I have two tables. One holds Objects and the other holds Settings about each object. Not all of the rows in the Objects table have a corresponding row in the Settings table. There is a special row in the Settings table that is supposed to be used for the "Other" objects.
How can I create a join between Objects and Settings such that I get the given setting if there is one or the "Other" setting if there isn't?
For example consider the following script:
CREATE TABLE #Objects (Code nvarchar(20) not null);
CREATE TABLE #Settings (Code nvarchar(20) not null, Value int not null);
INSERT INTO #Objects
VALUES
('A'),
('B'),
('D')
INSERT INTO #Settings
VALUES
('A', 1),
('B', 2),
('C', 3),
('Other', 4)
SELECT
#Objects.Code,
#Settings.Value
FROM
#Objects
JOIN #Settings
ON #Objects.Code = #Settings.Code
OR #Settings.Code = 'Other'
DROP TABLE #Settings, #Objects
I'm wanting to get this:
Code | Value
---- | -----
A | 1
B | 2
D | 4
What I'm actually getting is:
Code | Value
----- | -----
A | 1
A | 4
B | 2
B | 4
D | 4
You can do this with an APPLY:
SELECT o.Code, s.Value
FROM #Objects o
CROSS APPLY (
SELECT TOP 1 *
FROM #Settings s
WHERE s.Code = o.Code or s.Code = 'Other'
ORDER BY case when s.Code = o.Code then 0 else 1 end
) s
For fun: a hybrid from answers by Gurv, jyao and SqlZim, which are all variations on the same basic theme:
SELECT o.Code, s2.Value
FROM #Objects o
LEFT JOIN #Settings s1 on s1.Code = o.Code
INNER JOIN #Settings s2 on s2.Code = coalesce(s1.Code, 'Other')
So far, this approach (LEFT JOIN + the INNER JOIN ON COALESCE() ) is my favorite option.
Note that this only works if there can be only one Settings record per Object record. If that ever changes, the APPLY answer still works, but other answers here might not work.
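The LEFT JOIN plus INNER JOIN ON COALESCE() pattern is portable enough to try outside SQL Server. Here is a minimal sketch using Python's sqlite3 module, with the sample data from the question; the only changes are assumptions made for SQLite, namely ordinary table names instead of the # temp tables and an added ORDER BY so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Objects (Code TEXT NOT NULL);
CREATE TABLE Settings (Code TEXT NOT NULL, Value INTEGER NOT NULL);
INSERT INTO Objects VALUES ('A'), ('B'), ('D');
INSERT INTO Settings VALUES ('A', 1), ('B', 2), ('C', 3), ('Other', 4);
""")

fallback_sql = """
SELECT o.Code, s2.Value
FROM Objects AS o
-- s1.Code is NULL when the object has no setting of its own
LEFT JOIN Settings AS s1 ON s1.Code = o.Code
-- pick the object's own setting, or the 'Other' row as the fallback
INNER JOIN Settings AS s2 ON s2.Code = COALESCE(s1.Code, 'Other')
ORDER BY o.Code
"""
fallback_rows = conn.execute(fallback_sql).fetchall()

for code, value in fallback_rows:
    print(code, value)
```

Object D has no row of its own in Settings, so it picks up the value of the 'Other' row, while A and B keep their own values.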
Another way is to use a CTE that adds an extra column, alternative_Code, to #Objects, holding 'Other' for every Code that does not exist in #Settings,
and then to join that CTE to the #Settings table as shown below:
; with c as (
select alternative_Code = isnull(s.code, 'Other'), o.Code
from #Objects o
left join #Settings s
on o.Code = s.Code)
select c.Code, s.value
from c
inner join #Settings s
on c.alternative_Code = s.Code
Use a left join to get NULL where o.Code has no match in #Settings, and use coalesce() to return the designated replacement value from #Settings when s.Value is NULL.
You could use isnull() instead of coalesce(); the result would be the same in this instance.
I am not sure if this is acceptable, but it returns the correct results:
select
o.Code
, coalesce(s.Value,x.Value) as Value
from #Objects o
left join #Settings s
on o.Code = s.Code
cross join (
select top 1 value
from #Settings
where Code = 'Other'
) x
rextester demo: http://rextester.com/EBUG86037
returns:
+------+-------+
| Code | Value |
+------+-------+
| A | 1 |
| B | 2 |
| D | 4 |
+------+-------+
In the form @RBarryYoung prefers:
select
o.Code
, coalesce(s.Value,x.Value) as Value
from #Objects o
left join #Settings s
on o.Code = s.Code
inner join #Settings x
on x.Code = 'Other'
This is more concise (saves you many keystrokes) and generates the same execution plan as my initial answer. Whether it is more or less clear about what it is doing is up to you, I like both.
If there is going to be one "Other" value then you can just do the join twice - a left join and another one which is effectively a cross join:
select o.Code,
coalesce(s.Value, s2.value) as value
from #Objects o
left join #Settings s on o.Code = s.Code
join #Settings s2 on s2.Code = 'Other'

Filtering the distinct rows from Table

I have a table with two columns schema as:
ID1,ID2
Values are as:
x y
y x
a b
b a
I want the result set to contain each pair only once, like:
x y
a b
That is, I want to treat mirrored pairs such as (x, y) and (y, x) as duplicates and keep only one of them.
What SQL query does this?
One of the ways to achieve this is to use LEFT JOIN like this.
SQL Fiddle
Query
SELECT T1.ID1,T1.ID2
FROM YourTable T1
LEFT JOIN YourTable T2
ON T1.ID1 = T2.ID2
AND T1.ID2 = T2.ID1
AND T2.ID1 < T2.ID2
WHERE T2.ID1 IS NULL
Output
| ID1 | ID2 |
|-----|-----|
| x | y |
| a | b |
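The anti-join above can be checked quickly in SQLite through Python's sqlite3 module. In this sketch the table name pairs, the extra unpaired row (m, n), and the ORDER BY are assumptions added for the demo; the join condition is the one from the answer. A row is dropped only when its mirrored twin exists and that twin has ID1 < ID2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pairs (ID1 TEXT, ID2 TEXT);
INSERT INTO pairs VALUES ('x', 'y'), ('y', 'x'), ('a', 'b'), ('b', 'a'),
                         ('m', 'n');  -- hypothetical row with no mirror
""")

dedup_sql = """
SELECT t1.ID1, t1.ID2
FROM pairs AS t1
LEFT JOIN pairs AS t2
       ON t1.ID1 = t2.ID2
      AND t1.ID2 = t2.ID1
      AND t2.ID1 < t2.ID2   -- only the ordered twin can knock a row out
WHERE t2.ID1 IS NULL        -- keep rows that found no such twin
ORDER BY t1.ID1
"""
unique_pairs = conn.execute(dedup_sql).fetchall()

for id1, id2 in unique_pairs:
    print(id1, id2)
```

Rows without a mirrored partner pass through unchanged, so the query only collapses pairs that actually appear in both orders.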
I suggest using NOT EXISTS like this:
SELECT *
FROM yourTable t
WHERE NOT EXISTS(SELECT 1 FROM yourTable ti
WHERE t.ID1 = ti.ID2 AND t.ID2 = ti.ID1
AND ti.ID1 < ti.ID2) -- keeps (x, y) and (a, b), as in the requested output
