sql - match two of the same values in different column positions - snowflake-cloud-data-platform

I am looking to join two different tables on the id, and need to extract unique names out of each table; if one table has a certain name but the other doesn't, there should be one value and one null. This should be vice versa as well.
With joins, the current output looks like this:
id name_1 name_2
1 max steph
1 max john
1 john chris
1 john chris
1 chris steph
1 chris null
1 null max
1 null null
1 tony john
1 tony max
expected output:
id name_1 name_2
1 max max
1 john john
1 chris chris
1 null steph
1 tony null
current sql:
select
table1.id,
table1.name as name_1,
table2.name as name_2
from table1
left join table2
on table1.id = table2.id
(snowflake)

SELECT
NVL(d1.id, d2.id) as id,
d1.name as name_1,
d2.name as name_2
FROM (
SELECT DISTINCT id,name FROM table1
) AS d1
FULL OUTER JOIN (
SELECT DISTINCT id,name FROM table2
) AS d2
ON d1.id = d2.id AND d1.name = d2.name
ORDER BY 1, (d1.name,d2.name)
This takes the distinct (id, name) pairs from both tables, then full outer joins those two sets. A pair that appears in both tables matches and produces one row with both names filled in; a pair that appears in only one table is still kept, with null on the other side.
So with these CTEs providing the fake data:
WITH table1(id,name) AS (
select * from values (1,'aa'),(1,'ab'),(2,'ba')
), table2(id,name) AS (
select * from values (1,'aa'),(1,'ac'),(2,'ba'),(2,'bb')
)
ID NAME_1 NAME_2
1 aa aa
1 ab null
1 null ac
2 ba ba
2 null bb
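The same idea can be checked locally without a Snowflake account. The following is a minimal sketch using Python's built-in sqlite3 module, not the answer's exact query: because older SQLite builds lack FULL OUTER JOIN, it emulates the full outer join as a left join plus the unmatched rows from the second table. The in-memory database and the `joined` alias are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'aa'), (1, 'ab'), (2, 'ba');
    INSERT INTO table2 VALUES (1, 'aa'), (1, 'ac'), (2, 'ba'), (2, 'bb');
""")

query = """
WITH d1 AS (SELECT DISTINCT id, name FROM table1),
     d2 AS (SELECT DISTINCT id, name FROM table2)
SELECT id, name_1, name_2
FROM (
    -- every pair from table1, with its match from table2 if there is one
    SELECT d1.id AS id, d1.name AS name_1, d2.name AS name_2
    FROM d1 LEFT JOIN d2 ON d1.id = d2.id AND d1.name = d2.name
    UNION ALL
    -- pairs that exist only in table2
    SELECT d2.id, NULL, d2.name
    FROM d2 LEFT JOIN d1 ON d1.id = d2.id AND d1.name = d2.name
    WHERE d1.id IS NULL
) AS joined
ORDER BY id, COALESCE(name_1, name_2)
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Joining on both id and name is what prevents the cross-product rows seen in the question's output; joining on id alone pairs every name with every other name for that id.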

The following can be used for this:
with cte as
(
select distinct t1.id,name_1 from t1)
select distinct ifnull(t2.id,cte.id) id,
cte.name_1,
t2.name_2
from t2 full outer join cte
ON cte.id=t2.id
and cte.name_1 = t2.name_2
order by cte.name_1;
+----+--------+--------+
| ID | NAME_1 | NAME_2 |
|----+--------+--------|
| 1 | chris | chris |
| 1 | john | john |
| 1 | max | max |
| 1 | tony | NULL |
| 1 | NULL | steph |
+----+--------+--------+

Add a WHERE clause.
select
table1.id,
table1.name as name_1,
table2.name as name_2
from table1
left join table2
on table1.id = table2.id
WHERE table1.name = table2.name
OR table1.name is null
OR table2.name is null
If you just need a list of unique names
select distinct name from table1
union
select distinct name from table2

Simeon's answer is the way to go, since Snowflake supports full outer joins. For those who use a relational database that lacks full outer join support (MySQL, for example) and have the same issue, this approach can be an alternative:
select id,
if(instr(group_concat(tb), 1), name, NULL) name_1,
if(instr(group_concat(tb), 2), name, NULL) name_2
from(
select id, name, 1 tb from table1
union
select id, name, 2 tb from table2
) a
group by id, name
order by name
The result:
| id | name_1 | name_2 |
| --- | ------ | ------ |
| 1 | chris | chris |
| 1 | john | john |
| 1 | max | max |
| 1 | null | steph |
| 1 | tony | null |
Fake data:
CREATE TABLE table1 (
id int(11),
name varchar(50)
);
CREATE TABLE table2 (
id int(11),
name varchar(50)
);
INSERT INTO table1 VALUES
(1, 'max'),
(1, 'john'),
(1, 'chris'),
(1, 'tony');
INSERT INTO table2 VALUES
(1, 'steph'),
(1, 'john'),
(1, 'chris'),
(1, 'max');
And a dbfiddle: https://www.db-fiddle.com/f/gQ4U7hu2S2EyFEtZrapqdu/6
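A portable variant of the same tag-and-group idea is sketched below with Python's built-in sqlite3 module. It is not the MySQL query above: instead of `if(instr(group_concat(...)))` it uses conditional aggregation (`MAX(CASE ...)`), which is a substitution made here because SQLite has no `if()` function. The table contents are the fake data from this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'max'), (1, 'john'), (1, 'chris'), (1, 'tony');
    INSERT INTO table2 VALUES (1, 'steph'), (1, 'john'), (1, 'chris'), (1, 'max');
""")

query = """
SELECT id,
       -- name shows up in a column only if that source table had the row
       MAX(CASE WHEN tb = 1 THEN name END) AS name_1,
       MAX(CASE WHEN tb = 2 THEN name END) AS name_2
FROM (
    SELECT id, name, 1 AS tb FROM table1
    UNION
    SELECT id, name, 2 AS tb FROM table2
) AS a
GROUP BY id, name
ORDER BY name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each (id, name) group contains one tagged row per table that holds the pair, so a missing table simply leaves its column null.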

Related

Combine columns from tables into single table

I have four tables, and I want to put the total column from each into its own column of a single result, matching rows on dealerId. So if the same dealerId appears in Table 1 and Table 3, those totals should share a single row.
Table 1
dealerId | t1 Total Amount
---------+---------------
1 | 123
2 | 456
Table 2
dealerId | t2 Total Amount
---------+----------------
3 | 111
4 | 222
5 | 333
Table 3
dealerId | t3 Total Amount
---------+----------------
1 | 555
3 | 565
6 | 888
Table 4
dealerId | t4 Total Amount
---------+----------------
1 | 88
2 | 99
3 | 11
Desired Outcome
dealerId | t1Total Amount | t2Total Amount | t3 Total Amount | t4 Total Amount
---------+----------------+----------------+-----------------+-----------------
1 | 123 | null | 555 | 88
2 | 456 | null | null | 99
3 | null | 111 | 565 | 11
4 | null | 222 | null | null
5 | null | 333 | null | null
6 | null | null | 888 | null
I have basically created views (I don't know if this is the correct term for it) and tried to UNION ALL them, but this only gives me a single column with all the totals.
SELECT *
FROM
(
SELECT o.DealerId, Sum(oi.Amount) as T1_Total
FROM ....
) AS T1
UNION ALL
SELECT *
FROM
(
SELECT o.DealerId, Sum(oi.Amount) as T2_Total
FROM ....
) AS T2
UNION ALL
...
-- repeat for T3 and T4
You can also use the pivot operator:
select dealerId, T1, T2, T3, T4
from (
select dealerId, 'T1' as Src, "t1 Total Amount" as Amt from T1
union all
select dealerId, 'T2' , "t2 Total Amount" from T2
union all
select dealerId, 'T3' , "t3 Total Amount" from T3
union all
select dealerId, 'T4' , "t4 Total Amount" from T4
) vert
pivot (sum(Amt) for Src in (T1,T2,T3,T4)) horiz
Results:
dealerId T1 T2 T3 T4
----------- ----------- ----------- ----------- -----------
1 123 NULL 555 88
2 456 NULL NULL 99
3 NULL 111 565 11
4 NULL 222 NULL NULL
5 NULL 333 NULL NULL
6 NULL NULL 888 NULL
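For engines without a pivot operator, the same reshaping can be written as conditional aggregation. This is a minimal sketch with Python's built-in sqlite3 module; the table names `t1` to `t4` and the column name `amount` are assumptions standing in for the question's "tN Total Amount" columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (dealerId INTEGER, amount INTEGER);
    CREATE TABLE t2 (dealerId INTEGER, amount INTEGER);
    CREATE TABLE t3 (dealerId INTEGER, amount INTEGER);
    CREATE TABLE t4 (dealerId INTEGER, amount INTEGER);
    INSERT INTO t1 VALUES (1, 123), (2, 456);
    INSERT INTO t2 VALUES (3, 111), (4, 222), (5, 333);
    INSERT INTO t3 VALUES (1, 555), (3, 565), (6, 888);
    INSERT INTO t4 VALUES (1, 88), (2, 99), (3, 11);
""")

query = """
SELECT dealerId,
       -- SUM over no matching rows yields NULL, which is the desired gap
       SUM(CASE WHEN src = 'T1' THEN amt END) AS T1,
       SUM(CASE WHEN src = 'T2' THEN amt END) AS T2,
       SUM(CASE WHEN src = 'T3' THEN amt END) AS T3,
       SUM(CASE WHEN src = 'T4' THEN amt END) AS T4
FROM (
    SELECT dealerId, 'T1' AS src, amount AS amt FROM t1
    UNION ALL
    SELECT dealerId, 'T2', amount FROM t2
    UNION ALL
    SELECT dealerId, 'T3', amount FROM t3
    UNION ALL
    SELECT dealerId, 'T4', amount FROM t4
) AS vert
GROUP BY dealerId
ORDER BY dealerId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The UNION ALL stacks the data vertically with a source tag, and the GROUP BY spreads it back out into one column per source.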
Try doing something like this:
with all_dealer_ids AS (
SELECT DISTINCT dealerId
FROM Table1
UNION
SELECT DISTINCT dealerId
FROM Table2
UNION
SELECT DISTINCT dealerId
FROM Table3
UNION
SELECT DISTINCT dealerId
FROM Table4
)
SELECT adi.dealerId,
SUM(t1.TotalAmount) AS T1TotalAmount,
SUM(t2.TotalAmount) AS T2TotalAmount,
SUM(t3.TotalAmount) AS T3TotalAmount,
SUM(t4.TotalAmount) AS T4TotalAmount
FROM all_dealer_ids adi
LEFT JOIN Table1 t1
ON adi.dealerId = t1.dealerId
LEFT JOIN Table2 t2
ON adi.dealerId = t2.dealerId
LEFT JOIN Table3 t3
ON adi.dealerId = t3.dealerId
LEFT JOIN Table4 t4
ON adi.dealerId = t4.dealerId
GROUP BY adi.dealerId
ORDER BY adi.dealerId ASC
You'll want to use JOIN instead of UNION to accomplish what you're looking for. UNIONs are typically used to stack data, which is why you are seeing all your data in one column. You can try something like this to accomplish what you are looking for.
SELECT COALESCE(t1_totals.dealerID, t2_totals.dealerID, t3_totals.dealerID, t4_totals.dealerID) AS dealerId
, t1_totals.t1_total_amount
, t2_totals.t2_total_amount
, t3_totals.t3_total_amount
, t4_totals.t4_total_amount
FROM (
SELECT dealerID, sum(amount) AS t1_total_amount FROM t1 GROUP BY t1.dealerID
) AS t1_totals
FULL JOIN (
SELECT dealerID, sum(amount) AS t2_total_amount FROM t2 GROUP BY t2.dealerID
) AS t2_totals
ON t2_totals.dealerID = t1_totals.dealerID
FULL JOIN (
SELECT dealerID, sum(amount) AS t3_total_amount FROM t3 GROUP BY t3.dealerID
) AS t3_totals
ON t3_totals.dealerID = COALESCE(t1_totals.dealerID, t2_totals.dealerID)
FULL JOIN (
SELECT dealerID, sum(amount) AS t4_total_amount FROM t4 GROUP BY t4.dealerID
) AS t4_totals
ON t4_totals.dealerID = COALESCE(t1_totals.dealerID, t2_totals.dealerID, t3_totals.dealerID)
ORDER BY dealerId

Fetching data from two tables in oracle

I have two result sets :
Set 1:
STUDENT| COUNT
------ | ------
mohit | 4
Rohit | 2
Tanvi | 2
Jhanvi | 1
Set 2:
STUDENT| COUNT_STAR
------ | ------
mohit | 2
Rohit | 3
Tanvi | 1
Arjun | 1
Abhay | 3
Abhi | 1
Expected Result Set :
STUDENT| COUNT | COUNT_STAR
------ | ------ | ----------
mohit | 4 | 2
Rohit | 2 | 3
Tanvi | 2 | 1
Arjun | na | 1
Abhay | na | 3
Abhi | na | 1
Jhanvi | 1 | na
Can someone help me with the SQL Query for this ?
You need a UNION to get the distinct student names from both tables,
and then a LEFT JOIN to each table to get the values for count and count_star:
select T.STUDENT , table1.count, table2.count_star
from (
select STUDENT
from table1
UNION
select STUDENT
from table2
) T
left join table1 on table1.student = t.student
left join table2 on table2.student = t.student
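The union-then-left-join approach can be tried out with Python's built-in sqlite3 module. This is a minimal sketch rather than Oracle code: the column names `cnt` and `cnt_star` are assumptions chosen to avoid clashing with the COUNT function, and the counts are cast to text so that 'na' can share the column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (student TEXT, cnt INTEGER);
    CREATE TABLE table2 (student TEXT, cnt_star INTEGER);
    INSERT INTO table1 VALUES ('mohit', 4), ('Rohit', 2), ('Tanvi', 2), ('Jhanvi', 1);
    INSERT INTO table2 VALUES ('mohit', 2), ('Rohit', 3), ('Tanvi', 1),
                              ('Arjun', 1), ('Abhay', 3), ('Abhi', 1);
""")

query = """
SELECT t.student,
       COALESCE(CAST(table1.cnt AS TEXT), 'na')      AS cnt,
       COALESCE(CAST(table2.cnt_star AS TEXT), 'na') AS cnt_star
FROM (
    -- UNION removes duplicates, so each student appears once
    SELECT student FROM table1
    UNION
    SELECT student FROM table2
) AS t
LEFT JOIN table1 ON table1.student = t.student
LEFT JOIN table2 ON table2.student = t.student
ORDER BY t.student
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each left join must compare its own table against the union `t`; joining table2 on a table1 column would not line the rows up as intended.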
Use a FULL OUTER JOIN to join two overlapping result sets:
select coalesce(table1.student, table2.student) as student
, nvl( to_char(table1.count), 'na') as count
, nvl( to_char(table2.count_star), 'na') as count_star
from table1
full outer join table2
on table1.student = table2.student
You can use FULL OUTER JOIN to get the required result:
SELECT DECODE (a.STUDENT, NULL, b.STUDENT, a.STUDENT) STUDENT,
a.COUNT,
b.count_star
FROM table1 a FULL OUTER JOIN table2 b ON a.STUDENT = b.STUDENT;
Hope this helps.
The following SQL was tested with Oracle 12G:
SELECT COALESCE (T1.STUDENT, T2.STUDENT) AS STUDENT,
DECODE (T1.COUNT, NULL, 'na', T1.COUNT) COUNT,
DECODE (T2.COUNT_STAR, NULL, 'na', T2.COUNT_STAR) COUNT_STAR
FROM TABLE1 T1
FULL OUTER JOIN TABLE2 T2 ON T1.STUDENT = T2.STUDENT;

SP to update 3rd table using data in first 2 tables

For example, I have table1 and table3 below. The 'Counts' field in table2 should be updated based on the Valuess field in table1 and table3: 23 appears 4 times across table1 and table3, and 45 appears once. Table2 should be updated with those counts.
table1
Id | Data | Valuess
1 | rfsd | 23
2 | fghf | 45
3 | rhhh | 23
table3
Id | Data | Valuess
1 | rfsd | 23
2 | tfgy | 23
table2
Id | Fields | Counts
1 | 23 | 4
2 | 45 | 1
I am using the below stored procedure to achieve this.
WITH t13 AS (
SELECT Id, Data, Valuess FROM Table1 UNION ALL SELECT Id, Data, Valuess FROM Table3),
cte AS (SELECT Valuess,COUNT(*) AS Count2 FROM t13 GROUP BY Valuess)
UPDATE t2
SET t2.Counts = cte.Count2
FROM Table2 t2 JOIN cte ON t2.Fields = cte.Valuess;
QUESTION
Now, instead of the table data above, I have the table data below.
table1
Id | Data | Valuess
1 | rfsd | 004561
2 | fghf | 0045614
3 | rhhh | adcwyx
table3
Id | Data | Valuess
1 | rfsd | 0045614
2 | tfgy | 004561
table2
Id | Fields | Counts
1 | 0045614 | 4
2 | adcwyxv | 1
So here the Valuess field of table1 and table3 holds alphanumeric data, including near-duplicates such as '004561' and '0045614'.
I want to drop the 7th character of each value before comparing. That gives 004561, 004561 and adcwyx from table1, and 004561 and 004561 from table3. These are then compared with 004561 and adcwyx from table2 (the 7th character of Fields in table2 has to be dropped first as well).
The final result should be as shown in table2.
SUBSTRING should do it.
WITH t13 AS (
SELECT Id, Data, SUBSTRING(Valuess,1,6) AS [Values]
FROM Table1
UNION ALL
SELECT Id, Data, SUBSTRING(Valuess,1,6) AS [Values]
FROM Table3
)
, cte AS (
SELECT [Values],COUNT(*) AS Count2
FROM t13 GROUP BY [Values]
)
UPDATE t2
SET t2.Counts = cte.Count2
FROM Table2 t2 JOIN cte ON SUBSTRING(t2.Fields,1,6) = cte.[Values];
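The prefix-matching update can be exercised with Python's built-in sqlite3 module. This is a minimal sketch under a few assumptions: SQLite's `substr` stands in for `SUBSTRING`, a correlated subquery replaces the T-SQL `UPDATE ... FROM` join, and Counts starts at 0 so the change is visible. The values are stored as text so the leading zeros survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, data TEXT, valuess TEXT);
    CREATE TABLE table3 (id INTEGER, data TEXT, valuess TEXT);
    CREATE TABLE table2 (id INTEGER, fields TEXT, counts INTEGER);
    INSERT INTO table1 VALUES (1, 'rfsd', '004561'), (2, 'fghf', '0045614'), (3, 'rhhh', 'adcwyx');
    INSERT INTO table3 VALUES (1, 'rfsd', '0045614'), (2, 'tfgy', '004561');
    INSERT INTO table2 VALUES (1, '0045614', 0), (2, 'adcwyxv', 0);
""")

# Count rows in table1 + table3 whose first six characters match
# the first six characters of each table2.fields value.
conn.execute("""
UPDATE table2
SET counts = (
    SELECT COUNT(*)
    FROM (
        SELECT valuess FROM table1
        UNION ALL
        SELECT valuess FROM table3
    ) AS t13
    WHERE substr(t13.valuess, 1, 6) = substr(table2.fields, 1, 6)
)
""")
rows = conn.execute("SELECT id, fields, counts FROM table2 ORDER BY id").fetchall()
for row in rows:
    print(row)
```

Truncating both sides to six characters is what makes '004561' and '0045614' count as the same value.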

SQL Server 2012 - Looking for duplicates with differences

In SQL Server 2012, I have a table like this:
Id | AccountID | Accession | Status
----------------------------------------
1 | 1234567 | ABCD | F
2 | 1234567 | ABCD | F
3 | 2345678 | BCDE | F
4 | 8765432 | BCDE | F
5 | 3456789 | CDEF | F
6 | 9876543 | CDEF | A
I need to find rows that have the same Accession and a Status of "F", but a different AccountID.
I need a query that would return:
Id | AccountID | Accession | Status
----------------------------------------
3 | 2345678 | BCDE | F
4 | 8765432 | BCDE | F
1 and 2 wouldn't be returned because they have the same AccountID. 5 and 6 wouldn't be returned because the status on 6 is "A" and not "F".
You could do something like this.
;WITH NonDupAccountIDs AS
(
SELECT AccountID,Accession, Status
FROM MyTable
WHERE Status = 'F'
GROUP BY AccountID,Accession, Status
HAVING COUNT(Id) = 1
)
,DupAccessions AS
(
SELECT Accession
FROM MyTable
WHERE Status = 'F'
GROUP BY Accession
HAVING COUNT(AccountID) > 1
)
select a.AccountID, a.Accession, a.Status
FROM NonDupAccountIDs a
INNER JOIN DupAccessions b
ON a.Accession = b.Accession
Another alternative
Declare @Table table (id int,AccountID varchar(25),Accession varchar(25),Status varchar(25))
Insert into @Table (id , AccountID , Accession , Status) values
(1, 1234567,'ABCD','F'),
(2, 1234567,'ABCD','F'),
(3, 2345678,'BCDE','F'),
(4, 8765432,'BCDE','F'),
(5, 3456789,'CDEF','F'),
(6, 9876543,'CDEF','A')
Select A.*
from @Table A
Join (
Select Accession
From @Table
Where Status='F'
Group By Accession
Having Min(Accession)=Max(Accession)
and count(Distinct AccountID)>1
) B on a.Accession=B.Accession
Returns
id AccountID Accession Status
3 2345678 BCDE F
4 8765432 BCDE F
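The core of this approach is `COUNT(DISTINCT AccountID) > 1` per Accession among the rows with status 'F'. A minimal sketch with Python's built-in sqlite3 module follows; the table name `MyTable` is borrowed from the earlier answer, and the extra `WHERE a.Status = 'F'` on the outer query is an addition made here so that non-'F' rows of a qualifying Accession are not returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Id INTEGER, AccountID TEXT, Accession TEXT, Status TEXT);
    INSERT INTO MyTable VALUES
        (1, '1234567', 'ABCD', 'F'),
        (2, '1234567', 'ABCD', 'F'),
        (3, '2345678', 'BCDE', 'F'),
        (4, '8765432', 'BCDE', 'F'),
        (5, '3456789', 'CDEF', 'F'),
        (6, '9876543', 'CDEF', 'A');
""")

query = """
SELECT a.Id, a.AccountID, a.Accession, a.Status
FROM MyTable a
JOIN (
    -- accessions that have more than one distinct account among 'F' rows
    SELECT Accession
    FROM MyTable
    WHERE Status = 'F'
    GROUP BY Accession
    HAVING COUNT(DISTINCT AccountID) > 1
) b ON a.Accession = b.Accession
WHERE a.Status = 'F'
ORDER BY a.Id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

ABCD is excluded because both of its rows share one AccountID, and CDEF is excluded because only one of its rows has status 'F'.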
This works as well. If there are multiple sets of duplicates, it returns only the one with the highest ID.
John Cappelletti's solution is great as well; his returns all duplicated values if any incongruity exists.
I had to add some more data to see what would happen. You should decide how you will treat these occurrences.
select
max(ID) ID,AccountID, Accession
from p where Status = 'F'
group by AccountID, Accession
having
(select count(Accession) from (select max(ID) ID,AccountID, Accession from p where Status = 'F' group by AccountID, Accession) f where f.accession = p.accession)>1
;
SELECT t2.Id, t1.AccountID, t1.Accession, t1.Status
FROM TABLE_NAME t2
INNER JOIN (
SELECT AccountID, Accession, Status
FROM TABLE_NAME
GROUP BY Status, Accession, AccountID
) t1
ON t1.AccountID = t2.AccountID
Might need to play with this but should get you close. Remember to replace TABLE_NAME with your table.

SELECT rows from Table1 having identical values on columns and Table2 Column > N

This is simple. I have two tables. I need to select rows from Table1 whose 'Customer' also appears in Table2 with a 'yearmmm' greater than 2015001.
Table1
id | Customer | yearmmm |
----------------------------
10 | 123456 | 2015001 |
11 | 456789 | 2015001 |
20 | 111111 | 2015001 |
21 | 222222 | 2015001 |
44 | 4444 | 2015001 |
Table2
id | Customer | yearmmm |
----------------------------
10 | 123456 | 2015001 |
11 | 456789 | 2015002 |
20 | 111111 | 2015003 |
21 | 222222 | 2010001 |
333 | 333 | 2015004 |
Wonder if this works:
SELECT * FROM Table1 WHERE Customer IN
(SELECT Customer FROM Table2 WHERE yearmmm > '2015001')
Desired result:
11 | 456789 | 2015002 |
20 | 111111 | 2015003 |
You can use EXISTS:
SELECT t1.*
FROM Table1 t1
WHERE EXISTS
(
SELECT 1 FROM Table2 t2
WHERE t1.Customer = t2.Customer
AND t2.yearmmm > '2015001'
)
You have other options like INNER JOIN or IN.
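The difference between filtering with EXISTS and joining can be seen on the question's data. This is a minimal sketch with Python's built-in sqlite3 module, assuming yearmmm is stored as an integer: EXISTS returns the Table1 rows unchanged, while the join can also return the matching yearmmm from Table2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, Customer INTEGER, yearmmm INTEGER);
    CREATE TABLE Table2 (id INTEGER, Customer INTEGER, yearmmm INTEGER);
    INSERT INTO Table1 VALUES (10, 123456, 2015001), (11, 456789, 2015001),
                              (20, 111111, 2015001), (21, 222222, 2015001),
                              (44, 4444, 2015001);
    INSERT INTO Table2 VALUES (10, 123456, 2015001), (11, 456789, 2015002),
                              (20, 111111, 2015003), (21, 222222, 2010001),
                              (333, 333, 2015004);
""")

# EXISTS only filters Table1; the output columns all come from Table1.
exists_rows = conn.execute("""
    SELECT t1.*
    FROM Table1 t1
    WHERE EXISTS (
        SELECT 1 FROM Table2 t2
        WHERE t2.Customer = t1.Customer AND t2.yearmmm > 2015001
    )
    ORDER BY t1.id
""").fetchall()

# A join can additionally expose Table2's yearmmm.
join_rows = conn.execute("""
    SELECT t1.id, t1.Customer, t2.yearmmm
    FROM Table1 t1
    INNER JOIN Table2 t2 ON t2.Customer = t1.Customer
    WHERE t2.yearmmm > 2015001
    ORDER BY t1.id
""").fetchall()

print(exists_rows)
print(join_rows)
```

If the desired output has to show the yearmmm values from Table2, as in the question, the join form is the one that fits.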
Well, no, that will not work. You're ultimately selecting the Customer and yearmmm from Table1 based on values in Table2. Yet your desired results show yearmmm values that exist in Table2.
Based on your desired results it seems like you just want this:
SELECT * FROM Table2 WHERE yearmmm > '2015001'
EDIT: If you do in fact need more data from Table1, consider:
SELECT t1.id, t1.Customer, t2.yearmmm, another_other_fields
FROM Table1 t1 INNER JOIN Table2 t2 ON t1.id = t2.id
WHERE t2.yearmmm > '2015001'
You can use:
SELECT
t1.*
FROM
Table1 t1
INNER JOIN Table2 t2 ON
(t1.Customer = t2.Customer)
AND (t2.yearmmm > '2015001');
Or
SELECT
t1.*
FROM
Table1 t1
INNER JOIN Table2 t2 ON
(t1.Customer = t2.Customer)
WHERE
(t2.yearmmm > '2015001');
