Union Two Tables and Overwrite Preexisting Rows - sql-server

I want to union two tables, but where a row in Table 1 has the same "Email" as a row in Table 2, only the Table 2 row should appear in the result. Is this possible?
Table 1
Name | Email | Status
A | a@a.com | 1
B | b@b.com | 2
C | c@c.com | 1
Table 2
Name | Email | Status
C | c@c.com | 2
D | d@d.com | 1
E | e@e.com | 2
Resulting Table
Name | Email | Status
A | a@a.com | 1
B | b@b.com | 2
C | c@c.com | 2
D | d@d.com | 1
E | e@e.com | 2

One approach is to SELECT from table1 with a WHERE ... NOT IN against table2, so that no row whose Email exists in table2 is taken from table1. That result can then be UNION'd with table2.
Here's an example (TableA and TableB in my code):
declare @TableA as Table ( Name VarChar(20), Email VarChar(20), Status INT );
declare @TableB as Table ( Name VarChar(20), Email VarChar(20), Status INT );
insert into @TableA ( Name, Email, Status ) values
( 'A', 'a@a.com', 1 ),
( 'B', 'b@b.com', 2 ),
( 'C', 'c@c.com', 1 );
insert into @TableB ( Name, Email, Status ) values
( 'C', 'c@c.com', 2 ),
( 'D', 'd@d.com', 1 ),
( 'E', 'e@e.com', 2 );
SELECT * FROM @TableA WHERE Email NOT IN ( SELECT DISTINCT Email FROM @TableB )
UNION
SELECT * FROM @TableB;
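One caveat with NOT IN: if the subquery returns any NULL Email, the NOT IN predicate never evaluates to true and no rows from the first table come back. A NOT EXISTS variant avoids that. The following is a minimal sketch of that variant, run against SQLite through Python's sqlite3 module purely so it can be executed anywhere. It assumes ordinary tables named TableA and TableB rather than table variables, and it swaps UNION for UNION ALL because the two branches can no longer overlap on Email.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Name TEXT, Email TEXT, Status INTEGER);
    CREATE TABLE TableB (Name TEXT, Email TEXT, Status INTEGER);
    INSERT INTO TableA VALUES ('A', 'a@a.com', 1), ('B', 'b@b.com', 2), ('C', 'c@c.com', 1);
    INSERT INTO TableB VALUES ('C', 'c@c.com', 2), ('D', 'd@d.com', 1), ('E', 'e@e.com', 2);
""")

# Keep TableA rows only when no TableB row shares the Email, then append all of TableB.
rows = conn.execute("""
    SELECT Name, Email, Status
    FROM TableA a
    WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.Email = a.Email)
    UNION ALL
    SELECT Name, Email, Status
    FROM TableB
    ORDER BY Name
""").fetchall()

for row in rows:
    print(row)
```

The SELECT statement itself uses only standard SQL, so it should carry over to SQL Server with the table variables from the answer substituted in.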

Related

How to append metadata to a grouped selection by primary key

I have a list of favorites like:
Sample Data
| key | item_id | list_name | customer_id |meta |
|-----|---------|-----------|--------------|---------------|
| 1 | A-11 | aa11 | 001 | unique-data-1 |
| 2 | A-11 | bb22 | 001 | unique-data-2 |
| 3 | A-26 | cc33 | 001 | unique-data-3 |
| 4 | A-28 | aa11 | 002 | unique-data-4 |
| 5 | J-52 | aa11 | 001 | unique-data-5 |
| 6 | X-53 | aa11 | 001 | unique-data-6 |
Desired Output
for @item_id nvarchar(20) = 'A-11'
| key | isFavorited | list_name | meta |
|-----|-------------|-----------|---------------|
| 1 | Y | aa11 | unique-data-1 |
| 2 | Y | bb22 | unique-data-2 |
| 3 | N | cc33 | unique-data-3 |
I would like to return all available lists, along with whether a particular item is on each list, together with its metadata.
declare @item_id nvarchar(20) = 'A-11'
declare @customer_id nvarchar(20) = '001'
select
[key],
[isFavorited] = max(case when [item_id] = @item_id then 'Y' else 'N' end),
[list_name],
[meta]
from favorites
where customer_id = @customer_id
group by [list_name], [key], [meta]
Issues when trying various methods:
Because meta is unique per row, including it in the group by puts every row in its own group, so the rows are no longer collapsed to one per list.
A cross apply like the following doesn't return the meta that belongs to the matching key.
cross apply (
select top 1
[meta]
from favorites
where customer_id = @customer_id
)
When selecting by row number, the actual key to join back to is lost, so I'm unable to join the meta.
"noRow" = row_number() over(order by h_po_no asc)
I'd like to
Pass in an item_id and customer_id
Return all lists for that customer
Get favorite status for each list of passed in item_id
An item is flagged favorite if it matches both list_name and item_id for a given customer_id
Get row primary key and meta data
How can I return a distinct selection of list_name, isFavorited status, key, and its meta?
To return the desired output you don't need any aggregation at all. A simple case expression and a where clause will accomplish this.
declare @Favorites table
(
MyKey int
, item_id varchar(10)
, list_name varchar(10)
, customer_id varchar(10)
, meta varchar(20)
)
insert @Favorites values
(1, 'A-1', 'list-1', '001', 'unique-data-1')
, (2, 'A-1', 'list-2', '001', 'unique-data-2')
, (3, 'A-2', 'list-3', '001', 'unique-data-3')
, (4, 'A-2', 'list-1', '002', 'unique-data-1')
select *
from @Favorites
declare @item_id nvarchar(20) = 'A-1'
, @customer_id nvarchar(20) = '001'
-- one output row per favorites row for this customer; no aggregation needed
select f.MyKey
, isFavorited = case when f.item_id = @item_id then 'Y' else 'N' end
, listName = f.list_name
, f.meta
from @Favorites f
where f.customer_id = @customer_id
order by f.MyKey
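The question's sample data has several rows for list aa11 under customer 001, while the desired output shows one row per list. If one row per list is the goal, a ROW_NUMBER() ranking that prefers the row matching the item is one option. This is a hedged sketch, not the answerer's method: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), uses the question's sample rows, and the tie-break on the lowest key is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE favorites ("key" INTEGER, item_id TEXT, list_name TEXT,
                            customer_id TEXT, meta TEXT);
    INSERT INTO favorites VALUES
        (1, 'A-11', 'aa11', '001', 'unique-data-1'),
        (2, 'A-11', 'bb22', '001', 'unique-data-2'),
        (3, 'A-26', 'cc33', '001', 'unique-data-3'),
        (4, 'A-28', 'aa11', '002', 'unique-data-4'),
        (5, 'J-52', 'aa11', '001', 'unique-data-5'),
        (6, 'X-53', 'aa11', '001', 'unique-data-6');
""")

# Rank rows inside each list: the row for the requested item first,
# then lowest key (assumed tie-break). Keeping rn = 1 gives one row per list.
rows = conn.execute("""
    WITH ranked AS (
        SELECT "key", list_name, meta,
               CASE WHEN item_id = :item_id THEN 'Y' ELSE 'N' END AS isFavorited,
               ROW_NUMBER() OVER (
                   PARTITION BY list_name
                   ORDER BY CASE WHEN item_id = :item_id THEN 0 ELSE 1 END, "key"
               ) AS rn
        FROM favorites
        WHERE customer_id = :customer_id
    )
    SELECT "key", isFavorited, list_name, meta
    FROM ranked
    WHERE rn = 1
    ORDER BY "key"
""", {"item_id": "A-11", "customer_id": "001"}).fetchall()

for row in rows:
    print(row)
```

Because the key and meta come from the same ranked row, they stay paired without any join back to the table.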
Test Parameters
declare @item_id nvarchar(20) = 'A-11'
declare @customer_id nvarchar(20) = '001'
Solution
drop table if exists #tmp
;with cte as (
select
[key],
[list_name],
[rn] = row_number() over (partition by list_name order by list_name desc)
from favorites
where customer_id = @customer_id
group by [list_name], [key], [meta]
)
select *
into #tmp
from cte
where [rn] = 1
select
[i],
[json],
#t.[tmp]
from #tmp #t
inner join (
select
[isFavorited] = max(case when [item_id] = @item_id then 'Y' else 'N' end),
[list_name]
from favorites
where customer_id = @customer_id
group by [list_name]
) j
on j.list_name = #t.list_name

Track the changes of a few columns in an existing table leveraging primary keys?

I'm currently trying to track the changes of a few columns (let's call them col1 & col2) in a SQL Server table. The table is not being "updated/inserted/deleted" over time; new records are just being added to it (please see below 10/01 vs 11/01).
My end goal is a SQL query or stored procedure that highlights the changes over time, keyed by primary key, in the following format:
PrimaryKey | ColumnName | BeforeValue | AfterValue | Date
e.g.
Original table:
+-------+--------+--------+--------+
| PK1 | Col1 | Col2 | Date |
+-------+--------+--------+--------+
| 1 | a | e | 10/01 |
| 1 | b | e | 11/01 |
| 2 | c | e | 10/01 |
| 2 | d | f | 11/01 |
+-------+--------+--------+--------+
Output:
+--------------+--------------+---------------+--------------+--------+
| PrimaryKey | ColumnName | BeforeValue | AfterValue | Date |
+--------------+--------------+---------------+--------------+--------+
| 1 | Col1 | a | b | 11/01 |
| 2 | Col1 | c | d | 11/01 |
| 2 | Col2 | e | f | 11/01 |
+--------------+--------------+---------------+--------------+--------+
Any help appreciated.
Here is some code that is a bit clunky but seems to work. For each row, I look for the most recent earlier row with a different value. This is done twice, once for Col1 and once for Col2.
To make it work I had to add a unique PK column. I don't know whether you have one, but you can easily add it as an identity column, either to your real table or to the table used for the calculations.
declare @TestTable table (PK int, PK1 int, Col1 varchar(1), Col2 varchar(1), [Date] date)
insert into @TestTable (PK, PK1, Col1, Col2, [Date])
select 1, 1, 'a', 'e', '10 Jan 2018'
union all select 2, 1, 'b', 'e', '11 Jan 2018'
union all select 3, 2, 'c', 'e', '10 Jan 2018'
union all select 4, 2, 'd', 'f', '11 Jan 2018'
select T1.[Date], T1.PK1, 'Col1', T2.Col1, T1.Col1
from @TestTable T1
inner join @TestTable T2 on T2.PK = (
select top 1 PK
from @TestTable T21
where T21.PK1 = T1.PK1 and T21.Col1 != T1.Col1 and T21.[Date] < T1.[Date]
order by T21.[Date] desc
)
union all
select T1.[Date], T1.PK1, 'Col2', T3.Col2, T1.Col2
from @TestTable T1
inner join @TestTable T3 on T3.PK = (
select top 1 PK
from @TestTable T31
where T31.PK1 = T1.PK1 and T31.Col2 != T1.Col2 and T31.[Date] < T1.[Date]
order by T31.[Date] desc
)
order by [Date], PK1
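An alternative that avoids the extra PK column is the LAG window function, which SQL Server supports from 2012 onward. Note that it is a different technique from the self-join above: LAG compares each row only with the immediately preceding row for the same PK1, rather than with the latest earlier row that has a different value. The sketch below runs on SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions), and the ISO date strings are stand-ins for the dates used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TestTable (PK1 INTEGER, Col1 TEXT, Col2 TEXT, Date TEXT);
    INSERT INTO TestTable VALUES
        (1, 'a', 'e', '2018-01-10'),
        (1, 'b', 'e', '2018-01-11'),
        (2, 'c', 'e', '2018-01-10'),
        (2, 'd', 'f', '2018-01-11');
""")

# For every row, fetch the previous row's values for the same PK1,
# then report one line per column whose value differs from the previous one.
changes = conn.execute("""
    WITH prev AS (
        SELECT PK1, Date, Col1, Col2,
               LAG(Col1) OVER (PARTITION BY PK1 ORDER BY Date) AS PrevCol1,
               LAG(Col2) OVER (PARTITION BY PK1 ORDER BY Date) AS PrevCol2
        FROM TestTable
    )
    SELECT PK1, 'Col1' AS ColumnName, PrevCol1 AS BeforeValue, Col1 AS AfterValue, Date
    FROM prev
    WHERE PrevCol1 IS NOT NULL AND PrevCol1 <> Col1
    UNION ALL
    SELECT PK1, 'Col2', PrevCol2, Col2, Date
    FROM prev
    WHERE PrevCol2 IS NOT NULL AND PrevCol2 <> Col2
    ORDER BY PK1, ColumnName
""").fetchall()

for change in changes:
    print(change)
```

The first row of each PK1 has no predecessor, so its LAG values are NULL and it is filtered out rather than reported as a change.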

SQL Server stored procedure - how to sort the records with tree format?

I have a table with these records:
+----+-------------+---------+
| ID | Name | ParentID|
+----+-------------+---------+
| 1 | Item 1 | -1 |
| 2 | Item 2 | -1 |
| 3 | Item 1.1 | 1 |
| 4 | Item 1.2 | 1 |
| 5 | Item 2.1 | 2 |
| 6 | Item 1.1.1 | 3 |
| 7 | Item 1.2.1 | 4 |
| 8 | Item 2.2 | 2 |
| 9 | Item 1.1.1.1| 6 |
+----+-------------+---------+
I want to select the records in tree order.
How can I get a result like the table below with a stored procedure? The values in the Name column are only examples; they could be ANY WORD.
+----+-------------+---------+
| ID | Name | ParentID|
+----+-------------+---------+
| 1 | Item 1 | -1 |
| 3 | Item 1.1 | 1 |
| 6 | Item 1.1.1 | 3 |
| 9 | Item 1.1.1.1| 6 |
| 4 | Item 1.2 | 1 |
| 7 | Item 1.2.1 | 4 |
| 2 | Item 2 | -1 |
| 5 | Item 2.1 | 2 |
| 8 | Item 2.2 | 2 |
+----+-------------+---------+
Sorry, I'm a beginner with stored procedures, so I don't know how to get a result like the table above.
Thank you for reading
I assume parent nodes must come before their child nodes.
Here is my query:
DECLARE @SampleData AS TABLE( ID int, Name varchar(20) , ParentID int)
INSERT INTO @SampleData VALUES ( 1 ,' Item 1', -1 )
INSERT INTO @SampleData VALUES ( 2 ,' Item 2' , -1 )
INSERT INTO @SampleData VALUES ( 3 ,' Item 1.1' , 1 )
INSERT INTO @SampleData VALUES ( 4 ,' Item 1.2' , 1 )
INSERT INTO @SampleData VALUES ( 5 ,' Item 2.1' , 2 )
INSERT INTO @SampleData VALUES ( 6 ,' Item 1.1.1' , 3 )
INSERT INTO @SampleData VALUES ( 7 ,' Item 1.2.1' , 4 )
INSERT INTO @SampleData VALUES ( 8 ,' Item 2.2' , 2 )
;with cte as (
-- anchor: root rows, each remembered as its own RootId
select t.Id, t.Name, t.ParentID, 1 as lev, t.Id AS RootId
from @SampleData t
where t.ParentID = -1
union all
-- recursive step: children inherit RootId and sit one level deeper
select t.ID, t.Name,t.ParentID,cte.lev +1, cte.RootId
from cte
INNER JOIN @SampleData t on t.ParentID = cte.Id
)
SELECT c.Id, c.Name, c.ParentID FROM cte c
ORDER BY c.RootId, c.lev
OPTION (MAXRECURSION 0)
If you want the rows in depth-first order, this can be done with a recursive function.
CREATE TABLE SampleData ( ID int, Name varchar(20) , ParentID int)
INSERT INTO SampleData VALUES ( 1 ,' Item 1', -1 )
INSERT INTO SampleData VALUES ( 2 ,' Item 2' , -1 )
INSERT INTO SampleData VALUES ( 3 ,' Item 1.1' , 1 )
INSERT INTO SampleData VALUES ( 4 ,' Item 1.2' , 1 )
INSERT INTO SampleData VALUES ( 5 ,' Item 2.1' , 2 )
INSERT INTO SampleData VALUES ( 6 ,' Item 1.1.1' , 3 )
INSERT INTO SampleData VALUES ( 7 ,' Item 1.2.1' , 4 )
INSERT INTO SampleData VALUES ( 8 ,' Item 2.2' , 2 )
Now create the function:
CREATE FUNCTION DisplayTree
(
@RootId int
)
RETURNS
@result TABLE (
ID int, Name varchar(20) , ParentID int
)
AS
BEGIN
DECLARE @Temp AS TABLE
(
ID int, Name varchar(20) , ParentID int
)
INSERT INTO @Temp SELECT * FROM SampleData WHERE ParentID = @RootId
WHILE(EXISTS(SELECT 1 FROM @Temp t))
BEGIN
DECLARE @CurrentRootId int
SELECT TOP 1 @CurrentRootId = t.ID FROM @Temp t ORDER BY t.ID ASC
INSERT INTO @result SELECT * FROM @Temp t WHERE t.ID = @CurrentRootId
DELETE FROM @Temp WHERE ID = @CurrentRootId
INSERT INTO @result SELECT * FROM dbo.DisplayTree(@CurrentRootId)
END
RETURN ;
END
GO
Then execute the function:
SELECT * FROM dbo.DisplayTree(-1)
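Depth-first order can also be produced without a function by carrying a sortable path through a recursive CTE and ordering by it. This is a sketch of that alternative, not the approach used above. It runs on SQLite through Python's sqlite3 module with the question's nine rows; the SortPath column and the zero-padding width of 10 digits are assumptions made for illustration. Siblings are ordered by ID, and the Name values are never inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SampleData (ID INTEGER, Name TEXT, ParentID INTEGER);
    INSERT INTO SampleData VALUES
        (1, 'Item 1', -1), (2, 'Item 2', -1), (3, 'Item 1.1', 1),
        (4, 'Item 1.2', 1), (5, 'Item 2.1', 2), (6, 'Item 1.1.1', 3),
        (7, 'Item 1.2.1', 4), (8, 'Item 2.2', 2), (9, 'Item 1.1.1.1', 6);
""")

# Each row's SortPath is its parent's path plus its own zero-padded ID,
# so a plain string sort visits every parent right before its subtree.
ordered_ids = [row[0] for row in conn.execute("""
    WITH RECURSIVE tree(ID, Name, ParentID, SortPath) AS (
        SELECT ID, Name, ParentID, printf('%010d', ID)
        FROM SampleData
        WHERE ParentID = -1
        UNION ALL
        SELECT s.ID, s.Name, s.ParentID,
               t.SortPath || '/' || printf('%010d', s.ID)
        FROM SampleData s
        JOIN tree t ON s.ParentID = t.ID
    )
    SELECT ID, Name, ParentID
    FROM tree
    ORDER BY SortPath
""")]

print(ordered_ids)
```

In SQL Server the same idea would need the path built with its own string functions in place of printf and `||`; that translation is left as an assumption here.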
You need to split the Name column into N separate columns and then ORDER BY those columns.
Schema:
CREATE TABLE #TAB(ID INT, Name VARCHAR(20),ParentID INT)
INSERT INTO #TAB
SELECT 1, 'Item 1', -1
UNION ALL
SELECT 2, 'Item 2' , -1
UNION ALL
SELECT 3, 'Item 1.1' , 1
UNION ALL
SELECT 4, 'Item 1.2' , 1
UNION ALL
SELECT 5, 'Item 2.1' , 2
UNION ALL
SELECT 6, 'Item 1.1.1', 4
UNION ALL
SELECT 7, 'Item 1.2.1', 6
UNION ALL
SELECT 8, 'Item 2.2', 2
I converted the Name column to XML and then extracted each part into its own column using the XML value() method.
;WITH CTE AS
(
SELECT
ID,
Name,
ParentID,
CAST('<M>'+REPLACE (REPLACE(Name,' ','.'),'.','</M><M>')+'</M>' AS XML)
AS XML_SPLT
FROM #TAB
)
SELECT
ID,
Name,
ParentID,
XML_SPLT.value('/M[1]', 'varchar(50)') As P0,
XML_SPLT.value('/M[2]', 'int') As P1,
XML_SPLT.value('/M[3]', 'int') As P2,
XML_SPLT.value('/M[4]', 'int') As P3
FROM CTE
ORDER BY P0,P1,P2,P3
GO
This will give you the order as you are expecting.
+----+------------+----------+------+----+------+------+
| ID | Name | ParentID | P0 | P1 | P2 | P3 |
+----+------------+----------+------+----+------+------+
| 1 | Item 1 | -1 | Item | 1 | NULL | NULL |
| 3 | Item 1.1 | 1 | Item | 1 | 1 | NULL |
| 6 | Item 1.1.1 | 4 | Item | 1 | 1 | 1 |
| 4 | Item 1.2 | 1 | Item | 1 | 2 | NULL |
| 7 | Item 1.2.1 | 6 | Item | 1 | 2 | 1 |
| 2 | Item 2 | -1 | Item | 2 | NULL | NULL |
| 5 | Item 2.1 | 2 | Item | 2 | 1 | NULL |
| 8 | Item 2.2 | 2 | Item | 2 | 2 | NULL |
+----+------------+----------+------+----+------+------+

Update postgres table to squash duplicate values in second table

I have a postgresql schema with two tables:
tableA: tableB:
| id | username | | fk_id | resource |
| 1 | user1 | | 2 | item1 |
| 2 | user1 | | 1 | item3 |
| 3 | user1 | | 1 | item2 |
| 4 | user2 | | 4 | item5 |
| 5 | user2 | | 5 | item8 |
| 6 | user3 | | 3 | item9 |
The foreign key fk_id in tableB references id in tableA.
How can I update all of the foreign key ids in tableB to point to the lowest id for each unique username in tableA?
update table_b b
set fk_id = d.id
from table_a a
join (
select distinct on (username) username, id
from table_a
order by 1, 2
) d using(username)
where a.id = b.fk_id;
The query used inside the update gives actual_id, username, desired_id:
select a.id actual_id, username, d.id desired_id
from table_a a
join (
select distinct on (username) username, id
from table_a
order by 1, 2
) d using(username)
actual_id | username | desired_id
-----------+----------+------------
1 | user1 | 1
2 | user1 | 1
3 | user1 | 1
4 | user2 | 4
5 | user2 | 4
6 | user3 | 6
(6 rows)
We define your tables:
CREATE TABLE tableA (id, username) AS
SELECT * FROM
(
VALUES
(1, 'user1'),
(2, 'user1'),
(3, 'user1'),
(4, 'user2'),
(5, 'user2'),
(6, 'user3')
) AS x ;
CREATE TABLE tableB (fk_id, resource) AS
SELECT * FROM
(
VALUES
(2, 'item1'),
(1, 'item3'),
(1, 'item2'),
(4, 'item5'),
(5, 'item8'),
(3, 'item9')
) AS x ;
With that info, you can create a (virtual) conversion table, and use it to update your data:
-- Using tableA, make a new table with the
-- minimum id for every username
WITH username_to_min_id AS
(
SELECT
min(id) AS min_id, username
FROM
tableA
GROUP BY
username
)
-- Convert the previous table to a id -> min_id
-- conversion table
, id_to_min_id AS
(
SELECT
id, min_id
FROM
tableA
JOIN username_to_min_id USING(username)
)
-- Use this conversion table to update tableB
UPDATE
tableB
SET
fk_id = min_id
FROM
id_to_min_id
WHERE
-- JOIN condition with table to update
id_to_min_id.id = tableB.fk_id
-- Take out the ones that won't change
AND (fk_id <> min_id)
RETURNING
* ;
The result you would get is:
+-------+----------+----+--------+
| fk_id | resource | id | min_id |
+-------+----------+----+--------+
| 1 | item1 | 2 | 1 |
| 1 | item9 | 3 | 1 |
| 4 | item8 | 5 | 4 |
+-------+----------+----+--------+
This shows that three rows were updated: their fk_id values were (2, 3, 5) and are now (1, 1, 4). (The id column holds the old fk_id value.)
You can check it at http://rextester.com/EQPH47434
You can inline everything (replace each virtual table name with its definition and simplify a couple of SELECTs) to get this equivalent, though probably less readable, query:
UPDATE
tableB
SET
fk_id = min_id
FROM
tableA
JOIN
(
SELECT
min(id) AS min_id, username
FROM
tableA
GROUP BY
username
) AS username_to_min_id
USING (username)
WHERE
tableA.id = tableB.fk_id
AND (fk_id <> min_id)
RETURNING
* ;
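For a quick check of the intended end state, the same remapping can be written as a correlated subquery, which avoids the PostgreSQL-specific UPDATE ... FROM and RETURNING syntax. This is a minimal sketch run on SQLite through Python's sqlite3 module, using the question's data. It is a different formulation from the queries above, and unlike them it rewrites every row, including those whose fk_id already points at the lowest id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE tableB (fk_id INTEGER REFERENCES tableA(id), resource TEXT);
    INSERT INTO tableA VALUES
        (1, 'user1'), (2, 'user1'), (3, 'user1'),
        (4, 'user2'), (5, 'user2'), (6, 'user3');
    INSERT INTO tableB VALUES
        (2, 'item1'), (1, 'item3'), (1, 'item2'),
        (4, 'item5'), (5, 'item8'), (3, 'item9');
""")

# For each tableB row: find the username behind its current fk_id (a1),
# then take the smallest id that shares that username (a2).
# A fk_id with no match in tableA would become NULL; none exist in this data.
conn.execute("""
    UPDATE tableB
    SET fk_id = (
        SELECT MIN(a2.id)
        FROM tableA a1
        JOIN tableA a2 ON a2.username = a1.username
        WHERE a1.id = tableB.fk_id
    )
""")

result = conn.execute(
    "SELECT fk_id, resource FROM tableB ORDER BY resource"
).fetchall()
print(result)
```

Afterwards every resource of user1 points at id 1 and every resource of user2 at id 4, which matches the desired_id column shown in the first answer.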

SQL Server 2012 - Looking for duplicates with differences

In SQL Server 2012, I have a table like this:
Id | AccountID | Accession | Status
----------------------------------------
1 | 1234567 | ABCD | F
2 | 1234567 | ABCD | F
3 | 2345678 | BCDE | F
4 | 8765432 | BCDE | F
5 | 3456789 | CDEF | F
6 | 9876543 | CDEF | A
I need to find rows that have the same Accession and a Status of "F", but a different AccountID.
I need a query that would return:
Id | AccountID | Accession | Status
----------------------------------------
3 | 2345678 | BCDE | F
4 | 8765432 | BCDE | F
1 and 2 wouldn't be returned because they have the same AccountID. 5 and 6 wouldn't be returned because the status on 6 is "A" and not "F".
You could do something like this.
;WITH NonDupAccountIDs AS
(
SELECT AccountID,Accession, Status
FROM MyTable
WHERE Status = 'F'
GROUP BY AccountID,Accession, Status
HAVING COUNT(Id) = 1
)
,DupAccessions AS
(
SELECT Accession
FROM MyTable
WHERE Status = 'F'
GROUP BY Accession
HAVING COUNT(AccountID) > 1
)
select a.AccountID, a.Accession, a.Status
FROM NonDupAccountIDs a
INNER JOIN DupAccessions b
ON a.Accession = b.Accession
Another alternative
Declare @Table table (id int,AccountID varchar(25),Accession varchar(25),Status varchar(25))
Insert into @Table (id , AccountID , Accession , Status) values
(1, 1234567,'ABCD','F'),
(2, 1234567,'ABCD','F'),
(3, 2345678,'BCDE','F'),
(4, 8765432,'BCDE','F'),
(5, 3456789,'CDEF','F'),
(6, 9876543,'CDEF','A')
Select A.*
from @Table A
Join (
Select Accession
From @Table
Where Status='F'
Group By Accession
Having Min(Accession)=Max(Accession)
and count(Distinct AccountID)>1
) B on a.Accession=B.Accession
Returns
id AccountID Accession Status
3 2345678 BCDE F
4 8765432 BCDE F
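One detail worth checking in the query above: the outer SELECT joins back on Accession only, so a row with a status other than 'F' would also be returned if its Accession qualifies. Below is a hedged sketch of the same COUNT(DISTINCT ...) idea with the status filter repeated on the outer query. It runs on SQLite through Python's sqlite3 module; the table name MyTable is borrowed from the first answer, and row 7 is hypothetical data added only to exercise that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Id INTEGER, AccountID TEXT, Accession TEXT, Status TEXT);
    INSERT INTO MyTable VALUES
        (1, '1234567', 'ABCD', 'F'),
        (2, '1234567', 'ABCD', 'F'),
        (3, '2345678', 'BCDE', 'F'),
        (4, '8765432', 'BCDE', 'F'),
        (5, '3456789', 'CDEF', 'F'),
        (6, '9876543', 'CDEF', 'A'),
        (7, '1111111', 'BCDE', 'A');  -- hypothetical extra row, not in the question
""")

# Inner query: accessions that have more than one distinct AccountID among 'F' rows.
# Outer query: return only the 'F' rows for those accessions.
rows = conn.execute("""
    SELECT t.Id, t.AccountID, t.Accession, t.Status
    FROM MyTable t
    JOIN (
        SELECT Accession
        FROM MyTable
        WHERE Status = 'F'
        GROUP BY Accession
        HAVING COUNT(DISTINCT AccountID) > 1
    ) d ON d.Accession = t.Accession
    WHERE t.Status = 'F'
    ORDER BY t.Id
""").fetchall()

for row in rows:
    print(row)
```

Without the outer WHERE clause, the hypothetical row 7 would appear in the result alongside rows 3 and 4.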
This works as well. If there are multiple sets of duplicates, it returns only the one with the highest ID.
John Cappelletti had a great solution as well; his returns all duplicated values if any incongruity exists.
I had to add some more data to see what would happen. You should decide how you want to treat these occurrences.
select
max(ID) ID,AccountID, Accession
from p where Status = 'F'
group by AccountID, Accession
having
(select count(Accession) from (select max(ID) ID,AccountID, Accession from p where Status = 'F' group by AccountID, Accession) f where f.accession = p.accession)>1
;
SELECT t2.Id, t1.AccountID, t1.Accession, t1.Status
FROM TABLE_NAME t2
INNER JOIN (
SELECT AccountID, Accession, Status
FROM TABLE_NAME
GROUP BY Status, Accession, AccountID
) t1
ON t1.AccountID = t2.AccountID
This may need some adjusting, but it should get you close. Remember to replace TABLE_NAME with your table name.
