SQL Merge Duplicate Values in Table and Related Mapping Table - sql-server

I have two tables. One is the parent data table, the other is a mapping table for fulfilling a many-to-many relationship between this parent data table and the main table. My problem is that the parent and mapping tables have duplicate values that need to be merged. I can seemingly remove the duplicates from the parent table, but the mapping table needs to have the duplicate data merged in the same fashion. There is a FK and related cascading delete/update on the Mapping Table. How do I ensure the merges from the following statement also get reflected in the Mapping Table?
Before
Parent Table_A:
| ID | ProductName | MFG_ID |
|------+-------------+------------+
| 1 | ACME_123 | 123 |
| 2 | ACME_123 | 456 |
Mapping Table
| ID | MainRecordID | ParentTable.MFG_ID|
|------+--------------+-----------------------+
| 1 | 1 | 123 |
| 2 | 2 | 456 |
Desired After
Parent Table_A:
| ID | ProductName | MFG_ID|
|------+-------------+------------+
| 1 | ACME_123 | 123 |
Mapping Table
| ID | MainRecordID | ParentTable.MFG_ID|
|------+--------------+-----------------------+
| 1 | 1 | 123 |
| 2 | 2 | 123 |
Proposed Code to Merge Table_A Duplicates
MERGE Table_A
USING
(
SELECT
MIN(ID) ID,
ProductName,
MIN(MFG_ID) MFG_ID,
FROM Table_A
GROUP BY ProductName
) NewData ON Table_A.ID = NewData.ID
WHEN MATCHED THEN
UPDATE SET
Table_A.ProductName = NewData.ProductName
WHEN NOT MATCHED BY SOURCE THEN DELETE;

Split it into two separate statements wrapped in an explicit transaction instead of a merge. Something like this:
declare #src table
(
Id int,
ProductName varchar(128),
MFG_ID int
)
set xact_abort on
insert into #src
select
Id = min(ID),
ProductName = ProductName,
MFG_ID = MIN(MFG_ID) ,
from Table_A
group by ProductName
begin tran
delete o
from Table_A o
where not exists
(
select 1
from #src i
where o.id = i.id
)
update t
set ProductName = s.ProductName
from Table_A t
inner join #Src s
on t.Id = s.Id
commit tran

Related

How to INSERT rows based on other rows?

I need to run a query that will INSERT new rows into a SQL Server join table.
Suppose I have the following tables to describe which products a store sells and in which states:
products:
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 1 | Laptop |
| 2 | Aspirin |
| 3 | Mattress |
+------------+--------------+
stores:
+----------+------------+
| store_id | store_name |
+----------+------------+
| 1 | Walmart |
| 2 | Best Buy |
| 3 | Sam's Club |
+----------+------------+
products_stores_states:
+------------+----------+-------+
| product_id | store_id | state |
+------------+----------+-------+
| 1 | 2 | AL |
| 1 | 2 | AR |
| 2 | 2 | AL |
| 2 | 2 | AR |
| 3 | 2 | AL |
| 3 | 2 | AR |
+------------+----------+-------+
So here we see that Best Buy sells all 3 products in AL and AR.
What I need to do is somehow insert rows into the products_stores_states table to add AZ for all products it currently sells.
With a small dataset, I could do this manually, row by row:
INSERT INTO products_stores_states (product_id, store_id, state) VALUES
(1,2,'AZ'),
(2,2,'AZ'),
(3,2,'AZ');
Since this is a large dataset, this is not really an option.
How would I go about inserting a new state for Best Buy for every product_id that the products_stores_states table already contains for Best Buy?
Bonus: If a query could be made to do this for multiple states that the same time, that would be even better.
Right now, I cannot wrap my head around how to do this, but I assume there would need to be a subquery to get the list of matching product_id values I need to use.
The following query will do what you want to do
DECLARE #temp TABLE (
state VARCHAR(20)
)
-- we are inserting state names into a temp table to use it further
INSERT INTO #temp (state)
VALUES
('AZ'),
('MA'),
('TX');
INSERT INTO products_stores_states(product_id, store_id, state )
SELECT
T.product_id,
T.store_id,
temp.state
FROM(
SELECT DISTINCT
Product_id, store_id
FROM
products_stores_states
WHERE store_id = 2 -- the store_id for which you want to make changes
) AS T
CROSS JOIN
#temp AS temp
At first, we are storing the state names into a table variable. Then we need to select only the distinct store_id and product_id combinations for a specific store.
Then we should insert the distinct values cross join with the table variable where we stored state names.
Here is the live demo.
Hope, this helps! Thanks.
If you are positive the inserted state is "NEW" to the table, something like this would work, changing the state variable to whatever you want to insert the new records.
DECLARE #State CHAR(2), #StoreId INT;
SET #State = 'AZ';
SET #StoreId = 2;
INSERT INTO products_stores_states (product_id, store_id, state)
SELECT DISTINCT product_id, #StoreId, #State
FROM dbo.products_stores_states
WHERE store_id = #StoreId;
You could first see what this statement would add using this:
DECLARE #State CHAR(2)
SET #State = 'AZ';
SET #StoreId = 2;
SELECT DISTINCT product_id, #StoreId, #State
FROM dbo.products_stores_states
WHERE store_id = #StoreId;

Delete Rows in table if condition is true

Aim: Using a Query - Delete all customers with no Assets
I have 2 tables. I want to delete all the rows where my count = 0:
My condition
if (SELECT COUNT(*) FROM Asset_Table WHERE Customer_ID='myString' ==0){
Delete row from Account_table
}
Accounts_Table :
CustomerID | Customer_Name | Onwner
-------------------------------------------------------------
123 | Jake | someowner1
124 | Ralph | someowner2
... | .... | ....
Asset_Table:
AssetINDEX | Customer_ID | Serial_ID |
-------------------------------------------------
5564 | 123 | xyz
5565 | 128 | xyz1
.... | ... | xyz2
Expected Result
Accounts_Table :
CustomerID | Customer_Name | Onwner
-------------------------------------------------------------
123 | Jake | someowner1
... | .... | ....
Asset_Table:
AssetINDEX | Customer_ID | Serial_ID |
-------------------------------------------------
5564 | 123 | xyz
.... | ... | xyz2
It appears in the "results" you gave in your question that you would like the data to be cleaned both ways. So that the only records remaining have references in both tables. To do this you need two statements and you can use the EXISTS clause.
DELETE FROM Accounts_Table
WHERE NOT EXISTS (SELECT CustomerID FROM Asset_Table WHERE Accounts_Table.CustomerID
= Asset_Table.Customer_ID)
DELETE FROM Asset_Table
WHERE NOT EXISTS (SELECT AssetINDEX FROM Accounts_Table where Asset_Table.Customer_ID
= Accounts_Table.CustomerID)
Delete all customers with no Assets
You can use exists to verify not existing rows in Asset_Table:
delete from Account_table AT
where not exists (select AssetINDEX from Asset_Table
where Customer_ID = AT.CustomerID)
Below query should work;
delete from Accounts_Table where CustomerID not in (select Customer_ID from Asset_Table )
DELETE C FROM
Account_table AS C
WHERE
NOT EXISTS (
SELECT
'the customer does not have any asset'
FROM
Asset_Table AS A
WHERE
A.Customer_ID = C.Customer_ID)

SQL Server - How to check if two items have the same relations to another set of items?

I have table with Employees (tblEmployee):
| ID | Name |
| 1 | Smith |
| 2 | Black |
| 3 | Thompson |
And a table with Roles (tblRoles):
| ID | Name |
| 1 | Submitter |
| 2 | Receiver |
| 3 | Analyzer |
I have also a table with relations of Employees to their Roles with many to many relation type (tblEmployeeRoleRel):
| EmployeeID | RoleID |
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 3 |
I need to select ID, Name from tblEmployee that have exaclty the same set of roles from tblEmployeeRoleRel as has the Employee with ID = 1. How can I do it?
Use a where clause to limit the roles you're looking at to those of employeeID of 1 and use a having clause to make sure that the employee's role count matches that of employee1.
SELECT A.EmployeeID
FROM tblEmployeeRoleRel A
WHERE Exists (SELECT 1
FROM tblEmployeeRoleRel B
WHERE B.EmployeeID = 1
and B.RoleID = A.RoleID)
GROUP BY A.EmployeeID
HAVING count(A.RoleID) = (SELECT count(C.RoleID)
FROM tblEmployeeRoleRel C
WHERE EmployeeID = 1)
This assumes that employeeID and roleID are unique in tblEmployeeRoleRel otherwise we may have to distinct the roleID fields above.
Declare #EmployeeID int = 1 -- change this to whatever employee ID you like, or perhaps you'd pass an Employee ID to it in a stored procedure.
Select Distinct e.EmployeeID -- normally distinct would incur extra overhead, but in this case you only want the employee IDs. not using Distinct when an employee has multiple roles will give you multiple employee IDs.
from tblEmployeeRoleRel as E
where E.EmployeeID not in
(Select EmployeeID from tblEmployeeRoleRel where RoleID not in (Select RoleID from tblEmployeeRoleRel where Employee_ID = #EmployeeID))
and exists (Select EmployeeID from tblEmployeeRoleRel where EmployeeID = e.EmployeeID) -- removes any "null" matches.
and E.Employee_ID <> #Employee_ID -- this keeps the employee ID itself from matching.

Troubleshooting to implement SQL Server trigger

I have this table called InspectionsReview:
CREATE TABLE InspectionsReview
(
ID int NOT NULL AUTO_INCREMENT,
InspectionItemId int,
SiteId int,
ObjectId int,
DateReview DATETIME,
PRIMARY KEY (ID)
);
Here how the table looks:
+----+------------------+--------+-----------+--------------+
| ID | InspectionItemId | SiteId | ObjectId | DateReview |
+----+------------------+--------+-----------+--------------+
| 1 | 3 | 3 | 3045 | 20-05-2016 |
| 2 | 5 | 45 | 3025 | 01-03-2016 |
| 3 | 4 | 63 | 3098 | 05-05-2016 |
| 4 | 5 | 5 | 3041 | 03-04-2016 |
| 5 | 3 | 97 | 3092 | 22-02-2016 |
| 6 | 1 | 22 | 3086 | 24-11-2016 |
| 7 | 9 | 24 | 3085 | 15-12-2016 |
+----+------------------+--------+-----------+--------------+
I need to write trigger that checks before the new row is inserted to the table if the table already has row with columns values 'ObjectId' and 'DateReview' that equal to the columns values of the row that have to be inserted, if it's equal I need to get the ID of the exited row and to put to trigger variable called duplicate .
For example, if new row that has to be inserted is:
INSERT INTO InspectionsReview (InspectionItemId, SiteId, ObjectId, DateReview)]
VALUES (4, 63, 3098, '05-05-2016');
The duplicate variable in SQL Server trigger must be equal to 3.
Because the row in InspectionsReview table were ID = 3 has ObjectId and DateReview values the same as in new row that have to be inserted. How can I implement this?
With the extra assumption that you want to log all the duplicate to a different table, then my solution would be to create an AFTER trigger that would check for the duplicate and insert it into your logging table.
Of course, whether this is the solution depends on whether my extra assumption is valid.
Here is my logging table.
CREATE TABLE dbo.InspectionsReviewLog (
ID int
, ObjectID int
, DateReview DATETIME
, duplicate int
);
Here is the trigger (pretty straightforward with the extra assumption)
CREATE TRIGGER tr_InspectionsReview
ON dbo.InspectionsReview
AFTER INSERT
AS
BEGIN
DECLARE #tableVar TABLE(
ID int
, ObjectID int
, DateReview DATETIME
);
INSERT INTO #tableVar (ID, ObjectID, DateReview)
SELECT DISTINCT inserted.ID, inserted.ObjectID, inserted.DateReview
FROM inserted
JOIN dbo.InspectionsReview ir ON inserted.ObjectID=ir.ObjectID AND inserted.DateReview=ir.DateReview AND inserted.ID <> ir.ID;
INSERT INTO dbo.InspectionsReviewLog (ID, ObjectID, DateReview, duplicate)
SELECT ID, ObjectID, DateReview, 3
FROM
#tableVar;
END;

Split table in two tables plus a link table

I have a table with three columns with double values, but no double rows. Now I want to split this table in two table with unique values and a link table. I think the Problem gets clearer when I Show you example tables:
Original:
| ID | Column_1 | Column_2 | Column_3 |
|----|----------|----------|----------|
| 1 | A | 123 | A1 |
| 2 | A | 123 | A2 |
| 3 | B | 234 | A2 |
| 4 | C | 456 | A1 |
Table_1
| ID | Column_1 | Column_2 |
|----|----------|----------|
| 1 | A | 123 |
| 2 | B | 234 |
| 3 | C | 456 |
Table_2
| ID | Column_3 |
|----|----------|
| 1 | A1 |
| 2 | A2 |
Link-Table
| ID | fk1 | fk2 |
|----|-----|-----|
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 2 |
| 4 | 3 | 1 |
Table_1 I created like this:
INSERT INTO Table_1(Column_1, Column_2)
SELECT DISTINCT Column_1, Column_2 FROM Original
WHERE Original.Column_1 NOT IN (SELECT Column_1 FROM Table_1)
Table_2 I created in the same way.
The question now is, how to create the Link-Table?
The original table does grow continuesly, so only new entries should be added.
Do I have to use a Cursor, or is there a better way?
SOLUTION:
MERGE Link_Table AS LT
USING (SELECT DISTINCT T1.ID AS T1ID, T2.ID AS T2ID FROM Original AS O
INNER JOIN Table_1 AS T1 ON T1.Column_1 = O.Column_1
INNER JOIN Table_2 AS T2 ON T2.Column_3 = O.Column_3) AS U
ON LT.fk1 = U.T1ID
WHEN NOT MATCHED THEN
INSERT (fk1, fk2)
VALUES (U.T1ID, U.T2ID);
You can JOIN all 3 tables to get proper data for link table:
--INSERT INTO [Link-Table]
SELECT t1.ID,
t2.ID
FROM Original o
INNER JOIN Table_1 t1
ON t1.Column_1 = o.Column_1
INNER JOIN Table_2 t2
ON t2.Column_3 = o.Column_3
If your original table will grow, then you need to use MERGE to update/insert new data.
You have to inner join your Original,Table_1 and Table_2 to get the desired result.
Try like this, Its similar to gofr1 post.
DECLARE #orginal TABLE (
ID INT
,Column_1 VARCHAR(10)
,Column_2 INT
,Column_3 VARCHAR(10)
)
DECLARE #Table_1 TABLE (
ID INT
,Column_1 VARCHAR(10)
,Column_2 INT
)
DECLARE #Table_2 TABLE (
ID INT
,Column_3 VARCHAR(10)
)
Insert into #orginal values
(1,'A',123,'A1')
,(2,'A',123,'A2')
,(3,'B',234,'A2')
,(4,'C',456,'A1')
Insert into #Table_1 values
(1,'A',123)
,(2,'B',234)
,(3,'C',456)
Insert into #Table_2 values
(1,'A1')
,(2,'A2')
SELECT O.ID
,T1.ID
,T2.ID
FROM #orginal O
INNER JOIN #Table_1 T1 ON T1.Column_1 = O.Column_1
INNER JOIN #Table_2 T2 ON T2.Column_3 = O.Column_3

Resources