Merge data between two tables using split compare - sql-server

I have two tables, and I need to compare their data and update the records in one of them. Please let me know how this can be done. This is the scenario:
Proj1 Table
This is the first table where data needs to be synchronized
ID Text reqId
1 R1|R2 12
2 R2|R3 12
3 R3|R5|R2 12
Proj2 Table
This is the table where data updates are taking place
ID Text Active reqId
3 R1 1 12
4 R3 1 12
5 R4 1 12
I need to take each record from Proj1, split its Text value on '|', and compare each resulting value with the Text field in Proj2. The goal is to sync Proj2 so that it matches Proj1; the result should look like this:
ID Text Active reqId
3 R1 1 12 (Ignore as it exists in both tables)
4 R3 1 12 (Ignore as it exists in both tables)
5 R4 0 12 (Update to inactive as it does not exist in Proj1 table but exists in Proj2)
6 R2 1 12 (Insert as it does not exist in Proj2 table, insert only once)
7 R5 1 12 (Insert as it does not exist in Proj2 table, insert only once)

If you are using SQL Server 2008 or later, you could use a MERGE statement, like this:
/*
CREATE TABLE Proj1 (
ID INT PRIMARY KEY,
Text VARCHAR(100) NOT NULL,
reqId INT NOT NULL
)
INSERT INTO Proj1 VALUES (1,'R1|R2',12)
INSERT INTO Proj1 VALUES (2,'R2|R3',12)
INSERT INTO Proj1 VALUES (3,'R3|R5|R2',12)
CREATE TABLE Proj2 (
ID INT IDENTITY PRIMARY KEY,
Text VARCHAR(100) NOT NULL,
Active BIT NOT NULL,
reqId INT NOT NULL
)
SET IDENTITY_INSERT Proj2 ON
INSERT INTO Proj2 (ID, Text, Active, reqId) VALUES (3,'R1',1,12)
INSERT INTO Proj2 (ID, Text, Active, reqId) VALUES (4,'R3',1,12)
INSERT INTO Proj2 (ID, Text, Active, reqId) VALUES (5,'R4',1,12)
SET IDENTITY_INSERT Proj2 OFF
*/
GO
CREATE FUNCTION dbo.Split(@String VARCHAR(1000),@Separator CHAR(1))
RETURNS @Result TABLE (
Position INT IDENTITY PRIMARY KEY,
Value VARCHAR(1000)
) AS
BEGIN
DECLARE @Pos INT, @Prev INT
SET @Prev=0
-- Emit one row for each piece that ends at a separator
WHILE 1=1 BEGIN
SET @Pos=CHARINDEX(@Separator,@String,@Prev+1)
IF @Pos=0 BREAK
INSERT INTO @Result (Value) VALUES (SUBSTRING(@String,@Prev+1,@Pos-@Prev-1))
SET @Prev=@Pos
END
-- Emit the remainder after the last separator
INSERT INTO @Result (Value) VALUES (SUBSTRING(@String,@Prev+1,LEN(@String)))
RETURN
END
GO
MERGE INTO dbo.Proj2 p2
USING (
SELECT DISTINCT reqId, Value FROM dbo.Proj1 p1
CROSS APPLY dbo.Split(Text,'|') s
) x
ON p2.Text=x.Value AND p2.reqId=x.reqId
WHEN NOT MATCHED THEN INSERT VALUES (Value,1,reqid)
WHEN NOT MATCHED BY SOURCE THEN UPDATE SET Active=0
WHEN MATCHED AND Active=0 THEN UPDATE SET Active=1;
SELECT * FROM dbo.Proj2
Later edit: I added the third WHEN clause in the MERGE statement, to handle the case when the row is already present, but without the Active flag (although this case does not appear in the sample data).
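On SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function could stand in for the custom dbo.Split function. The following is only a sketch of that substitution and is not part of the original answer. The temporary tables #Proj1 and #Proj2 are stand-ins introduced here so that the example runs on its own.

```sql
-- Sketch only: #Proj1 and #Proj2 are temporary stand-ins for the real tables.
CREATE TABLE #Proj1 (ID INT PRIMARY KEY, Text VARCHAR(100) NOT NULL, reqId INT NOT NULL);
CREATE TABLE #Proj2 (ID INT IDENTITY(3,1) PRIMARY KEY, Text VARCHAR(100) NOT NULL,
                     Active BIT NOT NULL, reqId INT NOT NULL);

INSERT INTO #Proj1 (ID, Text, reqId)
VALUES (1,'R1|R2',12),(2,'R2|R3',12),(3,'R3|R5|R2',12);
INSERT INTO #Proj2 (Text, Active, reqId)
VALUES ('R1',1,12),('R3',1,12),('R4',1,12);

MERGE INTO #Proj2 AS p2
USING (
    -- STRING_SPLIT returns one row per piece in a column named "value"
    SELECT DISTINCT p1.reqId, s.value AS Value
    FROM #Proj1 AS p1
    CROSS APPLY STRING_SPLIT(p1.Text, '|') AS s
) AS x
ON p2.Text = x.Value AND p2.reqId = x.reqId
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Text, Active, reqId) VALUES (x.Value, 1, x.reqId)
WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET Active = 0
WHEN MATCHED AND p2.Active = 0 THEN
    UPDATE SET Active = 1;

SELECT ID, Text, Active, reqId FROM #Proj2 ORDER BY ID;

DROP TABLE #Proj1;
DROP TABLE #Proj2;
```

Unlike dbo.Split, STRING_SPLIT does not return a position column in this form, which does not matter here because only the distinct values are compared.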

You can also handle this without a MERGE statement, like this:
INSERT INTO Proj2
SELECT Value,1,reqid
FROM (
SELECT DISTINCT reqId, Value FROM dbo.Proj1 p1
CROSS APPLY dbo.Split(Text,'|') s
) x
WHERE NOT EXISTS (
SELECT *
FROM Proj2 p2
WHERE p2.Text=x.Value AND p2.reqId=x.reqId
)
UPDATE Proj2 SET Active=0
FROM Proj2 p2
WHERE NOT EXISTS (
SELECT *
FROM (
SELECT DISTINCT reqId, Value FROM dbo.Proj1 p1
CROSS APPLY dbo.Split(Text,'|') s
) x
WHERE p2.Text=x.Value AND p2.reqId=x.reqId
)
UPDATE Proj2 SET Active=1
FROM Proj2 p2
INNER JOIN (
SELECT DISTINCT reqId, Value FROM dbo.Proj1 p1
CROSS APPLY dbo.Split(Text,'|') s
) x ON p2.Text=x.Value AND p2.reqId=x.reqId
WHERE p2.Active=0
(I used the Split function mentioned in the other answer)

Related

MS SQL join/insert with non unique matching rows via script?

I need to prove the existence (and the number) of the values from table1 in an MS SQL DB.
The table1 for proving has the following values:
MANDT DOKNR LFDNR
1 0020999956
1 0020999958
1 0020999960 2
1 0020999960 3
1 0020999960
1 0020999962
As you can see, there are single rows, and then there are special cases where a value is repeated with a running number (meaning it exists three times in the source), so the 2nd, 3rd, and further entries get an increasing number in LFDNR.
The target table2 (which I need to check for the number and existence of values) has two columns with matching data:
DataID Facet
42101976 0020999956
42100240 0020999958
65688960 0020999960
65694287 0020999960
65697507 0020999960
42113401 0020999962
I would like to insert the DataID from the second table into the first table as 'proof', so I can see whether anything is missing from table2 while keeping table1 as the record.
I tried to use joins, and then I thought about a do-while script that walks through all the rows, but I don't know how to write such a script.
Edit:
Output should be then:
MANDT DOKNR LFDNR DataID
1 0020999956 42101976
1 0020999958 42100240
1 0020999960 2 65688960
1 0020999960 3 65694287
1 0020999960 65697507
1 0020999962 42113401
But it could be, for example, that a row in table 2 is missing, so a DataID would be empty then (and show that one is missing).
Any help appreciated!
You can use ROW_NUMBER to calculate [LFDNR] for each row in the second table, and then update the first table. If the [DataID] is null after the update, we have a mismatch.
DECLARE @table1 TABLE
(
[MANDT] INT
,[DOKNR] VARCHAR(32)
,[LFDNR] INT
,[DataID] INT
);
DECLARE @table2 TABLE
(
[DataID] INT
,[Facet] VARCHAR(32)
);
INSERT INTO @table1 ([MANDT], [DOKNR], [LFDNR])
VALUES (1, '0020999956', NULL)
,(1, '0020999958', NULL)
,(1, '0020999960', 2)
,(1, '0020999960', 3)
,(1, '0020999960', NULL)
,(1, '0020999962',NULL);
INSERT INTO @table2 ([DataID], [Facet])
VALUES (42101976, '0020999956')
,(42100240, '0020999958')
,(65688960, '0020999960')
,(65694287, '0020999960')
,(65697507, '0020999960')
,(42113401, '0020999962');
-- Number the rows of table2 within each Facet; a NULL LFDNR in table1 counts as 1
WITH DataSource ([DataID], [DOKNR], [LFDNR]) AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY [Facet] ORDER BY [DataID])
FROM @table2
)
UPDATE T
SET [DataID] = DS.[DataID]
FROM @table1 T
INNER JOIN DataSource DS
ON T.[DOKNR] = DS.[DOKNR]
AND ISNULL(T.[LFDNR], 1) = DS.[LFDNR];
SELECT *
FROM @table1;
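To illustrate the mismatch check described in this answer, here is a small sketch with deliberately incomplete sample data: table1 expects three rows for one DOKNR, but table2 supplies only two. The sample values and the column alias RowNo are assumptions made for this example.

```sql
DECLARE @table1 TABLE ([MANDT] INT, [DOKNR] VARCHAR(32), [LFDNR] INT, [DataID] INT);
DECLARE @table2 TABLE ([DataID] INT, [Facet] VARCHAR(32));

INSERT INTO @table1 ([MANDT], [DOKNR], [LFDNR])
VALUES (1, '0020999960', NULL), (1, '0020999960', 2), (1, '0020999960', 3);

-- Only two candidates exist for the three expected rows.
INSERT INTO @table2 ([DataID], [Facet])
VALUES (65688960, '0020999960'), (65694287, '0020999960');

WITH DataSource AS
(
    SELECT [DataID], [Facet],
           ROW_NUMBER() OVER (PARTITION BY [Facet] ORDER BY [DataID]) AS [RowNo]
    FROM @table2
)
UPDATE T
SET [DataID] = DS.[DataID]
FROM @table1 AS T
INNER JOIN DataSource AS DS
    ON T.[DOKNR] = DS.[Facet]
   AND ISNULL(T.[LFDNR], 1) = DS.[RowNo];

-- Rows that found no partner in table2 keep a NULL DataID.
SELECT [MANDT], [DOKNR], [LFDNR]
FROM @table1
WHERE [DataID] IS NULL;
```

With this data, the final query returns the row with LFDNR = 3, because table2 holds no third entry for that DOKNR.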

How to get desired result in SQL Server

In my application there is a table to store text and another table to store its respective images.
My table structure goes as follows (tbl_article):
article_id | Page_ID | article_Content
-----------+---------+-----------------
1 | 1 | hello world
2 | 1 | hello world 2
where article_id is the pk and auto incremented.
Now in my other table (tbl_img):
image_id| image_location|article_id | page_id
--------+---------------+-----------+---------
1 | imgae locat | 1 | 1
2 | image loc2 | 2 | 1
where image_id is the pk and auto incremented.
I insert data into both tables through table-valued parameters, and in the second table article_id references article_id of the first table.
To get the auto-incremented column value I am using the OUTPUT clause:
DECLARE @TableOfIdentities TABLE
(
IdentValue INT,
PageId INT
)
INSERT INTO tbl_article(page_id, article_content)
OUTPUT Inserted.article_id, @pageId INTO @TableOfIdentities (IdentValue, PageId)
SELECT page_id, slogan_body_header
FROM @dtPageSlogan
INSERT INTO tbl_img(page_id, image_location)
SELECT page_id, image_location
FROM @dtPageImageContent
But now I have to insert values from @TableOfIdentities into article_id of tbl_img - how to do that?
You need an additional column: a temporary article id, generated in your code, that links each image to its article. You can then use MERGE with OUTPUT, because the OUTPUT clause of a MERGE can refer to columns from both the target and the source. That lets you populate your TableOfIdentities table properly and then join it with dtPageImageContent to insert into tbl_img.
CREATE TABLE tbl_article (
article_id INT IDENTITY(1, 1) PRIMARY KEY
, Page_ID INT
, article_Content NVARCHAR(MAX)
);
CREATE TABLE tbl_img (
image_id INT IDENTITY(1, 1) PRIMARY KEY
, image_location VARCHAR(256)
, article_id INT
, Page_ID INT
);
DECLARE @TableOfIdentities TABLE
(
IdentValue INT,
PageId INT,
tmp_article_id INT
);
DECLARE @dtPageSlogan TABLE(
tmp_article_id INT -- generated in your code
, page_id INT
, slogan_body_header NVARCHAR(MAX)
);
DECLARE @dtPageImageContent TABLE (
page_id INT
, image_location VARCHAR(256)
, tmp_article_id INT -- needed to link each image to its article
);
-- create sample data
INSERT INTO @dtPageSlogan(tmp_article_id, page_id, slogan_body_header)
VALUES (10, 1, 'hello world');
INSERT INTO @dtPageSlogan(tmp_article_id, page_id, slogan_body_header)
VALUES (20, 1, 'hello world 2');
INSERT INTO @dtPageImageContent(page_id, image_location, tmp_article_id)
VALUES (1, 'image loc1', 10);
INSERT INTO @dtPageImageContent(page_id, image_location, tmp_article_id)
VALUES (1, 'image loc2', 20);
-- use merge to insert into tbl_article and populate @TableOfIdentities
-- (ON 1 = 2 never matches, so every source row is inserted)
MERGE INTO tbl_article
USING (
SELECT ps.page_id, ps.slogan_body_header, ps.tmp_article_id
FROM @dtPageSlogan as ps
) AS D
ON 1 = 2
WHEN NOT MATCHED THEN
INSERT(page_id, article_content) VALUES (D.page_id, D.slogan_body_header)
OUTPUT Inserted.article_id, Inserted.page_id, D.tmp_article_id
INTO @TableOfIdentities (IdentValue, PageId, tmp_article_id)
;
-- join using page_id and tmp_article_id fields
INSERT INTO tbl_img(page_id, image_location, article_id)
-- select the "IdentValue" from your table of identities
SELECT pic.page_id, pic.image_location, toi.IdentValue
FROM @dtPageImageContent pic
-- join the "table of identities" on "page_id" and "tmp_article_id"
INNER JOIN @TableOfIdentities toi
ON pic.page_Id = toi.PageId AND pic.tmp_article_id = toi.tmp_article_id
;
You need to join the @dtPageImageContent table variable with the @TableOfIdentities table variable on their common page id to get those values:
-- add the third column "article_id" to your list of insert columns
INSERT INTO tbl_img(page_id, image_location, article_id)
-- select the "IdentValue" from your table of identities
SELECT pic.page_id, pic.image_location, toi.IdentValue
FROM @dtPageImageContent pic
-- join the "table of identities" on the page id (the column is named PageId there)
INNER JOIN @TableOfIdentities toi ON pic.page_Id = toi.PageId

How to Insert a new record in the middle of any table

I want to insert a row into a SQL Server table at a specific position. For example, my table has 100 rows and a field named LineNumber, and I want to insert a new row after line number 9. But the ID column, which is the PK for the table, already has a row with LineNumber 9. So now I need a new row with line number 9 or 10, and the ID field has to be updated automatically. How can I insert a row at this position so that all the rows after it shift to the next position?
Don't modify the primary key; that is not a good way to change the order of your output when you have a new record to insert.
Add a new column on to the table to hold your order. Then you can copy the primary key values in to that column if that's your current order before making the required changes for the new row.
Here is a sample that you should be able to copy, paste, and run as is.
I've added an orderid column; you will need to add it to your table too, defaulting to NULL.
DECLARE @OrderTable AS TABLE
(
id INT ,
val VARCHAR(5) ,
orderid INT
)
INSERT INTO @OrderTable
( id, val, orderid )
VALUES ( 1, 'aaa', NULL )
,
( 2, 'bbb', NULL )
,
( 3, 'ddd', NULL )
SELECT *
FROM @OrderTable
-- Produces:
/*
id val orderid
1 aaa NULL
2 bbb NULL
3 ddd NULL
*/
-- Update the `orderid` column to your existing order:
UPDATE @OrderTable
SET orderid = id
SELECT *
FROM @OrderTable
-- Produces:
/*
id val orderid
1 aaa 1
2 bbb 2
3 ddd 3
*/
-- Then you want to add a new item to change the order:
DECLARE @newVal AS NVARCHAR(5) = 'ccc'
DECLARE @newValOrder AS INT = 3
-- Update the table to prepare for the new row:
UPDATE @OrderTable
SET orderid = orderid + 1
WHERE orderid >= @newValOrder
-- this inserts ID = 4, which is what your primary key would do by default
-- this is just an example with hard coded value
INSERT INTO @OrderTable
( id, val, orderid )
VALUES ( 4, @newVal, @newValOrder )
-- Select the data, using the new order column:
SELECT *
FROM @OrderTable
ORDER BY orderid
-- Produces:
/*
id val orderid
1 aaa 1
2 bbb 2
4 ccc 3
3 ddd 4
*/
What makes this difficult is that the column is a primary key. If you can interact with the database when no one else is, then you can do this:
1. Make the column no longer a primary key.
2. Run a command like this:
UPDATE MyTable
SET PrimaryColumnID = PrimaryColumnID + 1
WHERE PrimaryColumnID > 8
3. Insert the row with the appropriate PrimaryColumnID (9).
4. Restore the column to being the primary key.
Obviously, this probably wouldn't be good with a large table. You could create a new primary key column and switch it, then fix the values, then switch it back.
Two steps, first update LineNumber
UPDATE
table
SET
LineNumber = LineNumber + 1
WHERE
LineNumber>9
Then do your insert
INSERT INTO table
(LineNumber, ...) VALUES (10, .....)
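The two steps above leave the table in a half-shifted state if the insert fails after the update has run. One way to guard against that is to wrap both statements in a single transaction. The following is a sketch only: the temporary table #Lines, its columns, and the sample values are assumptions for illustration, and THROW requires SQL Server 2012 or later.

```sql
-- Hypothetical stand-in table: ID is the identity PK, LineNumber drives the order.
CREATE TABLE #Lines (
    ID INT IDENTITY(1,1) PRIMARY KEY,
    LineNumber INT NOT NULL,
    Val VARCHAR(10) NOT NULL
);
INSERT INTO #Lines (LineNumber, Val) VALUES (8, 'h'), (9, 'i'), (10, 'j');

DECLARE @After INT = 9;  -- insert the new row after this line number

BEGIN TRY
    BEGIN TRANSACTION;

    -- Step 1: shift every later row down by one position.
    UPDATE #Lines
    SET LineNumber = LineNumber + 1
    WHERE LineNumber > @After;

    -- Step 2: insert the new row into the gap; ID is assigned automatically.
    INSERT INTO #Lines (LineNumber, Val)
    VALUES (@After + 1, 'new');

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo the shift if anything failed, then re-raise the error.
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;

SELECT ID, LineNumber, Val FROM #Lines ORDER BY LineNumber;

DROP TABLE #Lines;
```

The primary key is never touched here; only LineNumber changes, and the output order comes from the ORDER BY.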

loop through values and update after each one completes

I have the following code that I need to run for 350 locations. It takes an hour to do 5 locations, so I run 5 at a time by using where location_code in ('0001', '0002', '0003', '0005', '0006'). I would like to create a temp table with two columns, location_id and completed, and loop through each value in the location_id column, updating the completed column with a date and time stamp when that location is done and committing after each one. That way I can just let it run, and if I need to kill it I can see the last completed location_id and know where to restart the process. Better yet, it could check for a value in the completed column and, if one exists, move on to the next location.
--Collecting all records containing remnant cost. You will need to specify the location number(s). In the example below we're using location 0035
select sku_id, ib.location_id, price_status_id, inventory_status_id, sum(transaction_units) as units, sum(transaction_cost) as cost,
sum(transaction_valuation_retail) as val_retail, sum(transaction_selling_retail) as sell_retail
into #remnant_cost
from ib_inventory ib
inner join location l on l.location_id = ib.location_id
where location_code in ('0001', '0002', '0003', '0005', '0006')
group by sku_id, ib.location_id, price_status_id, inventory_status_id
having sum(transaction_units) = 0
and sum(transaction_cost) <> 0
--Verify the total remnant cost.
select location_id, sum(units) as units, sum(cost) as cost, sum(val_retail) as val_retail, sum(sell_retail) as sell_retail
from #remnant_cost
group by location_id
select *
from #remnant_cost
----------------------------------------------------Run above this line first and gather results--------------------------------
--inserting into a temp table the cost negation using transaction_type_code 500 (Actual shrink) before inserting into ib_inventory
--corrected query adding transaction date as column heading (Marshall)
select
sku_id, location_id, price_status_id, convert(smalldatetime,convert(varchar(50),getdate(),101)) as transaction_date, 500 as transaction_type_code, inventory_status_id, NULL as other_location_id,
NULL as transaction_reason_id, 999999 as document_number, 0 as transaction_units, cost * -1 as transaction_cost, 0 as transaction_valuation_retail,
0 as transaction_selling_retail,NULL as price_change_type, NULL as units_affected
into #rem_fix
from #remnant_cost
--Validating to make sure cost will have the exact opposite to negate.
select location_id, sum(transaction_units) as units, sum(transaction_cost) as cost, sum(transaction_valuation_retail) as val_retail,
sum(transaction_selling_retail) as sell_retail
from #rem_fix
group by location_id
BEGIN TRAN
EXEC inventory_update_$sp 'SELECT sku_id,location_id,price_status_id,transaction_date,transaction_type_code,inventory_status_id,other_location_id,
transaction_reason_id,document_number,transaction_units,transaction_cost,transaction_valuation_retail,transaction_selling_retail,price_change_type,
units_affected FROM #rem_fix'
COMMIT
Making some assumptions about your schema:
-- A working table to track progress that will stick around.
create table dbo.Location_Codes
( Location_Code VarChar(4), Started DateTime NULL, Completed DateTime NULL );
Then break up the work this way:
if not exists ( select 42 from dbo.Location_Codes where Completed is NULL )
begin
-- All of the locations have been processed (or this is the first time through).
delete from dbo.Location_Codes;
-- Get all of the location codes.
insert into dbo.Location_Codes
select Location_Code, NULL, NULL
from Location;
end
-- Temporary table to make things easier.
declare @Pass_Location_Codes as Table ( Location_Code VarChar(4) );
-- Loop until all locations have been processed.
while exists ( select 42 from dbo.Location_Codes where Completed is NULL )
begin
-- Get the next five locations for which processing has not completed.
delete from @Pass_Location_Codes;
insert into @Pass_Location_Codes
select top 5 Location_Code
from dbo.Location_Codes
where Completed is NULL
order by Location_Code;
-- Record the start date/time.
update dbo.Location_Codes
set Started = GetDate()
where Location_Code in ( select Location_Code from @Pass_Location_Codes );
-- Run the big query.
select ...
where Location_Code in ( select Location_Code from @Pass_Location_Codes )
...
-- Record the completion date/time.
update dbo.Location_Codes
set Completed = GetDate()
where Location_Code in ( select Location_Code from @Pass_Location_Codes );
end
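The question also asks for one location at a time with a commit after each, so that an interrupted run can resume where it stopped. A minimal sketch of that variant follows. The progress table #Location_Progress and the sample codes are assumptions, and the real per-location work is left as a placeholder comment.

```sql
-- Hypothetical progress table; a permanent table would survive a dropped session.
CREATE TABLE #Location_Progress (
    Location_Code VARCHAR(4) PRIMARY KEY,
    Completed DATETIME NULL
);
INSERT INTO #Location_Progress (Location_Code) VALUES ('0001'), ('0002'), ('0003');

DECLARE @Code VARCHAR(4);

WHILE 1 = 1
BEGIN
    -- Reset first: a SELECT that finds no row leaves the variable unchanged.
    SET @Code = NULL;

    SELECT TOP (1) @Code = Location_Code
    FROM #Location_Progress
    WHERE Completed IS NULL
    ORDER BY Location_Code;

    IF @Code IS NULL BREAK;  -- nothing left to process

    BEGIN TRANSACTION;

    -- Placeholder: run the remnant-cost statements here, filtered on @Code.

    -- Stamp the location in the same transaction as its work.
    UPDATE #Location_Progress
    SET Completed = GETDATE()
    WHERE Location_Code = @Code;

    COMMIT TRANSACTION;
END;

SELECT Location_Code, Completed FROM #Location_Progress ORDER BY Location_Code;

DROP TABLE #Location_Progress;
```

Because the completion stamp commits together with the work for that location, a killed run leaves already finished locations marked and the loop skips them on restart.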

Not allowing column values other than what is found in other table

I have 2 tables
Table A
Column A1 Column A2 and
Table B
Column B1 Column B2
Column A1 is not unique and not the PK, but I want to put a constraint on column B1 that it cannot have values other than what is found in Column A1, can it be done?
It cannot be done using FK. Instead you can use a check constraint to see if B value is available in A.
Example:
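create function dbo.fn_ValidateBValue(@B1 int)
returns bit as
begin
declare @ValueExists bit
select @ValueExists = 0
if exists (select 1 from TableA where A1 = @B1)
select @ValueExists = 1
return @ValueExists
end
go
-- The function must exist before the constraint that calls it is created.
-- The check only runs when TableB rows are inserted or updated; it does not stop rows being removed from TableA.
alter table TableB add constraint CK_BValueCheck check (dbo.fn_ValidateBValue(B1) = 1)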
You cannot have a dynamic constraint to limit the values in TableB. Instead, you can either put a trigger on TableB or restrict all inserts and updates on TableB to values selected from ColumnA of TableA:
Insert into TableB
Select Col from Table where Col in(Select ColumnA from TableA)
or
Update TableB
Set ColumnB= <somevalue>
where <somevalue> in(Select columnA from TableA)
Also, I would add that this is poor design practice and cannot guarantee accuracy all the time.
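The trigger option mentioned in this answer could look roughly like the sketch below. The table layouts and the trigger name TR_TableB_CheckB1 are assumptions for illustration; NULL values in B1 are allowed through, as a foreign key would allow them.

```sql
-- Assumed minimal schemas; adjust to the real tables.
CREATE TABLE TableA (A1 INT NOT NULL, A2 INT NULL);
CREATE TABLE TableB (B1 INT NULL, B2 INT NULL);
GO
CREATE TRIGGER TR_TableB_CheckB1 ON TableB
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the statement if any new B1 value has no match in TableA.A1.
    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE i.B1 IS NOT NULL
          AND NOT EXISTS (SELECT 1 FROM TableA AS a WHERE a.A1 = i.B1)
    )
    BEGIN
        RAISERROR('TableB.B1 must match an existing TableA.A1 value.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
GO
INSERT INTO TableA (A1, A2) VALUES (1, 10), (1, 20);
INSERT INTO TableB (B1, B2) VALUES (1, 100);     -- accepted: 1 exists in TableA.A1
-- INSERT INTO TableB (B1, B2) VALUES (7, 100);  -- would be rejected by the trigger
```

This only guards changes to TableB. Deleting or updating rows in TableA can still orphan values in TableB unless a second trigger on TableA covers that side.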
Long way around but you could add an identity to A and declare the PK as iden, A1.
In B iden would just be an integer (not identity).
You asked for any other ways.
Could create a 3rd table that is a FK used by both but that does not assure B1 is in A.
Here's the design I'd go with, if I'm free to create tables and triggers in the database, and still want TableA to allow multiple A1 values. I'd introduce a new table:
create table TableA (ID int not null,A1 int not null)
go
create table UniqueAs (
A1 int not null primary key,
Cnt int not null
)
go
create trigger T_TableA_MaintainAs
on TableA
after insert, update, delete
as
set nocount on
;With UniqueCounts as (
select A1,COUNT(*) as Cnt from inserted group by A1
union all
select A1,COUNT(*) * -1 from deleted group by A1
), CombinedCounts as (
select A1,SUM(Cnt) as Cnt from UniqueCounts group by A1
)
merge into UniqueAs a
using CombinedCounts cc
on
a.A1 = cc.A1
when matched and a.Cnt = -cc.Cnt then delete
when matched then update set Cnt = a.Cnt + cc.Cnt
when not matched then insert (A1,Cnt) values (cc.A1,cc.Cnt);
And test it out:
insert into TableA (ID,A1) values (1,1),(2,1),(3,2)
go
update TableA set A1 = 2 where ID = 1
go
delete from TableA where ID = 2
go
select * from UniqueAs
Result:
A1 Cnt
----------- -----------
2 2
Now we can use a genuine foreign key from TableB to UniqueAs. This should all be relatively efficient - the usual FK mechanisms are available between TableB and UniqueAs, and the maintenance of this table is always by PK reference - and we don't have to needlessly rescan all of TableA - we just use the trigger pseudo-tables.
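For completeness, the foreign key itself might be declared as in the sketch below. It relies on the UniqueAs table from this answer; the TableB layout and the constraint name FK_TableB_UniqueAs are assumptions.

```sql
-- Assumes UniqueAs already exists as defined above and TableB does not exist yet.
create table TableB (
    B1 int not null,
    B2 int null,
    -- B1 may only hold values currently present in TableA.A1 (tracked in UniqueAs)
    constraint FK_TableB_UniqueAs foreign key (B1) references UniqueAs (A1)
)
```

With this key in place, removing the last TableA row for a value that TableB still references should fail: the trigger's delete from UniqueAs would violate the foreign key, and the triggering statement would be rolled back.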

Resources