How to perform a dynamic update through a MERGE statement in Snowflake?

I am performing an update from JSON data using a MERGE statement. Each record contains the primary key and only the column that was updated in the source system. Because the other columns are absent from the record, the update performed through MERGE sets them to NULL in the target. Is there a way to build the update statement dynamically for every row and execute it through MERGE?
create or replace table source_data as
select parse_json(COLUMN1)::variant datacol
from values
('{
"metadata":{"OperationName":"UPDATE"},
"data":{"id":"1234","status":"Active"}
}'),
('{
"metadata":{"OperationName":"UPDATE"},
"data":{"id":"1235","name":"Johny"}
}');
create or replace table employee_destination as
select column1::text as id,
column2::text as name,
column3::text as status
from values
('1234','John','Inactive'),
('1235','Jack','Active');
MERGE into employee_destination as Target
using (
    select datacol:data:id as id,
           datacol:data:status as status,
           datacol:data:name as name,
           datacol:metadata:OperationName as operation_name
    from SOURCE_DATA
) AS Source
ON Target.id = Source.id
when matched AND Source.operation_name = 'UPDATE'
THEN
update set Target.id = Source.id, Target.name = Source.name, Target.status = Source.status;
Current output:
1234 null Active
1235 Johny null
Expected output:
1234 John Active
1235 Johny Inactive
thanks.

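Try the following. COALESCE keeps the existing target value whenever a key is missing from the JSON record, because the missing path evaluates to NULL:
MERGE into employee_destination as Target using (select datacol:data:id as id,datacol:data:status as status,datacol:data:name name,datacol:metadata:OperationName as operation_name from SOURCE_DATA)
AS Source
ON Target.id = Source.id
when matched AND Source.operation_name = 'UPDATE'
THEN
update set
-- Target.id = Source.id is dropped: there is no need to update the primary key
Target.name = COALESCE(Source.name,Target.name),
Target.status = COALESCE(Source.status,Target.status);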

Related

Snowflake does not implement the full SQL MERGE statement?

I am trying to create a Snowflake task that executes a MERGE statement.
However, it seems that Snowflake does not recognize the “when not matched by target” or “when not matched by source” clauses.
create or replace task MERGE_TEAM_TOUCHPOINT
warehouse = COMPUTE_WH
schedule = '1 minute'
when system$stream_has_data('TEAMTOUCHPOINT_CDC')
as
merge into dv.Team_Touchpoint as f using TeamTouchpoint_CDC as s
on s.uniqueid = f.uniqueid
when matched then
update set TEAMUNIQUEID = s.TEAMUNIQUEID,
TOUCHPOINTUNIQUEID = s.TOUCHPOINTUNIQUEID
when not matched by target then
insert (
ID,
UniqueID,
TEAMUNIQUEID,
TOUCHPOINTUNIQUEID
)
values (
s.ID,
s.UniqueID,
s.TEAMUNIQUEID,
s.TOUCHPOINTUNIQUEID
)
when not matched by source then delete;
How can I do this? Is there really no other way than creating a stored procedure in JavaScript that first truncates the table and then inserts everything from the staging table?
A workaround suggested by a teammate:
Define MATCHED_BY_SOURCE based on a full join, then check whether a.COL or b.COL is null:
merge into TARGET t
using (
select <COLUMN_LIST>,
iff(a.COL is null, 'NOT_MATCHED_BY_SOURCE', 'MATCHED_BY_SOURCE') SOURCE_MATCH,
iff(b.COL is null, 'NOT_MATCHED_BY_TARGET', 'MATCHED_BY_TARGET') TARGET_MATCH
from SOURCE a
full join TARGET b
on a.COL = b.COL
) s
on s.COL = t.COL
when matched and s.SOURCE_MATCH = 'NOT_MATCHED_BY_SOURCE' then
<DO_SOMETHING>
when matched and s.TARGET_MATCH = 'NOT_MATCHED_BY_TARGET' then
<DO_SOMETHING_ELSE>
;
(same as in https://stackoverflow.com/a/69095225/132438)
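The template above leaves the column list and actions open. A minimal worked sketch of the same full-join idea follows, assuming two hypothetical tables my_target and my_source keyed on id; the table names, column names, and the missing_in_source flag are illustrative, not part of the original answer. Rows that exist only in the target arrive in the MERGE source with a NULL source key, so they can be deleted in a WHEN MATCHED branch, while rows that exist only in the source fall into WHEN NOT MATCHED.

```sql
-- Hypothetical sample tables (names are assumptions for illustration).
create or replace temporary table my_target (id integer, name varchar) as
select column1, column2
from values (1, 'Stay as is'), (2, 'This name has to change'), (3, 'This needs to go');

create or replace temporary table my_source (id integer, name varchar) as
select column1, column2
from values (1, 'Stay as is'), (2, 'This is the new name for id=2'), (4, 'A new row');

merge into my_target t
using (
    select coalesce(a.id, b.id) as id,           -- key from whichever side has the row
           a.name               as name,         -- new value (NULL when the source has no row)
           a.id is null         as missing_in_source
    from my_source a
    full join my_target b
      on a.id = b.id
) s
on s.id = t.id
-- Row exists in the target but not in the source: emulate "not matched by source".
when matched and s.missing_in_source then delete
-- Row exists on both sides: take the source value.
when matched and not s.missing_in_source then update set t.name = s.name
-- Row exists only in the source: the regular "not matched" branch.
when not matched then insert (id, name) values (s.id, s.name);

select * from my_target order by id;
```

With this sample data the final select should return ids 1, 2 and 4, with id 2 carrying the new name. The full join reads the target table a second time, so on large tables the separate DELETE described below may be cheaper.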
Neither BY TARGET nor BY SOURCE is a valid keyword in Snowflake's MERGE command; matching is always evaluated against the target (https://docs.snowflake.com/en/sql-reference/sql/merge.html). To achieve your goal you need to run the DELETE separately from the MERGE. The MERGE can then handle the UPDATE (WHEN MATCHED) and the INSERT (WHEN NOT MATCHED, that is, not matched by target). A DELETE inside MERGE is only possible WHEN MATCHED, so it can never reach target rows that have no counterpart in the source.
You could handle the two steps (1. DELETE; 2. MERGE with UPDATE and INSERT) within a single explicit transaction in a stored procedure, or as two separate transactions via two tasks, where the second task is defined with AFTER so that it runs once the first has finished.
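As a rough sketch of the two-task variant, assuming hypothetical permanent tables my_target and my_source with columns id and name, and reusing the COMPUTE_WH warehouse from the question. The task names are made up for illustration. Because the two tasks run as separate transactions, readers can briefly see the table after the DELETE but before the MERGE.

```sql
-- Root task: remove target rows whose key no longer exists in the source.
create or replace task delete_missing_rows
  warehouse = COMPUTE_WH
  schedule = '1 minute'
as
  delete from my_target t
  where not exists (select 1 from my_source s where s.id = t.id);

-- Child task: AFTER makes it run once the root task has finished.
create or replace task merge_changed_rows
  warehouse = COMPUTE_WH
  after delete_missing_rows
as
  merge into my_target t
  using my_source s
  on t.id = s.id
  when matched and t.name <> s.name then update set t.name = s.name
  when not matched then insert (id, name) values (s.id, s.name);

-- Tasks are created suspended; resume the child before the root.
alter task merge_changed_rows resume;
alter task delete_missing_rows resume;
```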
Alternatively, you can run an INSERT with the optional OVERWRITE parameter, which truncates the target table and then loads it from the source table, all in a single transaction:
https://docs.snowflake.com/en/sql-reference/sql/insert.html#optional-parameters
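A minimal sketch of the INSERT OVERWRITE option, using hypothetical temporary tables my_target and my_source (names and data are assumptions for illustration). It only fits when the source holds the complete desired state of the target, since every existing target row is discarded.

```sql
-- Hypothetical sample tables.
create or replace temporary table my_target (id integer, name varchar) as
select column1, column2
from values (1, 'Stay as is'), (3, 'This needs to go');

create or replace temporary table my_source (id integer, name varchar) as
select column1, column2
from values (1, 'Stay as is'), (4, 'A new row');

-- Truncate my_target and reload it from my_source in one statement.
insert overwrite into my_target (id, name)
select id, name
from my_source;

-- my_target now holds exactly the rows of my_source (ids 1 and 4).
select * from my_target order by id;
```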
Here is a reproducible example of the DELETE + MERGE(UPDATE&INSERT) approach:
USE DEV;
CREATE OR REPLACE TEMPORARY TABLE Public.My_Merge_Target (
Id INTEGER, Name VARCHAR
)
AS
SELECT column1, column2
FROM (VALUES (1, 'Stay as is'), (2, 'This name has to change'), (3, 'This needs to go'));
CREATE OR REPLACE TEMPORARY TABLE Public.My_Merge_Source (
Id INTEGER, Name VARCHAR
)
AS
SELECT column1, column2
FROM (VALUES (1, 'Stay as is'), (2, 'This is the new name for id=2'), (4, 'A new row'));
SELECT * FROM Public.My_Merge_Target ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This name has to change
3 | This needs to go
*/
SELECT * FROM Public.My_Merge_Source ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This is the new name for id=2
4 | A new row
*/
DELETE FROM Public.My_Merge_Target AS trg
USING (
SELECT t.Id FROM Public.My_Merge_Source AS s
RIGHT JOIN Public.My_Merge_Target AS t
ON s.Id = t.Id
WHERE s.Id IS NULL
) AS src
WHERE trg.Id = src.Id;
/*
-----------------------
number of rows deleted
-----------------------
1
-----------------------
*/
SELECT * FROM Public.My_Merge_Target ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This name has to change
*/
MERGE INTO Public.My_Merge_Target AS trg
USING (
    SELECT Id, Name
    FROM Public.My_Merge_Source
) AS src
ON trg.Id = src.Id
WHEN MATCHED AND (src.Name != trg.Name) THEN
    UPDATE SET Name = src.Name
WHEN NOT MATCHED THEN
    INSERT (Id, Name)
    VALUES (src.Id, src.Name)
;
/*
-------------------------------------------------
number of rows inserted | number of rows updated
-------------------------------------------------
1 | 1
-------------------------------------------------
*/
SELECT * FROM Public.My_Merge_Target ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This is the new name for id=2
4 | A new row
*/

I am unable to do insert into a table along with Merge command

I am unable to perform a MERGE with an INSERT clause (for an accounting process).
Table1 contains the list of GRs to write off (based on the date in Table3).
Table2 contains all GR details (all information from 1-Jan-2010 to date).
Table3 contains the oldest claim date (e.g. 1-Apr-2018).
In this scenario, the oldest claim date (e.g. 1-Apr-2018) is picked up from Table3. Table2 is then searched for GRs dated before the extracted date (<= 1-Apr-2018), and the matching records (from 1-Jan-2010 to 31-Mar-2018) are populated into Table1.
Code Tried in SQL
MERGE Table1 As Target
Using (select column1, column2 From Table2 AS tbl2 INNER JOIN Table3 as tbl3
ON tbl2.column1 = tbl3.column1
WHERE
tbl2.column1 = tbl3.column1 AND tbl2.column2 = tbl3.column2) AS SOURCE
ON
(Target.Column1 = Source.Column1 AND Target.Column2 = Source.Column2 AND Target.Column5 <= Source.Column5 )
WHEN MATCHED AND
Target.Column1 = Source.Column1 AND Target.Column2 = Source.Column2
THEN UPDATE SET Target.Column4='Updated'
WHEN NOT MATCHED BY TARGET
THEN INSERT
(Column1, Column2, Column3)
VALUES
(Source.Column1, Source.Column2, Source.Column3)
ERROR
Msg 248, Level 16, State 1, Line 23
The conversion of the nvarchar value '3000143371 ' overflowed an int column.
The statement has been terminated.
One of the columns in your target table has the data type INT, and you are trying to insert a value that is too big for it: 3000143371 exceeds the INT maximum of 2,147,483,647.
Try changing the target column's data type from INT to BIGINT.
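The question does not say which column overflows, so the following T-SQL is only a sketch that assumes the offending column is Column1; substitute the real column. The first query lists source values that fail to convert to INT, and the second widens the target column. TRY_CAST requires SQL Server 2012 or later.

```sql
-- Assumption: Column1 is the column that overflows.
-- List source values that cannot be converted to INT
-- (too large, or not numeric at all).
SELECT Column1
FROM Table2
WHERE Column1 IS NOT NULL
  AND TRY_CAST(Column1 AS int) IS NULL;

-- Widen the target column. State NULL / NOT NULL explicitly if the column
-- must keep its current nullability, and note that the statement fails
-- while the column is part of a key or index.
ALTER TABLE Table1 ALTER COLUMN Column1 bigint;
```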

Incremental load in T-SQL with recorded history

I need to build an incremental load for my dimensions in T-SQL that also stores historical data. I am trying to use the MERGE statement, but it doesn't work for me, because the process deletes data that exists in the target but not in the source table.
Does anyone have a suggestion?
For example, I have this source table, which is my STAGE:
Cod Descript State
AAA Desc1 MI
BBB Desc 2 TX
CCC Desc 3 MA
After the first load, my dimension will be equal to STAGE.
However, a value in the source table can change, for example:
AAA CHANGEDESCRIPTION Mi
So I need to update my dimension like this:
Cod Descript State
AAA Desc1 Mi before
AAA CHANGEDESCRIPTION MI actual
BBB Desc 2 TX actual
CCC Desc 3 MA actual
This is my DW, and I need both the current information and the full history.
Try this. The Aging column is always 0 for the current record and counts how many generations old a record is:
SELECT * INTO tbl_Target FROM (VALUES
('AAA','Desc1','MI',0),('BBB','Desc 2','TX',0),('CCC','Desc 3','MA',0)) as X(Cod, Descript, State, Aging);
GO
SELECT * INTO tbl_Staging FROM (VALUES ('AAA','Desc4','MI')) as X(Cod, Descript, State);
GO
UPDATE t SET Aging += 1
FROM tbl_Target as t
INNER JOIN tbl_Staging as s on t.Cod = s.Cod;
GO
INSERT INTO tbl_Target(Cod, Descript, State, Aging)
SELECT Cod, Descript, State, 0
FROM tbl_Staging;
GO
SELECT * FROM tbl_Target;
Please note that if the staging table contains records that are unchanged, you'll get false changes. If so, you have to filter them out in both queries.
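One possible way to filter unchanged records out of both queries is sketched here. It assumes the tbl_Target and tbl_Staging tables from the example above, at most one staging row per Cod, exactly one Aging = 0 row per existing Cod, and non-NULL Descript and State values (the <> comparison ignores NULLs).

```sql
-- Age all rows of a Cod only when its current row (Aging = 0)
-- differs from the staging row.
UPDATE t SET Aging += 1
FROM tbl_Target AS t
INNER JOIN tbl_Staging AS s ON t.Cod = s.Cod
INNER JOIN tbl_Target AS cur ON cur.Cod = s.Cod AND cur.Aging = 0
WHERE cur.Descript <> s.Descript
   OR cur.State <> s.State;
GO
-- After the update, only unchanged Cods still have an Aging = 0 row,
-- so insert staging rows for changed and brand-new Cods only.
INSERT INTO tbl_Target (Cod, Descript, State, Aging)
SELECT s.Cod, s.Descript, s.State, 0
FROM tbl_Staging AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM tbl_Target AS t
    WHERE t.Cod = s.Cod
      AND t.Aging = 0
);
GO
SELECT * FROM tbl_Target ORDER BY Cod, Aging;
```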
I just commented out the DELETE clause. Please tell me what you think:
MERGE DimTarget AS [Target] -- begin merge statement (merge statements end with a semicolon)
USING TableSource AS [Source]
ON [Target].ID = [Source].ID AND [Target].[IsCurrentRow] = 1
WHEN MATCHED AND -- record exists but values are different
(
[Target].Descript <> [Source].Descript
)
THEN UPDATE SET -- update records (Type 1 means record values are overwritten)
[Target].[IsCurrentRow] = 0
-- , [Target].[ValidTo] = GETDATE()
WHEN NOT MATCHED BY TARGET -- record does not exist
THEN INSERT -- insert record
(
Descript
, [IsCurrentRow]
)
VALUES
(
[Source].Descript
, 1
)
--WHEN NOT MATCHED BY SOURCE -- record exists in target but not source
--THEN DELETE -- delete from target
OUTPUT $action AS Action, [Source].*; -- output results

SQL Merge and output in the same table

I'm merging two tables, and when a cell is updated I want the row's field to be marked as "updated". My code:
MERGE [ITWORKS].[dbo].[Testine2] te
USING [ITWORKS].[dbo].[Testinus] bo
ON te.itemid = bo.itemid
AND te.itemname <> bo.itemname
WHEN MATCHED THEN
UPDATE
SET te.itemname = bo.itemname
OUTPUT
$action
into [ITWORKS].[dbo].[Testine2] (busena);
SELECT * FROM [ITWORKS].[dbo].[Testine2];
Result I get:
Itemid Itemname Busena
100001 TEST NULL
NULL NULL UPDATE
The result I want:
Itemid Itemname Busena
100001 TEST UPDATE
There is no reason to use output. Just set the column value in the update.
MERGE [ITWORKS].[dbo].[Testine2] te
USING [ITWORKS].[dbo].[Testinus] bo
ON te.itemid = bo.itemid
AND te.itemname <> bo.itemname
WHEN MATCHED THEN
UPDATE
SET te.itemname = bo.itemname,
te.Busena = 'UPDATE';
SELECT * FROM [ITWORKS].[dbo].[Testine2];
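For context, OUTPUT ... INTO always inserts one new row per affected row into the named table, which is why the original attempt produced an extra row instead of marking the updated one. If a record of what changed is still wanted, a sketch along these lines captures it in a table variable. The @changes variable and its column types are assumptions, since the question does not show the table definitions.

```sql
-- Hypothetical audit table variable; column types are guesses.
DECLARE @changes TABLE (
    merge_action nvarchar(10),
    itemid       int,
    old_itemname nvarchar(100),
    new_itemname nvarchar(100)
);

MERGE [ITWORKS].[dbo].[Testine2] te
USING [ITWORKS].[dbo].[Testinus] bo
ON te.itemid = bo.itemid
AND te.itemname <> bo.itemname
WHEN MATCHED THEN
UPDATE
SET te.itemname = bo.itemname,
    te.Busena = 'UPDATE'
-- deleted.* holds the values before the update, inserted.* the values after.
OUTPUT $action, inserted.itemid, deleted.itemname, inserted.itemname
INTO @changes;

SELECT * FROM @changes;
```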

sql server trigger to update time field when there is real change in another field

I need a trigger that updates a field of a table row if one or more other fields of that row are updated.
Suppose you have an Employees table that may look as follows:
EmployeeId Name Address ModificationDate
1 Spears 27 Sober Road
2 Jagger 65 Straight Street
If there is a real change in the value of any field except the EmployeeId and ModificationDate fields, the trigger should generate a time value and update the ModificationDate.
Example 1 of a real change:
update dbo.Employees
set Name = 'Beggar'
where EmployeeId = 2
Example 2 of no real change:
update dbo.Employees
set Name = 'Jagger'
where EmployeeId = 2
If an update in Example 2 executes, the trigger should not update the ModificationDate field.
In a trigger, you have access to the 'inserted' and 'deleted' system tables.
Those tables contain the records that were affected by the statement that caused the trigger to execute.
For an UPDATE trigger, the 'inserted' table contains the records in their new state, and the 'deleted' table contains the records with the old values.
You'll have to use those two tables to find out which records have really changed, and update the ModificationDate for those records only.
I think the statement inside the trigger will look something like this (I haven't tested it):
UPDATE t
SET ModificationDate = getdate()
FROM myTable t
INNER JOIN inserted i ON i.EmployeeId = t.EmployeeId -- restrict the update to the rows touched by the statement
INNER JOIN deleted d ON d.EmployeeId = i.EmployeeId
WHERE i.Name <> d.Name OR i.Address <> d.Address
Edit:
I've played around a bit:
create trigger upd_employee on [employee] after update
as
begin
update employee
set modifdate = getdate()
where employee.empid in
( select i.empid
from inserted i
inner join deleted d on i.empid = d.empid
where (i.name <> d.name or i.address <> d.address )
)
end
go
-- CREATE TRIGGER must be alone in its batch; the statements below only test the trigger
insert into employee
values
(1, 'Frederik' , '', null)
insert into employee
values
(2, 'User', '', null)
update employee
set [address] = 'some address'
select * from employee
update employee set [name] = 'test' where empid = 2
select * from employee
update employee set [name] = 'test' where empid = 2
select * from employee
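One caveat with the <> comparisons above: they evaluate to unknown when either side is NULL, so a change from NULL to a value (or back) would not be detected. A possible NULL-safe variant is sketched below, assuming the same employee table with the columns empid, name, address and modifdate. It uses the EXCEPT idiom to compare old and new values, and CREATE OR ALTER, which needs SQL Server 2016 SP1 or later.

```sql
create or alter trigger upd_employee on [employee] after update
as
begin
    set nocount on;

    update e
    set modifdate = getdate()
    from employee e
    inner join inserted i on i.empid = e.empid
    inner join deleted d on d.empid = i.empid
    -- EXCEPT treats two NULLs as equal, so this returns a row only when
    -- name or address really differs between the old and new version.
    where exists (select i.name, i.address
                  except
                  select d.name, d.address);
end
```

Further columns can be covered by adding them to both select lists.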
