Snowflake does not implement the full SQL MERGE statement?

I am trying to create a Snowflake task that executes a MERGE statement.
However, it seems that Snowflake does not recognize the “when not matched by target” or “when not matched by source” statements.
create or replace task MERGE_TEAM_TOUCHPOINT
warehouse = COMPUTE_WH
schedule = '1 minute'
when system$stream_has_data('TEAMTOUCHPOINT_CDC')
as
merge into dv.Team_Touchpoint as f using TeamTouchpoint_CDC as s
on s.uniqueid = f.uniqueid
when matched then
update set TEAMUNIQUEID = s.TEAMUNIQUEID,
TOUCHPOINTUNIQUEID = s.TOUCHPOINTUNIQUEID
when not matched by target then
insert (
ID,
UniqueID,
TEAMUNIQUEID,
TOUCHPOINTUNIQUEID
)
values (
s.ID,
s.UniqueID,
s.TEAMUNIQUEID,
s.TOUCHPOINTUNIQUEID
)
when not matched by source then delete;
How can I do this? Is there really no other way than creating a JavaScript stored procedure that first truncates the table and then inserts everything from the staging table?

A workaround suggested by a teammate:
Define SOURCE_MATCH and TARGET_MATCH flags with a full join, then check whether a.COL or b.COL is null:
merge into TARGET t
using (
select coalesce(a.COL, b.COL) as COL, -- key stays non-null for target-only rows, so the ON clause can match them
<COLUMN_LIST>,
iff(a.COL is null, 'NOT_MATCHED_BY_SOURCE', 'MATCHED_BY_SOURCE') SOURCE_MATCH,
iff(b.COL is null, 'NOT_MATCHED_BY_TARGET', 'MATCHED_BY_TARGET') TARGET_MATCH
from SOURCE a
full join TARGET b
on a.COL = b.COL
) s
on s.COL = t.COL
when matched and s.SOURCE_MATCH = 'NOT_MATCHED_BY_SOURCE' then
<DO_SOMETHING>
when not matched then -- rows flagged NOT_MATCHED_BY_TARGET have no target row, so they can only arrive here
<DO_SOMETHING_ELSE>
;
(same as in https://stackoverflow.com/a/69095225/132438)

Neither 'by target' nor 'by source' is a valid keyword in Snowflake's MERGE command; matching is always evaluated 'by target' (https://docs.snowflake.com/en/sql-reference/sql/merge.html). To achieve your goal, run the DELETE separately from the MERGE. The MERGE then handles the UPDATE (WHEN MATCHED) and the INSERT (WHEN NOT MATCHED, that is, "by target"). The DELETE has to be a separate statement because MERGE can delete only WHEN MATCHED, so it never sees target rows that are missing from the source.
You could handle the two steps (1. DELETE; 2. MERGE for UPDATE and INSERT) within a single explicit transaction in a stored procedure, or in two different transactions via two separate tasks, one of them an AFTER task.
Alternatively, you can run an INSERT with the optional OVERWRITE parameter, which truncates the target table and then reloads it from the source table, all in a single transaction:
https://docs.snowflake.com/en/sql-reference/sql/insert.html#optional-parameters
Here is a reproducible example of the DELETE + MERGE(UPDATE&INSERT) approach:
USE DEV;
CREATE OR REPLACE TEMPORARY TABLE Public.My_Merge_Target (
Id INTEGER, Name VARCHAR
)
AS
SELECT column1, column2
FROM (VALUES (1, 'Stay as is'), (2, 'This name has to change'), (3, 'This needs to go'));
CREATE OR REPLACE TEMPORARY TABLE Public.My_Merge_Source (
Id INTEGER, Name VARCHAR
)
AS
SELECT column1, column2
FROM (VALUES (1, 'Stay as is'), (2, 'This is the new name for id=2'), (4, 'A new row'));
SELECT * FROM Public.My_Merge_Target ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This name has to change
3 | This needs to go
*/
SELECT * FROM Public.My_Merge_Source ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This is the new name for id=2
4 | A new row
*/
DELETE FROM Public.My_Merge_Target AS trg
USING (
SELECT t.Id FROM Public.My_Merge_Source AS s
RIGHT JOIN Public.My_Merge_Target AS t
ON s.Id = t.Id
WHERE s.Id IS NULL
) AS src
WHERE trg.Id = src.Id;
/*
-----------------------
number of rows deleted
-----------------------
1
-----------------------
*/
SELECT * FROM Public.My_Merge_Target ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This name has to change
*/
MERGE
INTO Public.My_Merge_Target AS trg
USING (
SELECT Id, Name
FROM Public.My_Merge_Source
) AS src
ON
trg.Id = src.Id
WHEN
MATCHED
AND (src.Name != trg.Name) THEN UPDATE
SET Name = src.Name
WHEN
NOT MATCHED THEN INSERT (Id, Name)
VALUES (src.Id, src.Name)
;
/*
-------------------------------------------------
number of rows inserted | number of rows updated
-------------------------------------------------
1 | 1
-------------------------------------------------
*/
SELECT * FROM Public.My_Merge_Target ORDER BY Id;
/*
------------------------------------
Id | Name
------------------------------------
1 | Stay as is
2 | This is the new name for id=2
4 | A new row
*/
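The answer also suggests wrapping the two steps in a single explicit transaction. Below is a minimal sketch of that idea, not Snowflake code: it uses Python's built-in sqlite3 module as a stand-in so it can be run anywhere. SQLite has no MERGE, so an UPDATE plus an INSERT substitute for it here; the lowercase table names and the sync_target helper are assumptions made for this illustration.

```python
import sqlite3

def sync_target(conn):
    """Run the delete, update and insert as one transaction (hypothetical helper)."""
    with conn:  # commits if every statement succeeds, rolls back on any exception
        # Step 1: delete target rows that no longer exist in the source.
        conn.execute(
            "DELETE FROM my_merge_target "
            "WHERE id NOT IN (SELECT id FROM my_merge_source)"
        )
        # Step 2a: update rows whose name differs (stands in for WHEN MATCHED).
        conn.execute(
            "UPDATE my_merge_target "
            "SET name = (SELECT s.name FROM my_merge_source s "
            "            WHERE s.id = my_merge_target.id) "
            "WHERE EXISTS (SELECT 1 FROM my_merge_source s "
            "              WHERE s.id = my_merge_target.id "
            "                AND s.name <> my_merge_target.name)"
        )
        # Step 2b: insert source rows missing from the target (WHEN NOT MATCHED).
        conn.execute(
            "INSERT INTO my_merge_target (id, name) "
            "SELECT s.id, s.name FROM my_merge_source s "
            "WHERE NOT EXISTS (SELECT 1 FROM my_merge_target t WHERE t.id = s.id)"
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_merge_target (id INTEGER, name TEXT);
    CREATE TABLE my_merge_source (id INTEGER, name TEXT);
    INSERT INTO my_merge_target VALUES
        (1, 'Stay as is'), (2, 'This name has to change'), (3, 'This needs to go');
    INSERT INTO my_merge_source VALUES
        (1, 'Stay as is'), (2, 'This is the new name for id=2'), (4, 'A new row');
""")

sync_target(conn)
rows = conn.execute("SELECT id, name FROM my_merge_target ORDER BY id").fetchall()
print(rows)
```

With the sample data from the example above, the target ends up with the same three rows as the source. If any statement fails, the whole sync is rolled back, so readers never see a half-synchronised table.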

Related

Merge Statement Error debugging

I have a stored procedure that contains a MERGE statement. The procedure grabs data from table A and uses the MERGE statement to update certain records in table B. It works fine most of the time, but occasionally there are duplicate records in table A. The stored procedure is in a package with error notification set up; the package runs in a job, which then fails with the error below. Is there any way to debug this? For example, could the stored procedure insert the source data into a table when it hits the error? Any input is appreciated.
Thanks
Error :
failed with the following error: "The MERGE statement attempted to
UPDATE or DELETE the same row more than once. This happens when a
target row matches more than one source row. A MERGE statement cannot
UPDATE/DELETE the same row of the target table multiple times. Refine
the ON clause to ensure a target row matches at most one source row,
or use the GROUP BY clause to group the source rows.". Possible
failure reasons: Problems with the query, "ResultSet" property not set
correctly, parameters not set correctly, or connection not established
correctly.
You could add something like this to the beginning of your merge procedure:
if exists (
select 1
from a
group by a.OnColumn
having count(*)>1
)
begin;
insert into merge_err (OnColumn, OtherCol, rn, cnt)
select
a.OnColumn
, a.OtherCol
, rn = row_number() over (
partition by OnColumn
order by OtherCol
)
, cnt = count(*) over (
partition by OnColumn
)
from a
raiserror( 'Duplicates in source table a', 0, 1)
return -1;
end;
test setup: http://rextester.com/EFZ77700
create table a (OnColumn int, OtherCol varchar(16))
insert into a values
(1,'a')
, (1,'b')
, (2,'c')
create table b (OnColumn int primary key, OtherCol varchar(16))
insert into b values
(1,'a')
, (2,'c')
create table merge_err (
id int not null identity(1,1) primary key clustered
, OnColumn int
, OtherCol varchar(16)
, rn int
, cnt int
, ErrorDate datetime2(7) not null default sysutcdatetime()
);
go
dummy procedure:
create procedure dbo.Merge_A_into_B as
begin
set nocount, xact_abort on;
if exists (
select 1
from a
group by a.OnColumn
having count(*)>1
)
begin;
insert into merge_err (OnColumn, OtherCol, rn, cnt)
select
a.OnColumn
, a.OtherCol
, rn = row_number() over (
partition by OnColumn
order by OtherCol
)
, cnt = count(*) over (
partition by OnColumn
)
from a
raiserror( 'Duplicates in source table a', 0, 1)
return -1;
end;
/*
merge into b
using a
on b.OnColumn = a.OnColumn
...
--*/
end;
go
execute test proc and check error table:
exec dbo.Merge_A_into_B
select *
from merge_err
where cnt > 1
results:
+----+----------+----------+----+-----+---------------------+
| id | OnColumn | OtherCol | rn | cnt | ErrorDate |
+----+----------+----------+----+-----+---------------------+
| 1 | 1 | a | 1 | 2 | 05.02.2017 17:22:39 |
| 2 | 1 | b | 2 | 2 | 05.02.2017 17:22:39 |
+----+----------+----------+----+-----+---------------------+
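The error message itself points at a second option: group the source rows so that each key appears at most once before merging. The following is a small sketch of both queries, using Python's built-in sqlite3 module as a stand-in for SQL Server so that it is runnable as-is. Taking MIN(OtherCol) as the surviving value is an arbitrary tie-break assumed for this illustration; a real job would need its own rule for which duplicate wins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (OnColumn INTEGER, OtherCol TEXT);
    INSERT INTO a VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

# Keys that would make a MERGE touch the same target row more than once.
duplicate_keys = conn.execute(
    "SELECT OnColumn, COUNT(*) FROM a GROUP BY OnColumn HAVING COUNT(*) > 1"
).fetchall()

# One row per key; MIN(OtherCol) is the arbitrary tie-break chosen for this sketch.
deduplicated = conn.execute(
    "SELECT OnColumn, MIN(OtherCol) FROM a GROUP BY OnColumn ORDER BY OnColumn"
).fetchall()

print(duplicate_keys)  # [(1, 2)]
print(deduplicated)    # [(1, 'a'), (2, 'c')]
```

The second query could serve as the source of the merge in place of the raw table, while the first one feeds the error log.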

How to retrieve old values in OUTPUT clause with SQL MERGE statement

I'm using MERGE statement to update a product table containing (Name="a", Description="desca"). My source table contains (Name="a", Description="newdesca") and I merge on the Name field.
In my Output clause, I would like to get back the field BEFORE the update -> Description = "desca".
I couldn't find a way to do that, I'm always getting back the new value ("newdesca"). Why?
Can you not just use the deleted memory-resident table? For example:
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
CREATE TABLE #T (Name VARCHAR(5), Description VARCHAR(20));
INSERT #T (Name, Description)
VALUES ('a', 'desca'), ('b', 'delete');
MERGE #T AS t
USING (VALUES ('a', 'newdesca'), ('c', 'insert')) AS m (Name, Description)
ON t.Name = m.Name
WHEN MATCHED THEN
UPDATE SET Description = m.Description
WHEN NOT MATCHED BY TARGET THEN
INSERT (Name, Description)
VALUES (m.Name, m.Description)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT $Action, inserted.*, deleted.*;
IF OBJECT_ID(N'tempdb..#T', 'U') IS NOT NULL
DROP TABLE #T;
The output of this would be:
$Action | Name | Description | Name | Description
--------+-------+-------------+------+--------------
INSERT | c | insert | NULL | NULL
UPDATE | a | newdesca | a | desca
DELETE | NULL | NULL | b | delete

How to replace SELECT statement inside IF statement for it to work [duplicate]

This question already has answers here:
Oracle: how to INSERT if a row doesn't exist
(9 answers)
Closed 9 years ago.
I have a simple question - for examples sake let's have the table
CITY(ID,Name).
An idea would be that when I want to add new city I first make sure it's not already in the table CITY.
Code example would be:
IF cityName NOT IN (SELECT name FROM city) THEN
INSERT INTO City(ID, NAME) VALUES(100, cityName);
ELSE
Raise namingError;
END IF;
However I can't have that subquery inside if statement so what should I replace it with? Any kind of list or variable or trick that I could use?
IF NOT EXISTS(SELECT 1 FROM CITY WHERE NAME = <CITYNAME>)
INSERT INTO City(ID, NAME) VALUES(100, cityName);
OR
INSERT INTO City (ID, NAME)
SELECT 100, cityName
FROM dual
WHERE NOT EXISTS (SELECT 1
FROM CITY
WHERE name = cityName
)
I read the second query here
I don't have a database to try this out, but this should work
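Since the answer above was not tested, here is a minimal runnable sketch of the INSERT ... SELECT ... WHERE NOT EXISTS pattern. It uses Python's built-in sqlite3 module rather than Oracle, so the syntax differs slightly (SQLite needs no FROM dual). The add_city helper and the sample city name are assumptions made for this illustration.

```python
import sqlite3

def add_city(conn, city_id, city_name):
    """Insert the city only if its name is absent; return True when a row was added."""
    with conn:  # one transaction per call
        cur = conn.execute(
            "INSERT INTO city (id, name) "
            "SELECT ?, ? WHERE NOT EXISTS (SELECT 1 FROM city WHERE name = ?)",
            (city_id, city_name, city_name),
        )
    # rowcount is 0 when the NOT EXISTS filter suppressed the insert.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT)")

first = add_city(conn, 100, "Springfield")   # name not present yet
second = add_city(conn, 101, "Springfield")  # same name again, nothing inserted
city_count = conn.execute("SELECT COUNT(*) FROM city").fetchone()[0]
print(first, second, city_count)
```

The caller can raise its own naming error when the function returns False, which mirrors the ELSE branch in the question.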
You could use a MERGE command to perform the insert into the table. MERGE normally inserts a row when the data is not present and updates it when it is. In this case, since both fields are part of the match condition, there is nothing left to update, so it will just perform the insert for you.
This is useful if you want to take data from one or more tables and combine them into one.
MERGE INTO city c
USING (SELECT * FROM city_import ) h
ON (c.id = h.id and c.city = h.city)
WHEN NOT MATCHED THEN
INSERT (id, city)
VALUES (h.id, h.city);
http://www.oracle-base.com/articles/9i/merge-statement.php
If it was me I'd probably do something like
DECLARE
rowCity CITY%ROWTYPE;
BEGIN
SELECT * INTO rowCity FROM CITY c WHERE c.NAME = cityName;
-- If we get here it means the city already exists; thus, we raise an exception
RAISE namingError;
EXCEPTION
WHEN NO_DATA_FOUND THEN
-- cityName not found in CITY; therefore we insert the necessary row
INSERT INTO City(ID, NAME) VALUES(100, cityName);
END;
Share and enjoy.
Two options:
One using INSERT INTO ... SELECT with a LEFT OUTER JOIN; and
The other using MERGE
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE city (
ID NUMBER(2) PRIMARY KEY,
NAME VARCHAR2(20)
);
INSERT INTO city
SELECT 1, 'City Name' FROM DUAL;
CREATE TABLE city_errors (
ID NUMBER(2),
NAME VARCHAR2(20),
TS TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
ERROR VARCHAR2(20)
);
Query 1:
DECLARE
city_id CITY.ID%TYPE := 2;
city_name CITY.NAME%TYPE := 'City Name';
namingError EXCEPTION;
PRAGMA EXCEPTION_INIT( namingError, -20001 );
BEGIN
INSERT INTO city ( id, name )
SELECT city_id,
city_name
FROM DUAL d
LEFT OUTER JOIN
city c
ON ( c.name = city_name )
WHERE c.id IS NULL;
IF SQL%ROWCOUNT = 0 THEN
RAISE namingError;
END IF;
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
-- Do something when duplicate ID found.
INSERT INTO city_errors ( ID, NAME, ERROR ) VALUES ( city_id, city_name, 'Duplicate ID' );
WHEN namingError THEN
-- Do something when duplicate Name found.
INSERT INTO city_errors ( ID, NAME, ERROR ) VALUES ( city_id, city_name, 'Duplicate Name' );
END;
Results:
Query 2:
DECLARE
city_id CITY.ID%TYPE := 3;
city_name CITY.NAME%TYPE := 'City Name';
namingError EXCEPTION;
PRAGMA EXCEPTION_INIT( namingError, -20001 );
BEGIN
MERGE INTO city c
USING ( SELECT city_id AS id,
city_name AS name
FROM DUAL ) d
ON ( c.Name = d.Name )
WHEN NOT MATCHED THEN
INSERT VALUES ( d.id, d.name );
IF SQL%ROWCOUNT = 0 THEN
RAISE namingError;
END IF;
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
-- Do something when duplicate ID found.
INSERT INTO city_errors ( ID, NAME, ERROR ) VALUES ( city_id, city_name, 'Duplicate ID' );
WHEN namingError THEN
-- Do something when duplicate Name found.
INSERT INTO city_errors ( ID, NAME, ERROR ) VALUES ( city_id, city_name, 'Duplicate Name' );
END;
Results:
Query 3:
SELECT * FROM City
Results:
| ID | NAME |
|----|-----------|
| 1 | City Name |
Query 4:
SELECT * FROM City_Errors
Results:
| ID | NAME | TS | ERROR |
|----|-----------|--------------------------------|----------------|
| 2 | City Name | January, 02 2014 20:01:49+0000 | Duplicate Name |
| 3 | City Name | January, 02 2014 20:01:49+0000 | Duplicate Name |

SQL Server: results of the table valued function doesn't match column names

I have this function:
CREATE FUNCTION [dbo].[full_ads](@date SMALLDATETIME)
returns TABLE
AS
RETURN
SELECT *,
COALESCE((SELECT TOP 1 ptype
FROM special_ads
WHERE [adid] = a.id
AND @date BETWEEN starts AND ends), 1) AS ptype,
(SELECT TOP 1 name
FROM cities
WHERE id = a.cid) AS city,
(SELECT TOP 1 name
FROM provinces
WHERE id = (SELECT pid
FROM cities
WHERE id = a.cid)) AS province,
(SELECT TOP 1 name
FROM models
WHERE id = a.mid) AS model,
(SELECT TOP 1 name
FROM car_names
WHERE id = (SELECT car_id
FROM models
WHERE id = a.mid)) AS brand,
(SELECT TOP 1 pid
FROM cities
WHERE id = a.cid) pid,
(SELECT TOP 1 car_id
FROM models
WHERE id = a.mid) bid,
(SELECT TOP 1 name
FROM colors
WHERE id = a.color_id) AS color,
COALESCE((SELECT TOP 1 fileid
FROM carimgs
WHERE adid = a.id), 'nocarimage.png') AS [image]
FROM ads a
WHERE isdeleted <> 1
Sometimes it works correctly, but sometimes the column names don't match the values, like this (a sample result with fewer columns, just to show the problem):
ID Name City Color Image
----------------------------------------------
1 John New York Null Red
2 Ted Chicago Null Blue
As you see color and Image values are shifted one column and this continues to the last column.
Can anyone tell me where the problem is?
This arises from using *.
If the definition of ads changes (columns added or removed) this can mess up the metadata associated with the TVF.
You would need to run sp_refreshsqlmodule on it to refresh this metadata after such changes. It is best to avoid * in view definitions or inline TVFs for this reason.
An example of this
CREATE TABLE T
(
A CHAR(1) CONSTRAINT DF_A DEFAULT 'A',
B CHAR(1) CONSTRAINT DF_B DEFAULT 'B',
C CHAR(1) CONSTRAINT DF_C DEFAULT 'C',
D CHAR(1) CONSTRAINT DF_D DEFAULT 'D'
)
GO
INSERT INTO T DEFAULT VALUES
GO
CREATE FUNCTION F()
RETURNS TABLE
AS
RETURN
SELECT * FROM T
GO
SELECT * FROM F()
GO
ALTER TABLE T DROP CONSTRAINT DF_C, COLUMN C
ALTER TABLE T ADD E CHAR(1) DEFAULT 'E' WITH VALUES
GO
SELECT * FROM F()
Returns
+---+---+---+---+
| A | B | C | D |
+---+---+---+---+
| A | B | D | E |
+---+---+---+---+
Note that the D and E values are shown in the wrong columns. It still shows column C even though it has been dropped.

A trigger to update one column value to equal the pkid of the record

I need to write a trigger that will set the value in column 2 = to the value in column 1 after a record has been created.
This is what I have so far:
create trigger update_docindex2_to_docid
ON dbo.TABLENAME
after insert
AS BEGIN
set DOCINDEX2 = DOCID
END;
I answered my own question once I sat and thought about it long enough...
This seems way too simple. I'm concerned that I'm going to break something because I don't have a WHERE condition that would identify the correct record. I want this to update docindex2 to the newly created DOCID after a record is created in the database. The docid is the pkid.
Any ideas/suggestions are appreciated....
Are you looking for something like this?
CREATE TABLE Table1 (docid INT IDENTITY PRIMARY KEY, docindex2 INT);
CREATE TRIGGER tg_mytrigger
ON Table1 AFTER INSERT
AS
UPDATE t
SET t.docindex2 = t.docid
FROM Table1 t JOIN INSERTED i
ON t.docid = i.docid;
INSERT INTO Table1 (docindex2) VALUES(0), (0);
Contents of Table after insert
| DOCID | DOCINDEX2 |
---------------------
| 1 | 1 |
| 2 | 2 |
Here is SQLFiddle demo
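The concern in the question, that the update needs a condition tying it to the newly inserted record, can be tried out without SQL Server. The sketch below uses Python's built-in sqlite3 module. SQLite triggers fire once per row and expose the new row as NEW, which plays the role of the join to INSERTED in the answer above; treat it as an illustration of the idea rather than T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (docid INTEGER PRIMARY KEY, docindex2 INTEGER);

    -- NEW is the row that fired the trigger, so the WHERE clause
    -- restricts the update to that row only.
    CREATE TRIGGER tg_mytrigger AFTER INSERT ON table1
    BEGIN
        UPDATE table1 SET docindex2 = NEW.docid WHERE docid = NEW.docid;
    END;

    INSERT INTO table1 (docindex2) VALUES (0), (0);
""")

rows = conn.execute("SELECT docid, docindex2 FROM table1 ORDER BY docid").fetchall()
print(rows)  # [(1, 1), (2, 2)]
```

Without the WHERE clause, every insert would rewrite docindex2 for the whole table, which is the breakage the question worries about.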
