Snowflake time travel - snowflake-cloud-data-platform

I need to restore tables using Snowflake Time Travel, but without dropping the tables and creating new objects. I'm looking for the most efficient way to do that.
Example:
CREATE OR REPLACE TABLE TABLE_SCHEMA.TABLE_NAME AS SELECT * FROM TABLE_SCHEMA.TABLE_NAME at(timestamp => '2020-11-01 07:00:00'::timestamp);
In the Snowflake documentation I found that running CREATE OR REPLACE drops the original table and replaces it with a new one, so the new table has no Time Travel data from before the replace was issued. If I need to remove the data from a table and retain it for Time Travel purposes, I can use the TRUNCATE TABLE statement. My question is: should I use TRUNCATE or CREATE OR REPLACE? Thank you in advance.

The simplest approach is INSERT OVERWRITE INTO:
INSERT OVERWRITE INTO TABLE_SCHEMA.TABLE_NAME
SELECT *
FROM TABLE_SCHEMA.TABLE_NAME at(timestamp => '2020-11-01 07:00:00'::timestamp);
I would also like to know how the overwrite affects Time Travel historical data.
It is still possible to access the data from before the INSERT OVERWRITE as long as it is within the data retention period.
CREATE OR REPLACE TABLE TAB(id INT) AS SELECT 1 UNION SELECT 2;
-- original table
SELECT * FROM TAB;
-- 1
-- 2
UPDATE TAB SET id = id * 10;
SET queryupdate = LAST_QUERY_ID();
-- after update
SELECT * FROM TAB;
-- 10
-- 20
-- restoring state before update
INSERT OVERWRITE INTO TAB SELECT * FROM TAB BEFORE(STATEMENT => $queryupdate);
SET queryinsertoverwrite = LAST_QUERY_ID();
-- current state
SELECT * FROM TAB;
-- 1
-- 2
-- state before update
SELECT * FROM TAB BEFORE(STATEMENT => $queryupdate);
-- 1
-- 2
-- state before insert overwrite
SELECT * FROM TAB BEFORE(STATEMENT => $queryinsertoverwrite);
-- 10
-- 20

Related

Redo a create or replace in snowflake

I have the following problem:
I have used the function:
CREATE OR REPLACE TABLE myschema.public.table1 as (SELECT * FROM myschema.public.table1 BEFORE(OFFSET => -60*4*15) WHERE MARKET = 'ES'
)
I still had the filter MARKET = 'ES' in place, and now all the entries where MARKET is not 'ES' are gone. Can I still undo this?
Because you issued CREATE OR REPLACE, the original table was dropped. You need to rename your existing table and undrop the original table. Then you can re-run your Time Travel command:
alter table table1 rename to table1_bak;
undrop table table1;
CREATE OR REPLACE TABLE myschema.public.table1 as (SELECT * FROM myschema.public.table1 BEFORE(OFFSET => -60*4*15) );
https://docs.snowflake.com/en/sql-reference/sql/undrop-table.html

How to store existing column value in new table using trigger

I have two tables
Customer
CustomerUpdate
Structure of both tables are like this
Customer table's structure
CustomerName | CustomerId
CustomerUpdate table's structure
NewCustomerName | NewCustomerId | OldCustomerName
I have a few values inserted in the Customer table. Whenever I update the data in this table, I want both the existing and the new data to be written to the new table CustomerUpdate.
For this I created a trigger, but it only pulls the updated data, not the existing data.
CREATE TRIGGER trgAfterUpdate
ON [dbo].Customer
FOR UPDATE
AS
SET NOCOUNT ON
declare @NewCustomerName nchar(20);
declare @NewCustomerId nchar(20);
declare @OldCustomerName nchar(20);
declare @audit_action varchar(100);
select @NewCustomerName = i.CustomerName from inserted i;
select @NewCustomerId = i.CustomerId from inserted i;
select @OldCustomerName = c.CustomerName
from Customer c
where CustomerId = @NewCustomerId;
if update(CustomerName)
set @audit_action='Updated Record -- After Update Trigger.';
if update(CustomerId)
set @audit_action='Updated Record -- After Update Trigger.';
insert into CustomerUpdate(NewCustomerName, NewCustomerId, OldCustomername)
values(@NewCustomerName, @NewCustomerId, @OldCustomerName);
PRINT 'AFTER UPDATE Trigger fired.'
GO
Please help me out
First, selecting from the table being modified while an update trigger is executing returns the new value. These are AFTER triggers (rather than INSTEAD OF triggers), so the update has already happened by the time the trigger fires (although it can still be rolled back). If you need the old value, select from the DELETED pseudo-table.
Second, as pointed out by @marc_s in the comments, your trigger has the hidden assumption that only one row is affected by each update. This may well be valid for your environment, if your application only ever updates one row at a time, but in the general case every trigger should be ready to handle many rows affected by a single update. Writing your triggers to handle multiple rows is good practice.
Third, all of your sequentially executing code is pretty much unnecessary. The old value and the new value can be retrieved and inserted all at once:
CREATE TRIGGER trgAfterUpdate
ON [dbo].Customer
FOR UPDATE
AS
BEGIN
SET NOCOUNT ON
insert into CustomerUpdate(NewCustomerName, NewCustomerId, OldCustomername)
-- case 1: ID unchanged
SELECT I.CustomerName, I.CustomerID, D.CustomerName
FROM Inserted I
JOIN Deleted D on I.CustomerID=D.CustomerID
UNION ALL
-- case 2: ID changed, Name unchanged
SELECT I.CustomerName, I.CustomerID, D.CustomerName
FROM Inserted I
JOIN Deleted D on I.CustomerName=D.CustomerName
WHERE I.CustomerID<>D.CustomerID
UNION ALL
--case 3: ID changed, Name changed
SELECT I.CustomerName, I.CustomerID, D.CustomerName
FROM Inserted I
LEFT JOIN Deleted D on I.CustomerID=D.CustomerID OR I.CustomerName=D.CustomerName
WHERE D.CustomerID IS NULL;
END
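The old-value/new-value idea above can be tried out without a SQL Server instance. The following is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in, which is an assumption and not part of the original answer. SQLite triggers are row-level and expose OLD and NEW row references instead of SQL Server's set-based deleted and inserted pseudo-tables, so a multi-row update is handled by the trigger firing once per row. The helper names make_demo_db and audit_rows are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Customer (CustomerName TEXT, CustomerId INTEGER);
CREATE TABLE CustomerUpdate (
    NewCustomerName TEXT,
    NewCustomerId   INTEGER,
    OldCustomerName TEXT
);
-- Row-level trigger: OLD holds the pre-update row, NEW the post-update row.
CREATE TRIGGER trgAfterUpdate AFTER UPDATE ON Customer
FOR EACH ROW
BEGIN
    INSERT INTO CustomerUpdate (NewCustomerName, NewCustomerId, OldCustomerName)
    VALUES (NEW.CustomerName, NEW.CustomerId, OLD.CustomerName);
END;
"""


def make_demo_db():
    """Create an in-memory database with two customers (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO Customer VALUES (?, ?)", [("Alice", 1), ("Bob", 2)]
    )
    conn.commit()
    return conn


def audit_rows(conn):
    """Return the audit rows written by the trigger, ordered by customer id."""
    return conn.execute(
        "SELECT NewCustomerName, NewCustomerId, OldCustomerName "
        "FROM CustomerUpdate ORDER BY NewCustomerId"
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    # A single statement that touches several rows: the trigger fires per row.
    conn.execute("UPDATE Customer SET CustomerName = CustomerName || '_v2'")
    for row in audit_rows(conn):
        print(row)
```

In SQL Server itself the trigger fires once per statement, which is why the answer joins inserted to deleted rather than reading single values into variables.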

Returning a list of ids of deleted items

I am uncertain if this is not possible or if I am just unable to find the solution.
I am trying to write a SQL stored procedure that will delete a number of items and return the list of unique identifiers for the deleted items.
By using a temporary table I can select all the items I want to delete, add their ids to the temp table, delete all the items whose id is in the temp table, and then return all the ids in the temp table.
I would like to avoid doing that. Is there a better approach that deletes the items and returns all their ids without needing a temp table and without making multiple calls to the db?
Any ideas welcome, and if there is a similar post please direct me. (I was unable to find one)
Below is an example of what you want to achieve:
create table your_table
(
id int identity(1, 1) primary key,
value varchar(100)
);
insert into your_table (value) values
('hello'), ('from'), ('Mars'), ('!!!!');
go -- CREATE PROCEDURE must be the first statement in its batch
create proc dbo.deleteByChar
(
@char char(1)
) as
begin
delete from your_table
output deleted.id -- OUTPUT clause, as @Vishal_Gajjar suggested
where value like '%' + @char + '%';
end
Usage:
select * from your_table;
exec dbo.deleteByChar 'o';
select * from your_table;
Output:
id value
---------
1 hello
2 from
3 Mars
4 !!!!
id
--
1
2
id value
---------
3 Mars
4 !!!!
If you are using a trigger, the deleted virtual table contains the ids. Note that an UPDATE also populates the deleted table; to tell an UPDATE from a DELETE, join against the inserted table.

SQL Server select for update

I am struggling to find a SQL Server replacement for select for update that works.
I have a master table that contains a column which is used for the next order number. The application does a select for update on this row, reads the current value (while locked), adds one to this value, then updates the row and uses the number it received. This process works perfectly on every database I've tried except SQL Server, which does not seem to have any way of selecting data for exclusive use.
How do I do a locked read and update of something like a next order number from a sequence table in SQL Server?
BTW, I know I can use things like IDENTITY columns to do this, but in this case I must read from this existing column: get the value, increment it, and do so in a safe, locked manner so that two users cannot get the same value.
UPDATE::
Thank you, that works for this case :)
DECLARE @Output char(30)
UPDATE scheme.sysdirm
SET @Output = key_value = cast(key_value as int)+1
WHERE system_key='OPLASTORD'
SELECT @Output
I have one other place I do something similar. I read and lock a stock record too.
SELECT STOCK
FROM PRODUCT
WHERE ID = ? FOR UPDATE.
I then do some validation and then do
UPDATE PRODUCT SET STOCK = ?
WHERE ID=?
I can't just use your above method here, as the value I update is based on things I do from the stock I read. But I need to ensure no one else can mess with the stock while I do this. Again, easy on other DB's with SELECT FOR UPDATE... is there a SQL Server workaround?? :)
You can simply do an UPDATE that also reads out the new value into a SQL Server variable:
DECLARE @Output INT
UPDATE dbo.YourTable
SET @Output = YourColumn = YourColumn + 1
WHERE ID = ????
SELECT @Output
Since it's a single atomic UPDATE statement, it's safe against concurrency issues: only one connection can hold the update lock on the row at any given time. A second session that wants the incremented value at the same time has to wait until the first one completes, and then gets the next value from the table.
As an alternative you can use the OUTPUT clause of the UPDATE statement, although this will insert into a table variable.
Create table YourTable
(
ID int,
YourColumn int
)
GO
INSERT INTO YourTable VALUES (1, 1)
GO
DECLARE @Output TABLE
(
YourColumn int
)
UPDATE YourTable
SET YourColumn = YourColumn + 1
OUTPUT inserted.YourColumn INTO @Output
WHERE ID = 1
SELECT TOP 1 YourColumn
FROM @Output
**** EDIT
If you want to ensure that no one can change the data after you have read it, you can use the REPEATABLE READ isolation level. Be aware that the locks taken by every read in the transaction are then held until the transaction ends (pessimistic locking), which may cause deadlocks. You can also use the UPDLOCK table hint, SELECT ... FROM TABLE WITH (UPDLOCK), within a transaction.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
SELECT STOCK
FROM PRODUCT
WHERE ID = ?
.....
...
UPDATE Product
SET Stock = nnn
WHERE ID = ?
COMMIT TRANSACTION
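The read, validate, then update pattern above can be sketched in runnable form. The example below uses Python's standard-library sqlite3 module as a stand-in, which is an assumption: SQLite has no row locks or UPDLOCK hint, so BEGIN IMMEDIATE (which takes the database-wide write lock before the read) plays the role of the locked read. The helper names open_stock_db, reserve_stock and current_stock, and the validation rule "do not go below zero", are hypothetical.

```python
import sqlite3


def open_stock_db(initial_stock):
    """Create an in-memory PRODUCT table from a {id: stock} dict."""
    # isolation_level=None puts the connection in autocommit mode so that
    # transactions are controlled explicitly with BEGIN/COMMIT/ROLLBACK.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE PRODUCT (ID INTEGER PRIMARY KEY, STOCK INTEGER)")
    conn.executemany(
        "INSERT INTO PRODUCT VALUES (?, ?)", list(initial_stock.items())
    )
    return conn


def current_stock(conn, product_id):
    """Return the current stock for a product, or None if it does not exist."""
    row = conn.execute(
        "SELECT STOCK FROM PRODUCT WHERE ID = ?", (product_id,)
    ).fetchone()
    return None if row is None else row[0]


def reserve_stock(conn, product_id, quantity):
    """Read the stock under a write lock, validate, then update.

    Returns True if the stock was reduced, False if validation failed.
    """
    # Take the write lock *before* reading, so no other writer can change
    # the row between the SELECT and the UPDATE.
    conn.execute("BEGIN IMMEDIATE")
    try:
        stock = current_stock(conn, product_id)
        ok = stock is not None and stock >= quantity
        if ok:
            conn.execute(
                "UPDATE PRODUCT SET STOCK = ? WHERE ID = ?",
                (stock - quantity, product_id),
            )
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT" if ok else "ROLLBACK")
    return ok


if __name__ == "__main__":
    conn = open_stock_db({1: 10})
    print(reserve_stock(conn, 1, 4), current_stock(conn, 1))
    print(reserve_stock(conn, 1, 7), current_stock(conn, 1))
```

Taking the lock before the read, rather than upgrading a shared lock afterwards, is what avoids the situation where two sessions both read the same stock value and then block each other when they try to update it.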

SQL Server best way to calculate datediff between current row and next row?

I've got the following rough structure:
Object -> Object Revisions -> Data
The Data can be shared between several Objects.
What I'm trying to do is clean out old Object Revisions. I want to keep the first, active, and a spread of revisions so that the last change for a time period is kept. The Data might be changed a lot over the course of 2 days then left alone for months, so I want to keep the last revision before the changes started and the end change of the new set.
I'm currently using a cursor and temp table to hold the IDs and date between changes so I can select out the low hanging fruit to get rid of. This means using @LastID, @LastDate, updates and inserts to the temp table, etc...
Is there an easier/better way to calculate the date difference between the current row and the next row in my initial result set without using a cursor and temp table?
I'm on sql server 2000, but would be interested in any new features of 2005, 2008 that could help with this as well.
Here is example SQL. If you have an Identity column, you can use this instead of "ActivityDate".
SELECT DATEDIFF(HOUR, prev.ActivityDate, curr.ActivityDate)
FROM MyTable curr
JOIN MyTable prev
ON prev.ObjectID = curr.ObjectID
WHERE prev.ActivityDate =
(SELECT MAX(maxtbl.ActivityDate)
FROM MyTable maxtbl
WHERE maxtbl.ObjectID = curr.ObjectID
AND maxtbl.ActivityDate < curr.ActivityDate)
I could remove "prev", but have it there assuming you need IDs from it for deleting.
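To experiment with the self-join and correlated subquery above, here is a minimal sketch that runs the same query shape through Python's standard-library sqlite3 module. SQLite is only a stand-in (an assumption): it has no DATEDIFF, so the difference is computed from julianday() and rounded to whole elapsed hours, whereas SQL Server's DATEDIFF(HOUR, ...) counts hour boundaries crossed. The function name hours_since_previous and the sample dates are hypothetical.

```python
import sqlite3

QUERY = """
SELECT curr.ObjectID,
       curr.ActivityDate,
       CAST(ROUND((julianday(curr.ActivityDate)
                   - julianday(prev.ActivityDate)) * 24) AS INTEGER) AS Hours
FROM MyTable curr
JOIN MyTable prev
  ON prev.ObjectID = curr.ObjectID
WHERE prev.ActivityDate =
      (SELECT MAX(maxtbl.ActivityDate)
       FROM MyTable maxtbl
       WHERE maxtbl.ObjectID = curr.ObjectID
         AND maxtbl.ActivityDate < curr.ActivityDate)
ORDER BY curr.ObjectID, curr.ActivityDate
"""


def hours_since_previous(rows):
    """Return (ObjectID, ActivityDate, hours since previous row of that object).

    rows is an iterable of (object_id, 'YYYY-MM-DD HH:MM:SS') tuples.
    The earliest row of each object has no predecessor and is not returned.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE MyTable (ObjectID INTEGER, ActivityDate TEXT)")
        conn.executemany("INSERT INTO MyTable VALUES (?, ?)", list(rows))
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        (1, "2024-01-01 00:00:00"),
        (1, "2024-01-01 06:00:00"),
        (1, "2024-01-03 06:00:00"),
        (2, "2024-02-01 12:00:00"),
    ]
    for row in hours_since_previous(sample):
        print(row)
```

On SQL Server 2012 and later, the LAG window function gives the previous row's value directly and avoids the self-join; it is not available in the 2000, 2005 or 2008 versions mentioned in the question.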
If the identity column is sequential you can use this approach:
SELECT curr.*, DATEDIFF(MINUTE, prev.EventDateTime, curr.EventDateTime) Duration
FROM DWLog curr
JOIN DWLog prev ON prev.EventID = curr.EventID - 1
Hrmm, interesting challenge. I think you can do it without a self-join if you use the new-to-2005 pivot functionality.
Here's what I've got so far, I wanted to give this a little more time before accepting an answer.
DECLARE @IDs TABLE
(
ID int ,
DateBetween int
)
DECLARE @OID int
SET @OID = 6150
-- Grab the revisions, calc the datediff, and insert into temp table var.
INSERT @IDs
SELECT ID,
DATEDIFF(dd,
(SELECT MAX(ActiveDate)
FROM ObjectRevisionHistory
WHERE ObjectID=@OID AND
ActiveDate < ORH.ActiveDate), ActiveDate)
FROM ObjectRevisionHistory ORH
WHERE ObjectID=@OID
-- Hard set DateBetween for special case revisions to always keep
UPDATE @IDs SET DateBetween = 1000 WHERE ID=(SELECT MIN(ID) FROM @IDs)
UPDATE @IDs SET DateBetween = 1000 WHERE ID=(SELECT MAX(ID) FROM @IDs)
UPDATE @IDs SET DateBetween = 1000
WHERE ID=(SELECT ID
FROM ObjectRevisionHistory
WHERE ObjectID=@OID AND Active=1)
-- Select out IDs for however I need them
SELECT * FROM @IDs
SELECT * FROM @IDs WHERE DateBetween < 2
SELECT * FROM @IDs WHERE DateBetween > 2
I'm looking to extend this so that I can keep at maximum so many revisions, and prune off the older ones while still keeping the first, last, and active. Should be easy enough through select top and order by clauses, um... and tossing in ActiveDate into the temp table.
I got Peter's example to work, but took that and modified it into a subselect. I messed around with both and the sql trace shows the subselect doing less reads. But it does work and I'll vote him up when I get my rep high enough.

Resources