Does a CREATE OR REPLACE statement affect Time Travel in Snowflake? - snowflake-cloud-data-platform

If I create a table in snowflake and then create another one with the same name using CREATE OR REPLACE statement, I am not able to access the content of the first table using time travel.
For example, if I run this code
CREATE TABLE "MY_DB"."MY_SCHEMA"."MY_TABLE" (COL1 VARCHAR, COL2 NUMBER);
INSERT INTO "MY_DB"."MY_SCHEMA"."MY_TABLE" VALUES ('A',1);
... and then, five minutes later, run this chunk of code
CREATE OR REPLACE TABLE "MY_DB"."MY_SCHEMA"."MY_TABLE" (COL1 VARCHAR, COL2 NUMBER);
INSERT INTO "MY_DB"."MY_SCHEMA"."MY_TABLE" VALUES ('B',2);
SELECT * FROM "MY_DB"."MY_SCHEMA"."MY_TABLE"
UNION
SELECT * FROM "MY_DB"."MY_SCHEMA"."MY_TABLE" AT (offset => -60*1)
The query only returns the values from the second table. Is this behavior expected? I tried to google this or find clarification in snowflake documentation without any luck...
Thank you

Update: See @peterb's answer, "restore the old table by using time travel at the schema level".
As Mike says, the previous table was dropped by the CREATE OR REPLACE, so time travel won't be able to find it under the current table.
Nevertheless, you can still recover the data by renaming the newer table, and undropping the previous one:
CREATE TABLE "test_travel" (COL1 VARCHAR, COL2 NUMBER);
INSERT INTO "test_travel" VALUES ('A',1);
CREATE OR REPLACE TABLE "test_travel" (COL1 VARCHAR, COL2 NUMBER);
INSERT INTO "test_travel" VALUES ('B',2);
ALTER TABLE "test_travel" RENAME TO "test_travel2";
UNDROP TABLE "test_travel";
SELECT *
FROM "test_travel" AT (offset => -60*1)
https://docs.snowflake.com/en/user-guide/data-time-travel.html#restoring-objects

You can also restore the old table by using time travel at the schema level to the time before you ran Create or Replace Table:
create schema "MY_DB"."MY_SCHEMA_RESTORED" clone "MY_DB"."MY_SCHEMA" AT (offset => -60*1)

Yes - this is expected. By running a CREATE OR REPLACE, you've dropped the original table and replaced it with a new table. The new table has no time-travel data from before the CREATE OR REPLACE was issued.
If you wish to remove the data from a table and retain the data for time-travel purposes, leverage a TRUNCATE TABLE statement, instead.
https://docs.snowflake.com/en/sql-reference/sql/truncate-table.html
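To illustrate the TRUNCATE suggestion, here is a minimal sketch using the table from the question. It assumes the table's Time Travel retention period covers the chosen offset, and that the offset (in seconds; the five minutes used here is an arbitrary choice) points to a moment after the first INSERT and before the TRUNCATE.

```sql
-- Initial load.
CREATE TABLE "MY_DB"."MY_SCHEMA"."MY_TABLE" (COL1 VARCHAR, COL2 NUMBER);
INSERT INTO "MY_DB"."MY_SCHEMA"."MY_TABLE" VALUES ('A', 1);

-- Some minutes later: empty the table without dropping it, then reload.
TRUNCATE TABLE "MY_DB"."MY_SCHEMA"."MY_TABLE";
INSERT INTO "MY_DB"."MY_SCHEMA"."MY_TABLE" VALUES ('B', 2);

-- Current rows plus the rows as they were five minutes ago.
-- Assumption: five minutes ago falls between the first INSERT and the TRUNCATE.
SELECT * FROM "MY_DB"."MY_SCHEMA"."MY_TABLE"
UNION
SELECT * FROM "MY_DB"."MY_SCHEMA"."MY_TABLE" AT (OFFSET => -60*5);
```

Because the table object is never dropped, its history stays attached to the same name, so the AT clause can still reach the pre-truncate rows.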

Related

How can I split a row of data and insert each row (but different columns) into two tables (with a FK relationship) in SSIS?

I have two tables in SQL Server:
Person
ID (PK, int, IDENTITY)
Name (varchar(100))
UploadedBy (varchar(50))
DateAdded (datetime)
PersonFile
ID (PK, int, IDENTITY)
PersonId (FK, int)
PersonFile (varchar(max))
I am reading in a large file (150MB), and I have a script component that can successfully parse the file into several columns. The issue is that I need to insert the first 3 columns of my parsed data row into my Person table first, then use the ID of that Row to insert the final column into my PersonFile table. Is there an easy way to do this in SSIS?
I suppose I could technically script everything out to handle inserts in the database, but I feel like in that case, I might as well just skip SSIS altogether and use PowerShell. I also thought about writing a procedure in SQL Server and then passing the information to the procedure to handle inserts. But again, this seems very inefficient.
What's the best way for me to insert a row of data into two tables, if one of them has a foreign key constraint?
I think the best way is to use a stage table in the database to hold the parsed source file and then use stored procedures or SQL queries to load your tables. There is a Lookup component in SSIS that can be used for your case, but I try to avoid it for various reasons.
Create a table resembling the source file, something like:
CREATE TABLE dbo.[SourceFileName](
Name nvarchar(100) NULL,
UploadedBy nvarchar(50) NULL,
DateAdded datetime NULL,
PersonFile nvarchar(max) NULL
)
Truncate the stage table. Use a data flow component to get the source data. Use a script or stored procedures to insert the source data into your destination tables (begin with Person and then load PersonFile). Your SSIS data flow should look something like this:
For the insert script for person do something like:
INSERT INTO dbo.Person (Name, UploadedBy,DateAdded)
SELECT Name,UploadedBy,DateAdded
FROM dbo.SourceFileName;
For the insert for PersonFile make a join to the destination table:
INSERT INTO dbo.PersonFile(PersonId,PersonFile)
SELECT
Person.ID,
SourceFile.PersonFile
FROM dbo.SourceFileName SourceFile
JOIN dbo.Person Person
ON Person.Name = SourceFile.Name
You should also add a UNIQUE CONSTRAINT to the column that identifies the person (Name for example).
One very common thing to do would be to stage the data first.
So you insert all columns into a table on the server, which also has an extra nullable column for the PersonID.
Then you’d have a stored procedure which inserts unique Person records into the Person table, and updates the staging table with the resulting PersonID, which is the extra field you need for the PersonFile insert, which could then be performed either in the same procedure or another one. (You’d call these procedures in SSIS with an Execute SQL Task.)
I suppose this could possibly be done purely in SSIS, for example with a Script Destination that performs an insert and retrieves the PersonID for a second insert, but I’m fairly sure performance would take a huge hit with an approach like that.
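The staged approach described above could be sketched roughly as follows. The names dbo.PersonStage and dbo.LoadPersonFromStage are hypothetical, and the sketch assumes that Name uniquely identifies a person, which the question does not confirm.

```sql
-- Hypothetical staging table: the parsed columns plus a nullable PersonID.
CREATE TABLE dbo.PersonStage (
    Name       varchar(100) NULL,
    UploadedBy varchar(50)  NULL,
    DateAdded  datetime     NULL,
    PersonFile varchar(max) NULL,
    PersonID   int          NULL
);
GO

-- Hypothetical procedure; assumes Name uniquely identifies a person.
CREATE PROCEDURE dbo.LoadPersonFromStage
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- roll back the whole load if any statement fails

    BEGIN TRANSACTION;

    -- 1. Insert one Person row per staged name that is not already present.
    INSERT INTO dbo.Person (Name, UploadedBy, DateAdded)
    SELECT s.Name, MIN(s.UploadedBy), MIN(s.DateAdded)
    FROM dbo.PersonStage AS s
    WHERE s.Name IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM dbo.Person AS p WHERE p.Name = s.Name)
    GROUP BY s.Name;

    -- 2. Write the generated IDs back to the staging rows.
    UPDATE s
    SET s.PersonID = p.ID
    FROM dbo.PersonStage AS s
    JOIN dbo.Person AS p ON p.Name = s.Name;

    -- 3. Load the child table using the captured IDs.
    INSERT INTO dbo.PersonFile (PersonId, PersonFile)
    SELECT s.PersonID, s.PersonFile
    FROM dbo.PersonStage AS s
    WHERE s.PersonID IS NOT NULL;

    COMMIT TRANSACTION;
END;
GO
```

In the package, a data flow would fill dbo.PersonStage and an Execute SQL Task would then run EXEC dbo.LoadPersonFromStage;.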

In SQL Server / Liquibase, does an INSERT query automatically create the table, without a CREATE statement, if it does not exist?

I am using Liquibase for managing SQL Server scripts (create, update, delete, alters etc.).
My requirement was to create a backup table (say old_table_a) before I could drop two columns (column_1, column_2) from the original table (table_a).
The new backup table does not need a primary key, so it will just have two columns as shown below
old_table_a
column_1 (from original table_a)
column_2 (from original table_a)
If I just write an INSERT query as shown below, without a CREATE TABLE old_table_a statement, will the table be created automatically?
INSERT INTO old_table_a (column_1, column_2)
SELECT column_1, column_2
FROM table_a
I had read this somewhere on some blog, but cannot find this.
Please provide some information if this is possible.
Otherwise I know that the usual way to do this is to create the new backup table and then populate the new table with values from the original.
This can be done with SELECT * INTO:
SELECT * INTO [NEWTABLE] FROM [OLDTABLE]
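Applied to the tables in the question, a minimal sketch might look like this. A plain INSERT INTO does not create its target in SQL Server and fails if old_table_a does not exist, whereas SELECT ... INTO creates the table and fails if it already exists. The new table does not inherit the source table's primary key or indexes, which matches the stated requirement.

```sql
-- Creates old_table_a (it must not already exist) and copies the two columns.
SELECT column_1, column_2
INTO old_table_a
FROM table_a;

-- Once the backup has been verified, drop the columns from the original table.
ALTER TABLE table_a DROP COLUMN column_1, column_2;
```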
INSERT tableName1 (ColumName)
(select (ColumName ) from TableName2)

SQL Server - Create temp table if doesn't exist

In my SQL Server 2012 environment, I've created a series of stored procedures that pass pre-existing temporary tables among themselves (I have tried different architectures here, but wasn't able to bypass this due to the nature of the requirements / procedures).
What I'm trying to do is to, within a stored procedure check if a temporary table has already been created and, if not, to create it.
My current SQL looks as follows:
IF OBJECT_ID('tempdb..#MyTable') IS NULL
CREATE TABLE #MyTable
(
Col1 INT,
Col2 VARCHAR(10)
...
);
But when I try and run it when the table already exists, I get the error message
There is already an object named '#MyTable' in the database
So it seems it doesn't simply ignore those lines within the If statement.
Is there a way to accomplish this - create a temp table if it doesn't already exist, otherwise, use the one already in memory?
Thanks!
UPDATE:
For whatever reason, following @RaduGheorghiu's suggestion from the comments, I found out that the system creates a temporary table with a name along the lines of dbo.#MyTable________________________________________________0000000001B1
Is that why I can't find it? Is there any way to change that? This is new to me....
Following the link here, http://weblogs.sqlteam.com/mladenp/archive/2008/08/21/SQL-Server-2005-temporary-tables-bug-feature-or-expected-behavior.aspx
It seems as though you need to use the GO statement.
You meant to use IS NOT NULL, I think... this is commonly used to clear temp tables so you don't get the error you mentioned in your OP.
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL DROP TABLE #MyTable
CREATE TABLE #MyTable
(
Col1 INT,
Col2 VARCHAR(10)
);
The big difference is the DROP TABLE statement after you do your logical check. Also, creating your table without filling data doesn't make it NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable
(
Col1 INT,
Col2 VARCHAR(10)
);
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
SELECT 1
Try wrapping your actions in a begin...end block:
if object_id('tempdb..#MyTable') is null
begin
create table #MyTable (
Col1 int
, Col2 varchar(10)
);
end
This seems odd, but it works when I try it
IF(OBJECT_ID('tempdb..#Test') IS NULL) --check if it exists
BEGIN
IF(1 = 0)--this will never actually run, but it tricks the parser into allowing the CREATE to run
DROP TABLE #Test;
PRINT 'Create table';
CREATE TABLE #Test
(
ID INT NOT NULL PRIMARY KEY
);
END
IF(NOT EXISTS(SELECT 1 FROM #Test))
INSERT INTO #Test(ID)
VALUES(1);
SELECT *
FROM #Test;
--Try dropping the table and test again
--DROP TABLE #Test;

SQL Server timestamp column: is explicit insertion or update possible?

Is there any way to provide an explicit value for a timestamp column in a table in SQL Server? I am aware it is not a datetime column, but I want to know whether there is any way to insert or update it explicitly.
You cannot explicitly insert into or update a timestamp column. Its values are generated automatically when you insert or update a row in the table.
Because timestamp values are generated by the database when a row is inserted or updated, defining them explicitly would in effect mean changing the value the database created.
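A small sketch of that behaviour, using a hypothetical table dbo.RowVersionDemo (rowversion is the synonym for the timestamp data type):

```sql
-- Hypothetical demo table; rv is filled in by SQL Server, never by the caller.
CREATE TABLE dbo.RowVersionDemo (col1 int, rv rowversion);
GO

INSERT INTO dbo.RowVersionDemo (col1) VALUES (1);
SELECT col1, rv FROM dbo.RowVersionDemo;   -- rv was generated on insert

UPDATE dbo.RowVersionDemo SET col1 = 2;
SELECT col1, rv FROM dbo.RowVersionDemo;   -- rv changed because the row changed

-- Supplying a value explicitly is rejected by SQL Server:
-- INSERT INTO dbo.RowVersionDemo (col1, rv) VALUES (3, 0x01);
```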
From your second comment I appreciate that you might have data coming in which is already timestamped and you just want those represented on your table in the same way as inserting data with "set identity_insert on" .
The answer would be to select the existing table into another table then add the incoming data. If you run the code below I think you'll see what I mean.
create table abc
(
col1 int, timestamp
)
go
insert into abc(col1) values (1)
go
select col1,convert(varbinary,timestamp) timestamp# into def from abc
go
select * from abc
select * from def
As far as I know the timestamp represents a row version number (which is why they change when you update a value in the row because you are creating another version of the row). There might be a date in the transaction log which states when this version of the row came into being. I don't consider it possible to directly convert timestamp to datetime.
Well..the only other idea I have is to add another column and then select the timestamp values into that! The weirdest thing, in doing this it takes the last character back one! See what you think.
drop table abc
go
create table abc
(
col1 int, timestamp
)
go
insert into abc(col1) values (1)
go
alter table abc add timestamp# varbinary(18)
go
update abc set timestamp# = convert(varbinary,timestamp)
Generally speaking, when creating a table I would include a datetime column that defaults to the current date and time; this way you have a datetime for when each row was created.
Like this:
drop table def
go
create table def
(
col1 int,
idt datetime default getdate()
)
If you insert a value into col1 and do not include idt in the column list of the insert statement, the idt column will default to the date and time at which the row was inserted.
Like this:
insert into def (col1) values (1)

How to get both the new pk and old pk when inserting a subset of a table into itself?

I'm inserting a subset of a table into the same table, and in order to create records in some mapping tables I need to capture both the newly created identity PK and the matching old PK.
If SQL would support it, something like:
Create table Test (pk int identity, description varchar(10))
Declare @PKVALUES TABLE (NewPK int, OldPk int)
INSERT INTO Test (description)
OUTPUT INSERTED.PK, Test.PK into @PKVALUES
Select description
From Test
Where ...
But, of course, SQL doesn't support output of values from the FROM table during an INSERT operation.
The only set based alternative I've come across requires locking the whole table while creating the new PKs in a temporary table and then inserting them into the Test table using identity insert.
Is there some way I can accomplish this, (without having to resort to a one record at a time
approach or having to lock the whole table) ?
Thanks,
Ilmar
My preference would be to add a column to store the old pk in and then you can return it from the output clause. However, it is not always possible to change the table.
So, I have a sneaky trick, but it involves doing twice as much work on your db. What you do is put the old PK in the description field in the initial insert. Then you update the description of each new row to the old row's description by joining the description field to the PK.
Create table Test (pk int identity, description varchar(10))
Declare @PKVALUES TABLE (NewPK int, OldPk varchar(10))
INSERT INTO Test (description)
OUTPUT INSERTED.PK, INSERTED.Description into @PKVALUES
SELECT CAST(PK AS varchar(10)) from Test where....
UPDATE tnew
SET description = told.description
FROM test told
JOIN test tnew ON CAST(told.PK AS varchar(10)) = tnew.description
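A different technique from the one in this answer, offered only as a sketch: unlike INSERT, the OUTPUT clause of a MERGE statement may reference columns of the source, so a MERGE whose ON condition never matches can insert the rows and capture both keys in one statement. The sample rows and the WHERE filter below are made up for illustration.

```sql
CREATE TABLE Test (pk int IDENTITY(1,1) PRIMARY KEY, description varchar(10));
INSERT INTO Test (description) VALUES ('alpha'), ('beta');  -- illustrative rows

DECLARE @PKVALUES TABLE (NewPK int, OldPK int);

MERGE INTO Test AS tgt
USING (SELECT pk, description
       FROM Test
       WHERE description = 'alpha') AS src   -- illustrative filter
    ON 1 = 0                                 -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (description) VALUES (src.description)
OUTPUT INSERTED.pk, src.pk INTO @PKVALUES (NewPK, OldPK);

-- One row per copied record: the new identity value and the key it was copied from.
SELECT NewPK, OldPK FROM @PKVALUES;
```

This keeps the operation set-based and avoids both the extra update and any schema change, at the cost of using MERGE purely as an insert.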

Resources