I created a table with an explicit DATA_RETENTION_TIME_IN_DAYS of zero, i.e. no time travel. UNDROP on a dropped table should not be able to restore it. However, it does!
create or replace table t1 (col1 string)
DATA_RETENTION_TIME_IN_DAYS=0; -- yes, no time travel please
insert into t1 values ('abc');
drop table t1;
undrop table t1; -- hey, this works!
select * from t1; -- this returns my table with the row inserted!
DROP TABLE will remove any explicit DATA_RETENTION_TIME_IN_DAYS you have set. The dropped table then inherits whatever value that parameter has in the parent schema, database, or account. This is usually 1 by default; in my case, it was 5 at the database level. Run show tables history like 't1' to check.
This means you automatically get time travel: all table data is preserved at the moment the table is dropped, although no earlier history data is kept.
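A quick way to verify (the retention_time column shows the inherited value, no longer 0):
show tables history like 't1'; -- the HISTORY keyword includes dropped tables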
If you intend to drop tables and rely on UNDROP, set the retention time parameter at the schema or database level instead:
create or replace database db1
DATA_RETENTION_TIME_IN_DAYS=0; -- no time travel for any table here!
create or replace table t1 (col1 string); -- inherited, as zero
insert into t1 values ('abc');
drop table t1;
undrop table t1; -- now this fails, as expected
UNDROP now fails, as expected, with "Table T1 does not exist or was purged", because t1 inherits retention time 0 from its parent database.
Later edit: if your database has no time travel but your table does, expect to lose not just all historical data when you drop the table, but all table data as well, and this data cannot be restored.
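A minimal sketch of that scenario, following the behavior described above (object names are illustrative):
create or replace database db1 DATA_RETENTION_TIME_IN_DAYS=0; -- no time travel at the database level
create or replace table t1 (col1 string) DATA_RETENTION_TIME_IN_DAYS=1; -- but the table has it
insert into t1 values ('abc');
drop table t1; -- the explicit table parameter is removed; t1 now inherits 0
undrop table t1; -- fails, and the table data is gone for good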
Scenario: adding a column to a table, using UPDATE to populate it, and then dropping the other column does not free space.
Note: my warehouse is XL and auto-suspends after 5 minutes.
Tables:
"database"."schema"."table1"
-- ID varchar(32), eg: "ajs6djnd79dhashlj172883gdb4av3"
-- ........
"database"."schema"."id_dim"
-- ID varchar(32) eg: "ajs6djnd79dhashlj172883gdb4av3"
-- ID_NUM NUMBER(12, 0) AUTOINCREMENT START 1 INCREMENT 1 eg: 1
ALTER TABLE "database"."schema"."table1" ADD ID_NUM NUMBER(12, 0);
UPDATE "database"."schema"."table1" e1
SET e1.ID_NUM = d2.ID_NUM
FROM "database"."schema"."id_dim" d2
WHERE e1.id = d2.id;
ALTER TABLE "database"."schema"."table1" DROP ID;
ALTER TABLE "database"."schema"."table1" RENAME COLUMN ID_NUM TO ID;
Q: After the UPDATE operation and the column drop, I still see that storage consumption is higher than the previous table size, yet the Snowflake docs say that new micro-partitions are written after DML operations.
Exactly, you are right: new micro-partitions are written after your DML operation.
But this does not mean the old micro-partitions are dropped. This is where Time Travel comes into play: the older versions are still stored.
https://docs.snowflake.com/en/user-guide/data-time-travel.html
How long is the old data stored? That depends on your table type as well as the value of the DATA_RETENTION_TIME_IN_DAYS parameter for the object: https://docs.snowflake.com/en/sql-reference/parameters.html#data-retention-time-in-days
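A hedged example of checking and, if you do not need Time Travel for this table, lowering the parameter (using the table from the scenario above):
show parameters like 'DATA_RETENTION_TIME_IN_DAYS' in table "database"."schema"."table1";
alter table "database"."schema"."table1" set DATA_RETENTION_TIME_IN_DAYS = 0; -- old versions stop being retained for Time Travel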
In PostgreSQL 10, I want to have the same set of columns for audit purposes in all transactional tables of a particular database, with the same foreign key constraints.
I am thinking of creating a master table with the set of 4 columns:
createdBy createdOn updatedBy updatedOn
Then inherit all transactional tables from this master table.
Is this the right approach, and is inheritance suited for this? When it comes to storage, how does it work behind the scenes when I insert records into the derived/child tables? What happens when data is deleted from child tables? Can I lock my master table so that no one accidentally deletes any records from it?
I see no problem with that approach but it works differently from your description.
I will use the following tables for illustration:
CREATE TABLE MasterAudit (
createdBy TEXT DEFAULT current_user,
createdOn TIMESTAMP WITH TIME ZONE DEFAULT current_timestamp,
updatedBy TEXT DEFAULT current_user,
updatedOn TIMESTAMP WITH TIME ZONE DEFAULT current_timestamp
);
CREATE TABLE SlaveAudit (
Val Text
) INHERITS(MasterAudit);
This definition lets you skip the audit columns when inserting, or use the DEFAULT keyword in inserts and updates.
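For instance, a small sketch of refreshing the audit columns through their defaults (the WHERE clause is illustrative):
UPDATE SlaveAudit
SET updatedBy = DEFAULT, -- current_user
    updatedOn = DEFAULT -- current_timestamp
WHERE Val = 'Some value';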
What does SELECT do (visible when using EXPLAIN)?
Behind the scenes, data inserted into SlaveAudit is stored in SlaveAudit; selecting from MasterAudit works as a UNION of the tables, including MasterAudit itself (it is valid to insert data into the parent table, although it would not make much sense in this very case).
SELECT * FROM SlaveAudit reads data from SlaveAudit. The additional column Val from SlaveAudit is returned.
SELECT * FROM MasterAudit reads data from MasterAudit UNION SlaveAudit. The additional column Val is not returned.
SELECT * FROM ONLY MasterAudit reads data from MasterAudit only.
Illustration aside, the correct way to select from MasterAudit is by using the pseudo-column tableoid in order to determine where each record comes from.
Be careful though: this query can take very long if all your tables inherit from MasterAudit.
SELECT relname, MasterAudit.*
FROM MasterAudit
JOIN pg_class ON MasterAudit.tableoid = pg_class.oid
Let's insert stuff.
INSERT INTO SlaveAudit(Val) VALUES ('Some value');
What query will result in deleting it?
DELETE FROM SlaveAudit will remove that record (obviously).
DELETE FROM MasterAudit will remove the record too. Oops! That is not what we want.
TRUNCATE TABLE SlaveAudit and TRUNCATE TABLE MasterAudit will have the same result as the two DELETE statements.
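If you only want to touch the parent's own rows, PostgreSQL's ONLY keyword restricts the statement to the named table; a small sketch:
DELETE FROM ONLY MasterAudit; -- leaves SlaveAudit rows alone
TRUNCATE ONLY MasterAudit; -- same idea for TRUNCATE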
Time to manage access.
IMHO, no commands apart from SELECT should ever be granted on MasterAudit.
Creating a table that inherits MasterAudit can only be done by its owner. You may want to change the tables' owner.
ALTER TABLE MasterAudit OWNER TO ...
Almost all the privileges must be revoked. This includes the table owner (but note that a superuser will not be affected). SELECT on MasterAudit may be granted to everyone if you want.
REVOKE ALL ON MasterAudit FROM public, ...
GRANT SELECT ON MasterAudit TO public
Check the access by ensuring the following queries fail:
INSERT INTO MasterAudit VALUES(default, default, default, default)
DELETE FROM MasterAudit
Is it possible to access the new values for inserted records in a transaction from a trigger on a different table in the same transaction?
When you cause a trigger to fire within a transaction, the trigger runs in a nested transaction, so it can see all the rows previously written in the transaction.
For example:
create table t1(id int)
create table t2(id int)
go
create trigger tt2 on t2 after insert
as
begin
select * from t1;
end
go
begin transaction
insert into t1(id) values (1)
insert into t2(id) values (1)
rollback
outputs
(1 row affected)
id
-----------
1
(1 row affected)
(1 row affected)
So you can see all the records in t1 from a trigger on t2, including any rows affected by the current transaction. But there is no inherent way to tell which rows in t1 were affected by the current transaction.
And you could easily cause deadlocks doing this.
A few possible solutions would be:
1) Create a column (TransactionID UniqueIdentifier) in both tables, generate a new GUID just after the transaction has started, and then insert that ID into both tables. Then, in the trigger, you get TransactionID from Inserted and read the second table WHERE TransactionID = ... Consider indexes if those tables are large. (A sketch follows below.)
2) Use the OUTPUT clause of the first insert to fill some supplementary table. Use that table inside the trigger to learn the new IDs from that transaction. Truncate that supplementary table just before committing the transaction. Don't use that table for any other processes / code.
PS. For (1), you can use INT or BIGINT for TransactionID instead, but then you need some mechanism to generate a new unique ID.
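A minimal sketch of option (1), reusing the t1/t2 tables from above (column and variable names are illustrative; create or alter requires SQL Server 2016 SP1 or later):
alter table t1 add TransactionID uniqueidentifier null
alter table t2 add TransactionID uniqueidentifier null
go
create or alter trigger tt2 on t2 after insert
as
begin
    -- read only the t1 rows written with the same TransactionID
    select t1.*
    from t1
    join inserted i on t1.TransactionID = i.TransactionID;
end
go
begin transaction
declare @txid uniqueidentifier = newid()
insert into t1(id, TransactionID) values (1, @txid)
insert into t2(id, TransactionID) values (1, @txid) -- the trigger now sees the matching t1 row
commit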
If I execute a procedure that drops a table and then recreates it using SELECT INTO:
if that procedure raises an exception after dropping the table, does the drop take effect or not?
Unless you wrap the statements in a transaction, the table will be dropped, since each statement runs as its own implicit transaction.
Below are some tests.
create table t1
(
id int not null primary key
)
create table t11 (id int) -- some pre-existing table that we will drop
go
drop table t11
insert into t1
select 1 union all select 1 -- violates the primary key
Table t11 will be dropped, even though the insert raises an exception.
One more example:
drop table orderstest
print 'dropped table'
waitfor delay '00:00:05'
select * into orderstest
from Orders
Now kill the session during the 5-second delay (say, after 2 seconds): you can still see that orderstest has been dropped.
I checked with some statements other than SELECT INTO; I don't see a reason why SELECT INTO would behave differently, and this applies even if you wrap the statements in a stored proc.
If you want to roll back everything, use a transaction, or better yet, SET XACT_ABORT ON.
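A sketch of the transactional variant, reusing the t1/t11 tables from the first test: with XACT_ABORT ON, any run-time error aborts and rolls back the whole transaction, so the drop is undone as well.
set xact_abort on
begin transaction
drop table t11
insert into t1
select 1 union all select 1 -- the PK violation rolls back everything, including the drop
commit
-- t11 still exists afterwards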
Yes, the dropped table will be gone. I have had this issue when scripting a new primary key. Depending on the table, the generated script saves all the data to a table variable in memory, drops the table, creates a new one with the new PK, then loads the data back. If the data violates the new PK, the statement fails, the table variable is discarded, and I am left with a new table and no data.
My practice is to create the new table with a slightly different name, load the data, change both table names in a statement, then once all the data is confirmed loaded, drop the original table.
I am not sure if this question is an obvious one. I need to delete a load of data. DELETE is expensive. I need to truncate the table, but not fully, so that the space is released and the high-water mark is reset.
Is there any feature which would allow me to truncate a table based on a condition for select rows?
Depends on how your table is organised.
1) If your (large) table is partitioned on a similar condition (e.g. you want to delete the previous month's data and your table is partitioned by month), you could truncate only that partition instead of the entire table (see the sketch after the code below).
2) The other option, provided you have some downtime, would be to insert the data that you want to keep into a temporary table, truncate the original table, and then load the data back:
insert into <table1>
select * from <my_table>
where <condition>;
commit;

truncate table <my_table>;

insert into <my_table>
select * from <table1>;
commit;
--since the amount of data might change considerably,
--you might want to collect statistics again
exec dbms_stats.gather_table_stats
(ownname=>'SCHEMA_NAME',
tabname => 'MY_TABLE');
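For option 1), a hedged sketch, assuming the table is range-partitioned by month and the partition holding the previous month's data is named p_prev_month (an illustrative name):
alter table my_table truncate partition p_prev_month;
-- frees that partition's space and resets its high-water mark;
-- add UPDATE INDEXES if the table has global indexes you need to keep usable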