Adding last change timestamp to a table in Snowflake

I have a lot of tables in Snowflake that I update (basically re-create) every day with a Python script.
I can see the timestamp of the last time those tables were changed in the information schema of my database, but how can I add that information as a column to one of our tables?
Assume that I have a table customer and I want to be able to see when each row of that table was last changed. I can see this timestamp here:
SELECT CONVERT_TIMEZONE('Etc/GMT+9','UTC',last_altered) AS last_changed
FROM "XXXX"."INFORMATION_SCHEMA"."TABLES"
WHERE table_name='CUSTOMERS';
How can I add this information to the customer table?

If you would like to see that information, your Python program should add it as additional columns in each row. We used to call these 'WHO columns'. Below are the WHO columns that we added to each table in the final schema:
Last Updated TimeStamp
Last Updated User
Creation Timestamp

The best option would be to add an audit column to the customer table with a default value of current_timestamp.
Example:
CREATE TABLE CUSTOMER (column1 varchar, insert_date timestamp default current_timestamp())
In this example you can use insert_date to track when each record was inserted. The column is populated automatically whenever you insert a row without specifying it, like this:
INSERT INTO CUSTOMER(column1) VALUES ('test')
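Since the tables are loaded from a Python script, the default-column idea can be tried out locally before touching Snowflake. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Snowflake (an assumption made only so the example runs without a warehouse); the table and column names follow the example above.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " column1 TEXT,"
    " insert_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
)

# insert_date is left out of the column list, so the default is applied.
conn.execute("INSERT INTO customer (column1) VALUES ('test')")

row = conn.execute("SELECT column1, insert_date FROM customer").fetchone()
print(row)  # ('test', '<UTC timestamp at insert time>')
```

Note that a default only fires on insert. Because the tables are re-created every day, insert_date would record the load time of each row rather than the time its values last changed.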

Related

SQL Server Insert If Not Exists - No Primary Key

I have Table A and Table B.
Table A contains data from another source.
Table B contains data that is inserted from Table A along with data from other tables. I have done the initial insert of data from A to B but now what I am trying to do is insert the records that do not exist already in Table B from Table A on a daily basis. Unfortunately, there is no primary key or unique identifier in Table A which is making this difficult.
Table A contains a field called file_name which has values that looks like this:
this_is_a_file_name_01011980.txt
There can be duplicate values in this column (multiple files from the same date).
In Table B I created a column data_date which extracts the date from the table a.file_name field. There is also a load_date field which just uses GETDATE() at the time the data is inserted.
I am thinking I can somehow compare the dates in these tables to decide what needs to be inserted. For example:
If the file date from Table A (would need to extract again) is greater than the load_date of Table B, then insert these records into Table B.
Let me know if any clarification is needed.
You could use EXISTS or EXCEPT. From the explanation here, it seems like EXCEPT would make short work of this. Something like this:
insert tableB
select * from tableA
except
select * from tableB
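One thing to watch: EXCEPT needs the same column list on both sides, and Table B has extra columns (data_date, load_date), so select * may not line up. A minimal sketch of the same idea restricted to the shared columns is shown here, using Python's sqlite3 as a stand-in for SQL Server; the column names amount and load_date and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.executescript("""
CREATE TABLE table_a (file_name TEXT, amount INTEGER);
CREATE TABLE table_b (file_name TEXT, amount INTEGER, load_date TEXT);
INSERT INTO table_a VALUES ('f_01011980.txt', 10), ('f_01021980.txt', 20);
INSERT INTO table_b VALUES ('f_01011980.txt', 10, '1980-01-01');
""")

# Compare only the columns both tables share, then stamp load_date on insert.
conn.execute("""
INSERT INTO table_b (file_name, amount, load_date)
SELECT file_name, amount, CURRENT_TIMESTAMP
FROM (
    SELECT file_name, amount FROM table_a
    EXCEPT
    SELECT file_name, amount FROM table_b
)
""")

rows = conn.execute(
    "SELECT file_name, amount FROM table_b ORDER BY file_name"
).fetchall()
print(rows)  # [('f_01011980.txt', 10), ('f_01021980.txt', 20)]
```

EXCEPT returns distinct rows, so rows in Table A that are identical in every compared column are inserted only once.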

SSIS Lookup Suggestion

I have a task to generate a derived column (RestrictionID) from a table and add it to a source table if 6 columns on both tables match. The 6 columns include Country, Department, Account, etc. I decided to go with the SSIS Lookup and generate the derived column when there's a match. This ID also has a time and amount limit. Once all the records have IDs, I'm supposed to calculate a running total based on the ID to enforce the limits, which is the easy part.
The only problem is that this lookup table changes almost daily and any or all of the 6 columns can have NULLs. Even the completely null rows have an ID. NULL means the restriction is open, e.g. if the Country column on one record in the lookup table is null, then the ID of that record can be assigned to records with any country on the source. If one row in the lookup has all null columns, then it is completely open and all records on the source qualify for that ID. The source table doesn't have NULLs.
Please assist if possible
Thanks
If NULL in the lookup means "any value, ignore this column", add this to your WHERE clause.
Use a stored procedure, pass your values in, and return the ID:
select lookup.ID
from lookup
where @Country = isnull(lookup.Country, @Country) -- if lookup.Country is null, the parameter is compared with itself, creating a 1=1 scenario
and @Department = isnull(lookup.Department, @Department)
and ...
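The NULL-as-wildcard comparison can be checked with a small runnable sketch. It uses Python's sqlite3 as a stand-in for SQL Server, COALESCE in place of ISNULL, and only two of the six columns; the sample lookup rows and the helper matching_ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.executescript("""
CREATE TABLE lookup (id INTEGER, country TEXT, department TEXT);
INSERT INTO lookup VALUES (1, 'US', 'Sales'), (2, NULL, 'Sales'), (3, NULL, NULL);
""")

def matching_ids(country, department):
    """Return the lookup IDs whose non-NULL columns equal the given values."""
    sql = """
    SELECT id FROM lookup
    WHERE :country = COALESCE(country, :country)
      AND :department = COALESCE(department, :department)
    ORDER BY id
    """
    params = {"country": country, "department": department}
    return [r[0] for r in conn.execute(sql, params)]

print(matching_ids("US", "Sales"))  # [1, 2, 3]
print(matching_ids("FR", "Sales"))  # [2, 3]
print(matching_ids("FR", "HR"))     # [3]
```

As the output shows, one source row can match several lookup rows. The question does not say which ID should win in that case, so the sketch returns all of them.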

Identity column not incremented by 1?

I've created 4 tables:
`Patient` (Id, Name, ..)
`Donor` (Id, Name, ..)
`BloodBank` (Id, Name, ..)
BloodBankDonors(DonorId, BloodBankId, ..)
I set the Id columns to IDENTITY with seed 1 and increment 1, and created relationships between (Donor, BloodBank) and BloodBankDonors.
The problem is that when I entered some data into the BloodBank and Patient tables, the auto-generated Id values were 1,3,4 and 1,4,5,8 respectively. Why?
Many things can cause gaps in an IDENTITY column, for example rollbacks (which do not reset IDENTITY), deletes, etc.
So, why do you care about gaps? You shouldn't. If you need a contiguous sequence of numbers, stop using IDENTITY.
You might be deleting (with the DELETE command) some records from the BloodBank and Patient tables. Deleting records does not reset the identity counter of the Id column, so later inserts continue from the last value that was generated. To reset it, run the following after the DELETE command:
DBCC CHECKIDENT('databasename.dbo.tablename', RESEED, number)
If number = 0, the next insert gets Id 1.
If number = 101, the next insert gets Id 102.
For a more precise answer, please share the SQL script you are using to create the tables and insert the records.
Deleting data from a table does not reset the identity value of the Id column.
Try truncating the table, which resets the identity to its seed, and then re-enter the data:
truncate table Patient
Hope this helps.
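The effect of deletes on an auto-generated key can be reproduced in a few lines. This is a rough sketch using Python's sqlite3 with AUTOINCREMENT as a stand-in for a SQL Server IDENTITY column (the two behave alike for deletes, though not in every other respect); the donor names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.execute(
    "CREATE TABLE donor (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
conn.executemany("INSERT INTO donor (name) VALUES (?)", [("A",), ("B",), ("C",)])

# Deleting rows does not hand their ids back to the counter.
conn.execute("DELETE FROM donor WHERE id IN (2, 3)")
conn.execute("INSERT INTO donor (name) VALUES ('D')")

ids = [r[0] for r in conn.execute("SELECT id FROM donor ORDER BY id")]
print(ids)  # [1, 4]

# If a gap-free number is needed for display, derive it at query time instead.
names = [r[0] for r in conn.execute("SELECT name FROM donor ORDER BY id")]
numbered = list(enumerate(names, start=1))
print(numbered)  # [(1, 'A'), (2, 'D')]
```

Numbering rows when they are read keeps the stored key stable while still giving a contiguous sequence where one is wanted.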

SQL Server time stamp column insertion or updation possible explicitly?

Is there any way to provide an explicit value for time stamp column in a table in SQL server? I am aware it is not datetime column but I want to know whether there is any way to insert or update it explicitly.
You cannot explicitly insert into or update a timestamp column. Its values are generated automatically when you insert or update a row in the table.
Because timestamp values are created by the database when you insert or update a row, defining them explicitly would mean overwriting the value the database itself created.
From your second comment I appreciate that you might have data coming in which is already timestamped and you just want those represented on your table in the same way as inserting data with "set identity_insert on" .
The answer would be to select the existing table into another table then add the incoming data. If you run the code below I think you'll see what I mean.
create table abc
(
col1 int, timestamp
)
go
insert into abc(col1) values (1)
go
select col1,convert(varbinary,timestamp) timestamp# into def from abc
go
select * from abc
select * from def
As far as I know the timestamp represents a row version number (which is why they change when you update a value in the row because you are creating another version of the row). There might be a date in the transaction log which states when this version of the row came into being. I don't consider it possible to directly convert timestamp to datetime.
Well..the only other idea I have is to add another column and then select the timestamp values into that! The weirdest thing, in doing this it takes the last character back one! See what you think.
drop table abc
go
create table abc
(
col1 int, timestamp
)
go
insert into abc(col1) values (1)
go
alter table abc add timestamp# varbinary(18)
go
update abc set timestamp# = convert(varbinary,timestamp)
Generally speaking, when creating a table I would include a datetime column that defaults to getdate(); this way you have a datetime for when each row was created.
Like this:
drop table def
go
create table def
(
col1 int,
idt datetime default getdate()
)
If you insert a value into col1 and do not include idt in the column list of the insert statement, idt defaults to the datetime at which you inserted the row.
Like this:
insert into def (col1) values (1)

Finding out the date a row has been inserted into a table

Is there a way to find out the date a row was inserted into a SQL Server (2005, 2008 or 2008 R2) database table, without setting up auditing (either OOTB or a custom 3rd-party product)?
Thanks
You can always create a trigger on that table. Like so:
create trigger InsertNotification
on YourTable
after insert
as
-- do whatever you want when an insert happens
go
This can definitely be seen as a form of "auditing", but I'm not familiar with "ootb", nor is this a 3rd party product. Triggers are the way to go.
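To make the trigger idea concrete, here is a minimal runnable sketch in which the trigger body records the insert time in a separate log table. It uses Python's sqlite3 as a stand-in for SQL Server; the tables your_table and insert_log are hypothetical. In SQL Server the new rows are read from the inserted pseudo-table rather than from NEW, so the trigger body would differ there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.executescript("""
CREATE TABLE your_table (id INTEGER PRIMARY KEY, payload TEXT);
CREATE TABLE insert_log (row_id INTEGER, inserted_at TEXT);

-- Record the key and the current time for every inserted row.
CREATE TRIGGER insert_notification
AFTER INSERT ON your_table
BEGIN
    INSERT INTO insert_log (row_id, inserted_at)
    VALUES (NEW.id, CURRENT_TIMESTAMP);
END;
""")

conn.execute("INSERT INTO your_table (payload) VALUES ('hello')")

log = conn.execute("SELECT row_id, inserted_at FROM insert_log").fetchall()
print(log)  # [(1, '<UTC timestamp at insert time>')]
```

Keeping the timestamps in a separate table leaves the original table's schema unchanged, at the cost of a join whenever the insert time is needed.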
Well, if you want to be notified when a row is inserted, create an insert trigger on this table.
If you just want to record when a specific row was inserted, you could create a new datetime or smalldatetime column with getdate() as the default value.
Whenever a new row is inserted, this column will be automatically filled with the current date/time.
Advantages:
no trigger or 3rd party tool needed
Disadvantages:
this only works for new tables (or for new records in existing tables). If a table already has records, you won't have an insert date/time for them
if you want this for all your tables, you have to add the column to each table
