Snowflake: Concurrent queries with CREATE OR REPLACE

When running a CREATE OR REPLACE TABLE ... AS SELECT (CORTAS) statement in one session, can other sessions query the existing table before the transaction opened by CORTAS is committed?
From reading the usage notes section of the documentation, it appears this is the case. Ideally I'm looking for someone who's validated this in practice and at scale, with a large number of read operations on the target table.
Using OR REPLACE is the equivalent of using DROP TABLE on the existing table and then creating a new table with the same name; however, the dropped table is not permanently removed from the system. Instead, it is retained in Time Travel. This is important to note because dropped tables in Time Travel can be recovered, but they also contribute to data storage for your account. For more information, see Storage Costs for Time Travel and Fail-safe.
In addition, note that the drop and create actions occur in a single atomic operation. This means that any queries concurrent with the CREATE OR REPLACE TABLE operation use either the old or new table version.
Recreating or swapping a table drops its change data. Any stream on the table becomes stale. In addition, any stream on a view that has this table as an underlying table, becomes stale. A stale stream is

I have not proven this with performance tests, but we ran for 5 years reading from tables on one set of warehouses while rebuilding the underlying tables on others, and never noticed any corruption of results.
I always thought of Snowflake as being like double buffering in computer graphics: you have the active buffer that the video signal is reading from (the existing table's state), and you write to the back buffer while a MERGE/INSERT/UPDATE/DELETE is running. When that write transaction completes, the active "current page/files/buffer" is flipped, and all reads going forward are from the "new" state.
Given that the files are immutable, the double-buffer analogy holds really well (this is also how Time Travel works). Thus there is just a "global state of what is current" maintained in the metadata.
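To make the double-buffer analogy concrete, here is a minimal Python sketch. It is only a toy model of the idea, not a description of Snowflake's internals; the `VersionedTable` class and its methods are hypothetical names invented for the illustration.

```python
import threading


class VersionedTable:
    """Toy model: readers see an immutable snapshot, writers build a new
    snapshot off to the side and publish it by swapping one reference."""

    def __init__(self, rows):
        self._current = tuple(rows)       # the immutable "active buffer"
        self._history = [self._current]   # older snapshots, loosely like Time Travel
        self._write_lock = threading.Lock()  # serializes writers only

    def read(self):
        # Readers never block: they take whatever snapshot is current right now.
        return self._current

    def replace(self, build_rows):
        with self._write_lock:
            new_snapshot = tuple(build_rows(self._current))  # the "back buffer"
            self._history.append(new_snapshot)
            self._current = new_snapshot  # the flip: one reference assignment

    def as_of(self, version):
        # Earlier snapshots stay readable because nothing was modified in place.
        return self._history[version]


table = VersionedTable([1, 2, 3])
before = table.read()                             # a query that began before the rebuild
table.replace(lambda old: [x * 10 for x in old])  # the rebuild
after = table.read()                              # a query that began after the rebuild
```

A reader that grabbed `before` keeps a complete, consistent old version, and a later reader sees the complete new one; there is no state in which a reader sees half of each.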
As for CORTAS and transactions: since it is a DDL operation, I would assume it commits any open transaction you had, like all DDL operations do. So perhaps that is a wrinkle in my double-buffer story.

Related

Are there best practices for deduplicating records when using auto-ingest Snowpipes?

Currently in Snowflake we have configured an auto-ingest Snowpipe connected to an external S3 stage as documented here. This works well and we're copying records from the pipe into a "landing" table. The end goal is to MERGE these records into a final table to deal with any duplicates, which also works well. My question is around how best to safely perform this MERGE without missing any records? At the moment, we are performing a single data extraction job per-day so there is normally a point where the Snowpipe queue is empty which we use as an indicator that it is safe to proceed, however we are looking to move to more frequent extractions where it will become harder and harder to guarantee there will be no new records ingested at any given point.
Things we've considered:
Temporarily pause the pipe, MERGE the records, TRUNCATE the landing table, then unpause the pipe. I believe this should technically work, but it is not clear to me that this is an advised way to work with Snowpipes. I'm not sure how resilient they are to being paused/unpaused, how long it tends to take to pause/unpause, etc. I am aware that paused pipes can become "stale" after 14 days; however, we're talking about pausing it for a few minutes, not multiple days.
Utilize transactions in some way. I have a general understanding of SQL transactions, but I'm having a hard time determining exactly if/how they could be used in this situation to guarantee no data loss. The general thought is if the MERGE and DELETE could be contained in a transaction it may provide a safe way to process the incoming data throughout the day but I'm not sure if that's true.
Add a third "processing" table and a task to swap the landing table with the processing table. The task to swap the tables could run on a schedule (e.g. every hour), and I believe the key is to have the conditional statement check both that there are records in the landing table AND that the processing table is empty. At this point the MERGE and TRUNCATE would work off the processing table and the landing table would continue to receive the incoming records.
Any additional insights into these options or completely different suggestions are very welcome.
Look into table streams, which record insertions/updates/deletions to your Snowpipe table. You can then merge off the stream into your target table, which resets the stream's offset. Use a task to run your merge statement. Also, given it is Snowpipe, it is probably best to create an append-only stream.
However, I had a question here about how, in some circumstances, we were missing some rows. Our task was set to 1-minute intervals, which may be partly the reason. I never did get to the bottom of it, even with Snowflake support.
What we did notice, though, was that using a stored procedure with a transaction, and also running a SELECT on the stream before the merge, seems to have solved the issue, i.e. no more missing rows.
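The underlying idea (merge and clear in one transaction, and only clear what was actually merged) can be sketched outside Snowflake. The example below uses Python's built-in `sqlite3` purely as a stand-in; it is not Snowflake syntax and does not use streams. The `landing` and `final` tables, the `seq` column and the `merge_landing` function are all hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE landing (seq INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER, val TEXT);
    CREATE TABLE final (id INTEGER PRIMARY KEY, val TEXT);
""")


def merge_landing(conn):
    """Merge everything seen so far into `final`, then delete exactly those rows."""
    with conn:  # one transaction: commits on success, rolls back on error
        # High-water mark, playing a role loosely similar to a stream offset.
        (mark,) = conn.execute("SELECT COALESCE(MAX(seq), 0) FROM landing").fetchone()
        # Deduplicate: keep the newest landing row per id, then upsert it.
        conn.execute(
            """
            INSERT OR REPLACE INTO final (id, val)
            SELECT id, val FROM landing AS l
            WHERE seq <= ?
              AND seq = (SELECT MAX(seq) FROM landing WHERE id = l.id AND seq <= ?)
            """,
            (mark, mark),
        )
        # Rows that arrive after `mark` are untouched and wait for the next run.
        conn.execute("DELETE FROM landing WHERE seq <= ?", (mark,))
    return mark


conn.executemany("INSERT INTO landing (id, val) VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (1, "a2")])   # id 1 arrives twice
merge_landing(conn)
conn.execute("INSERT INTO landing (id, val) VALUES (?, ?)", (2, "b2"))  # late arrival
merge_landing(conn)

final_rows = conn.execute("SELECT id, val FROM final ORDER BY id").fetchall()
left_in_landing = conn.execute("SELECT COUNT(*) FROM landing").fetchone()[0]
```

Deleting by the captured mark rather than truncating is the design choice that matters here: anything ingested while the merge is running is simply picked up by the next run.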

Design pattern for undoing after I have committed the changes

We can undo an action using Command or Memento pattern.
If we are using kafka then we can replay the stream in reverse order to go back to the previous state.
For example, Google Docs/Sheets also has version history, and pcpartpicker offers something similar.
To be safe, I want to commit everything but be able to go back to the previous state if needed.
I know we can disable auto-commit and use Transaction Control Language (COMMIT, ROLLBACK, SAVEPOINT). But I am talking about undoing even after I have committed the change.
How can I do that?
There isn't a real generic answer to this question. It all depends on the structure of your database, the span of the transactions across entities, distributed transactions, how much time or how many transactions are allowed to pass before you can revert the change, etc.
Memento like pattern
Memento Pattern is one of the possible approaches, however it needs to be modified due to the nature of the relational databases as follows:
You need a transaction log table/list that holds information about the entities and attributes (tables and columns) that were affected by the transaction: their primary key, the old and new values (before and after the transaction), and a timestamp. This is the same as in the command (memento) pattern.
Next, you need a mechanism to identify the non-explicit updates that were triggered by stored procedures in the database as a consequence of the transaction. This is important, since a change in one table can trigger changes in other tables which were not explicitly captured by the command.
The rollback mechanism needs to determine whether the transaction is eligible for rollback by building a list of subsequent transactions on the same entities, and deciding whether this transaction can be rolled back directly or whether some subsequent transactions would need to be rolled back first.
In case of a roll-back is allowed after longer period of time, or a near-realtime consumption of the data, there should also be a list of transaction observers, processes that need to be informed that the transaction is no longer valid since they already read the new data and took a decision based on it. Example would be a process generating a cumulative report. When transaction is rolled-back, the rollback will invalidate the report, so the report needs to be generated again.
For a short term roll-back, mainly used for distributed transactions, you can check the Microservices Saga Pattern, and use it as a starting point to build your solution.
History tables
Another approach is to keep incremental updates, also known as history tables, where each update of a row is an insert into the history table with a new version. As in the previous case, you need to decide how far back in the history you can go when you roll back a committed transaction.
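A minimal sketch of the history-table idea, using Python's built-in `sqlite3` as a stand-in database. The `item_history` table and the `save`, `current` and `revert_to` helpers are hypothetical names invented for this example; a real schema would carry more columns (who, when, why).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_history ("
    " item_id INTEGER, version INTEGER, price REAL,"
    " PRIMARY KEY (item_id, version))"
)


def save(conn, item_id, price):
    """Every change is an INSERT of a new version; nothing is overwritten."""
    with conn:  # commit on success, roll back on error
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM item_history WHERE item_id = ?",
            (item_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO item_history VALUES (?, ?, ?)", (item_id, latest + 1, price)
        )
    return latest + 1


def current(conn, item_id):
    row = conn.execute(
        "SELECT price FROM item_history WHERE item_id = ? ORDER BY version DESC LIMIT 1",
        (item_id,),
    ).fetchone()
    return row[0] if row else None


def revert_to(conn, item_id, version):
    """Undo a committed change by re-saving an old version as the newest one."""
    (price,) = conn.execute(
        "SELECT price FROM item_history WHERE item_id = ? AND version = ?",
        (item_id, version),
    ).fetchone()
    return save(conn, item_id, price)


save(conn, 1, 9.99)     # version 1
save(conn, 1, 12.50)    # version 2, already committed
revert_to(conn, 1, 1)   # version 3 carries the version-1 price again
```

Note that the revert is itself a new version, so the "undo" stays visible in the history and can be undone in turn.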
Regulation issues
Finally, when you work with business data such as invoices, inventory, etc., you also need to check which regulations apply to the cancellation of committed transactions. For example, in accounting systems it's not allowed to delete data; instead, a new row with the compensation is added (e.g. removing a product from a shipment list does not delete the product, but adds a row with -quantity to cancel the effect of the original row while keeping an audit trail of the change).
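The compensating-row idea can be sketched in a few lines of Python. The `ShipmentLine` type and the `cancel` and `net_quantity` helpers are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShipmentLine:
    product: str
    quantity: int
    note: str


def net_quantity(lines, product):
    # The effective quantity is the sum of all rows, including compensations.
    return sum(line.quantity for line in lines if line.product == product)


def cancel(lines, product):
    """Append a compensating row instead of deleting the original one."""
    outstanding = net_quantity(lines, product)
    if outstanding:
        lines.append(ShipmentLine(product, -outstanding, "cancellation"))


shipment = [ShipmentLine("widget", 5, "order"), ShipmentLine("gadget", 2, "order")]
cancel(shipment, "widget")
```

After the cancellation the original widget row is still present, so the audit trail is intact, while the net quantity for that product is zero.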

Updating database keys where one table's keys refer to another's

I have two tables in DynamoDB. One has data about homes, one has data about businesses. The homes table has a list of the closest businesses to it, with walking times to each of them. That is, the homes table has a list of IDs which refer to items in the businesses table. Since businesses are constantly opening and closing, both these tables need to be updated frequently.
The problem I'm facing is that, when either one of the tables is updated, the other table will have incorrect data until it is updated itself. To make this clearer: let's say one business closes and another one opens. I could update the businesses table first to remove the old business and add the new one, but the homes table would then still refer to the now-removed business. Similarly, if I updated the homes table first to refer to the new business, the businesses table would not yet have this new business' data. Whichever table I update first, there will always be a period of time where the two tables are not in sync.
What's the best way to deal with this problem? One way I've considered is to do all the updates to a secondary database and then swap it with my primary database, but I'm wondering if there's a better way.
Thanks!
DynamoDB's basic write operations are atomic only at the item level, not across items (it has since added the TransactWriteItems and TransactGetItems APIs), but you can get something similar to an atomic transaction by enforcing some rules in your application.
Let's say you need to run a transaction with two operations:
Delete Business(id=123) from the table.
Update Home(id=456) to remove association with Business(id=123) from the home.businesses array.
Here's what you can do to mimic a transaction:
Generate a timestamp for locking the items
Let's say our current timestamp is 1234567890. Using a timestamp will allow you to clean up failed transactions (I'll explain later).
Lock the two items
Update both Business-123 and Home-456 and set an attribute lock=1234567890.
Do not change any other attributes yet on this update operation!
Use a ConditionExpression (check the Developer Guide and API) to verify that attribute_not_exists(lock) before updating. This way you're sure no other process is using the same items.
Handle update lock responses
Check if both updates succeeded to Home and Business. If yes to both, it means you can proceed with the actual changes you need to make: delete the Business-123 and update the Home-456 removing the Business association.
For extra care, also use a ConditionExpression in both updates again, but now ensuring that lock == 1234567890. This way you're extra sure no other process overwrote your lock.
If both updates succeed again, you can consider the two items updated and consistent to be read by other processes. To do this, run a third update removing the lock attribute from both items.
When one of the operations fails, you may retry it up to X times, for example. If it fails all X times, make sure the process cleans up the other lock that succeeded previously.
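The lock, change, unlock sequence above can be sketched with an in-memory stand-in for the table. This is not boto3 code: `FakeTable`, `ConditionFailed`, `unlocked`, `locked_by` and `remove_business` are hypothetical names, and the `condition` callables only imitate what a ConditionExpression would check on the server side. Retries are left out for brevity.

```python
class ConditionFailed(Exception):
    """Stands in for a failed conditional write."""


class FakeTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self.items = {}

    def update(self, key, changes, condition):
        item = self.items.get(key)
        if item is None or not condition(item):
            raise ConditionFailed(key)
        for name, value in changes.items():
            if value is None:
                item.pop(name, None)   # None means "remove this attribute"
            else:
                item[name] = value

    def delete(self, key, condition):
        item = self.items.get(key)
        if item is None or not condition(item):
            raise ConditionFailed(key)
        del self.items[key]


def unlocked(item):
    return "lock" not in item          # imitates attribute_not_exists(lock)


def locked_by(stamp):
    return lambda item: item.get("lock") == stamp


def remove_business(table, business_key, home_key, stamp):
    acquired = []
    try:
        # Step 1: lock both items, changing nothing else.
        for key in (business_key, home_key):
            table.update(key, {"lock": stamp}, unlocked)
            acquired.append(key)
    except ConditionFailed:
        # Release whatever we did manage to lock, then report failure.
        for key in acquired:
            table.update(key, {"lock": None}, locked_by(stamp))
        return False
    # Step 2: the real changes, each still guarded by our own lock value.
    remaining = [b for b in table.items[home_key]["businesses"] if b != business_key]
    table.update(home_key, {"businesses": remaining}, locked_by(stamp))
    table.delete(business_key, locked_by(stamp))
    # Step 3: unlock the surviving item so other processes can use it.
    table.update(home_key, {"lock": None}, locked_by(stamp))
    return True


table = FakeTable()
table.items["Business-123"] = {"name": "Cafe"}
table.items["Business-789"] = {"name": "Bakery", "lock": 1111}   # held by someone else
table.items["Home-456"] = {"businesses": ["Business-123", "Business-789"]}

ok = remove_business(table, "Business-123", "Home-456", 1234567890)
blocked = remove_business(table, "Business-789", "Home-456", 1234567899)
```

The first call succeeds and leaves Home-456 unlocked; the second gives up without changing anything because Business-789 already carries another process's lock.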
Enforce the transaction lock throughout your code
Always use a ConditionExpression in any part of your code that may update/delete Home and Business items. This is crucial for the solution to work.
When reading Home and Business items, you'll need to do this (this may not be necessary in all reads, you'll decide if you need to ensure consistency from start to finish while working with an item read from DB):
Retrieve the item you want to read
Generate a lock timestamp
Update the item with lock=timestamp using a ConditionExpression
If the update succeeds, continue using the item normally; if not, wait one or two seconds and try again
When you're done, update the item removing the lock
Regularly clean up failed transactions
Every minute or so, run a background process to look for potentially failed transactions. If your processes take at most 60 seconds to finish and there's an item with a lock value older than, say, 5 minutes (remember, the lock value is the time the transaction started), it's safe to say that this transaction failed at some point and whatever process was running it didn't properly clean up the locks.
This background job ensures that no items stay locked forever.
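A sketch of that cleanup pass, written against a plain dictionary rather than a real table. The function names and the 300-second threshold are assumptions for the example, and it only releases locks; deciding what to do about half-applied changes is left out.

```python
def find_stale_locks(items, now, max_age_seconds=300):
    """Return the keys whose lock timestamp is older than max_age_seconds."""
    return [
        key
        for key, item in items.items()
        if "lock" in item and now - item["lock"] > max_age_seconds
    ]


def release_stale_locks(items, now, max_age_seconds=300):
    released = find_stale_locks(items, now, max_age_seconds)
    for key in released:
        # Against a real table this removal should itself be a conditional
        # update on the lock value that was read, to avoid racing a live process.
        del items[key]["lock"]
    return released


items = {
    "Home-456": {"lock": 1234567890},       # started 360 seconds ago: stale
    "Business-123": {"lock": 1234568200},   # started 50 seconds ago: still live
    "Home-1": {},                           # not locked at all
}
released = release_stale_locks(items, now=1234568250)
```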
Beware: this implementation does not guarantee a truly atomic and consistent transaction in the sense that traditional ACID DBs do. If this is mission critical for you (e.g. you're dealing with financial transactions), do not attempt to implement this. Since you said you're ok if atomicity is broken on rare failure occasions, you may live with it happily. ;)
Hope this helps!

Possible bottlenecks when inserting and updating BYTEA rows?

The project requires storing binary data in a PostgreSQL database (a project requirement). For that purpose we made a table with the following columns:
id : integer, primary key, generated by client
data : bytea, for storing client binary data
The client is a C++ program, running on Linux.
The rows must be inserted (initialized with a chunk of binary data), and after that updated (concatenating additional binary data to data field).
Simple tests have shown that this yields better performance.
Depending on your inputs, we will make client use concurrent threads to insert / update data (with different DB connections), or a single thread with only one DB connection.
We haven't much experience with PostgreSQL, so could you help us with some pointers concerning possible bottlenecks, and whether using multiple threads to insert data is better than using a single thread.
Thank you :)
Edit 1:
More detailed information:
there will be only one client accessing the database, using only one Linux process
database and client are on the same high performance server, but this must not matter, client must be fast no matter the machine, without additional client configuration
we will get new stream of data every 10 seconds, stream will provide new 16000 bytes per 0.5 seconds (CBR, but we can use buffering and only do inserts every 4 seconds max)
stream will last anywhere between 10 seconds and 5 minutes
It makes extremely little sense that you should get better performance inserting a row then appending to it if you are using bytea.
PostgreSQL's MVCC design means that an UPDATE is logically equivalent to a DELETE and an INSERT. When you insert the row and then update it, the original tuple you inserted is marked as deleted and a new tuple is written that contains the concatenation of the old and added data.
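A rough back-of-the-envelope calculation shows why this matters. It assumes, as a simplification, that every append rewrites the full accumulated value, and it ignores compression, tuple headers and WAL; the block size and the 5-minute stream length come from the question, the rest is illustrative.

```python
BLOCK_BYTES = 16_000   # one block from the stream, per the question


def bytes_written_by_appending(blocks):
    # The k-th write stores a complete new copy holding k blocks.
    return sum(k * BLOCK_BYTES for k in range(1, blocks + 1))


def bytes_written_by_inserting(blocks):
    # One row per block: each block is written exactly once.
    return blocks * BLOCK_BYTES


blocks = 600  # a 5-minute stream at 2 blocks per second
append_total = bytes_written_by_appending(blocks)
insert_total = bytes_written_by_inserting(blocks)
ratio = append_total / insert_total
```

Under these assumptions the append-by-UPDATE approach writes about 2.9 GB of row data for a stream that only contains 9.6 MB, roughly 300 times more, and leaves 599 dead row versions behind for VACUUM.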
I question your testing methodology - can you explain in more detail how you determined that insert-then-append was faster? It makes no sense.
Beyond that, I think this question is too broad as written to really say much of use. You've given no details or numbers; no estimates of binary data size, rowcount estimates, client count estimates, etc.
bytea insert performance is no different to any other insert performance tuning in PostgreSQL. All the same advice applies: Batch work into transactions, use multiple concurrent sessions (but not too many; rule of thumb is number_of_cpus + number_of_hard_drives) to insert data, avoid having transactions use each others' data so you don't need UPDATE locks, use async commit and/or a commit_delay if you don't have a disk subsystem with a safe write-back cache like a battery-backed RAID controller, etc.
Given the updated stats you provided in the main comments thread, the amount of data you want to consume sounds entirely practical with appropriate hardware and application design. Your peak load might be achievable even on a plain hard drive if you had to commit every block that came in, since it'd require about 60 transactions per second. You could use a commit_delay to achieve group commit and significantly lower fsync() overhead, or even use synchronous_commit = off if you can afford to lose a time window of transactions in case of a crash.
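The figure of about 60 transactions per second can be reproduced from the numbers in the question. That this is how the figure was derived is an assumption; the constants below are taken from the question's edit, and the variable names are only for illustration.

```python
STREAM_START_INTERVAL_S = 10    # a new stream starts every 10 seconds
MAX_STREAM_DURATION_S = 300     # a stream lasts at most 5 minutes
BLOCK_BYTES = 16_000            # bytes delivered per block
BLOCK_INTERVAL_S = 0.5          # one block every half second
MAX_BUFFER_S = 4                # the client may buffer up to 4 seconds

# Worst case: every stream runs the full 5 minutes, so 30 overlap.
max_concurrent_streams = MAX_STREAM_DURATION_S // STREAM_START_INTERVAL_S

# Committing every block as it arrives:
commits_per_second = max_concurrent_streams / BLOCK_INTERVAL_S
bytes_per_second = commits_per_second * BLOCK_BYTES

# Committing once per 4-second buffer instead:
buffered_commits_per_second = max_concurrent_streams / MAX_BUFFER_S
```

So the peak is 30 concurrent streams, 60 commits per second and a little under 1 MB/s of payload; buffering for the full 4 seconds would cut the commit rate to 7.5 per second at the same data rate.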
With a write-back caching storage device like a battery-backed cache RAID controller or an SSD with reliable power-loss-safe cache, this load should be easy to cope with.
I haven't benchmarked different scenarios for this, so I can only speak in general terms. If designing this myself, I'd be concerned about checkpoint stalls with PostgreSQL, and would want to make sure I could buffer a bit of data. It sounds like you can so you should be OK.
Here's the first approach I'd test, benchmark and load-test, as it's in my view probably the most practical:
One connection per data stream, synchronous_commit = off + a commit_delay.
INSERT each 16kb record as it comes in into a staging table (if possible UNLOGGED or TEMPORARY if you can afford to lose incomplete records) and let Pg synchronize and group up commits. When each stream ends, read the byte arrays, concatenate them, and write the record to the final table.
For absolutely best speed with this approach, implement a bytea_agg aggregate function for bytea as an extension module (and submit it to PostgreSQL for inclusion in future versions). In reality it's likely you can get away with doing the bytea concatenation in your application by reading the data out, or with the rather inefficient and nonlinearly scaling:
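-- Naive aggregate: each step copies the accumulated value again, hence the nonlinear cost.
CREATE AGGREGATE bytea_agg(bytea) (SFUNC = byteacat, STYPE = bytea);
-- GROUP BY is required here; add an ORDER BY inside the aggregate call if block order matters.
INSERT INTO final_table SELECT stream_id, bytea_agg(data_block) FROM temp_stream_table GROUP BY stream_id;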
You would want to be sure to tune your checkpointing behaviour, and if you were using an ordinary or UNLOGGED table rather than a TEMPORARY table to accumulate those 16kb records, you'd need to make sure it was being quite aggressively VACUUMed.
See also:
Whats the fastest way to do a bulk insert into Postgres?
How to speed up insertion performance in PostgreSQL

Do triggers decrease performance? Inserted and deleted tables?

Suppose I have stored procedures that perform insert/update/delete operations on a table.
Depending on some criteria, I want to perform some additional operations.
Should I create a trigger or do the operation in the stored procedure itself?
Does using triggers decrease performance?
Do the two tables, Inserted and Deleted, exist persistently or are they created dynamically?
If they are created dynamically, does that cause performance issues?
If they are persistent tables, where are they?
Also, if they exist, can I access the Inserted and Deleted tables in stored procedures?
Will it be less performant than doing the same thing in a stored proc? Probably not, but as with all performance questions, the only way to really know is to test both approaches with a realistic data set (if you have a 2,000,000 record table, don't test with a table with 100 records!).
That said, the choice between a trigger and another method depends entirely on the need for the action in question to happen no matter how the data is updated, deleted, or inserted. If this is a business rule that must always happen no matter what, a trigger is the best place for it or you will eventually have data integrity problems. Data in databases is frequently changed from sources other than the GUI.
When writing a trigger though there are several things you should be aware of. First, the trigger fires once for each batch, so whether you inserted one record or 100,000 records the trigger only fires once. You cannot assume ever that only one record will be affected. Nor can you assume that it will always only be a small record set. This is why it is critical to write all triggers as if you are going to insert, update or delete a million rows. That means set-based logic and no cursors or while loops if at all possible. Do not take a stored proc written to handle one record and call it in a cursor in a trigger.
Also, do not send emails from a trigger: you do not want to stop all inserts, updates, or deletes if the email server is down.
Yes, a table with a trigger will not perform as well as it would without it. Logic dictates that doing something is more expensive than doing nothing.
I think your question would be more meaningful if you asked in terms of whether it is more performant than some other approach that you haven't specified.
Ultimately, I'd select the tool that is most appropriate for the job and only worry about performance if there is a problem, not before you have even implemented a solution.
The Inserted and Deleted tables are available only within the trigger, so accessing them from stored procedures is a no-go.
It decreases performance on the query by definition: the query is then doing something it otherwise wasn't going to do.
The other way to look at it is this: if you were going to manually be doing whatever the trigger is doing anyway then they increase performance by saving a round trip.
Take it a step further: that advantage disappears if you use a stored procedure and you're running within one server roundtrip anyway.
So it depends on how you look at it.
Performance on what? The trigger will perform an update on the DB after the event, so the user of your system won't even know it's going on. It happens in the background.
Your question is phrased in a manner quite difficult to understand.
If your operation is important and must never be missed, then you have two choices:
Execute your operation immediately after Update/Delete with durability
Delay the operation by making it loosely coupled with durability.
We faced the same issue. Our production MSSQL 2016 DB is > 1 TB with > 500 tables, and we need to send changes (insert, update, delete) to a few columns in 20 important tables to a 3rd party. More than 200 business processes update those columns, and modifying them all would be tedious because it's a legacy application. Our existing processes must keep working without any dependency on the data sharing. The order of the shared data matters: FIFO must be maintained.
E.g. a user's mobile number is 123-456-789, changes to 123-456-123, and then changes again to 123-456-456.
The order of sending is 123-456-789 --> 123-456-123 --> 123-456-456. A subsequent request can only be sent if the response to the previous request was successful.
We created 20 new tables containing only the columns we want. We compare each main table with its new table (MainTable1 JOIN MainTale_LessCol1) using a checksum of all columns and a TimeStamp column to identify changes.
Changes are logged in APIrequest tables and written back to MainTale_LessCol1. This logic runs in a scheduled job every 15 min.
A separate process picks requests from APIrequest and sends the data to the 3rd party.
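The compare-by-checksum step can be sketched in Python with dictionaries standing in for the main table, the reduced-column copy and the request queue. All names here (`row_checksum`, `detect_changes`, `api_requests`) are hypothetical, and the TimeStamp comparison mentioned above is left out to keep the sketch short.

```python
import hashlib
from collections import deque


def row_checksum(row, columns):
    # Hash only the tracked columns, joined with a separator byte.
    joined = "\x1f".join(str(row[c]) for c in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def detect_changes(main_rows, shadow_rows, columns, queue):
    """Compare rows by primary key, enqueue changes in order, refresh the copy."""
    for key, row in main_rows.items():
        tracked = {c: row[c] for c in columns}
        old = shadow_rows.get(key)
        if old is None:
            queue.append(("insert", key, tracked))
        elif row_checksum(old, columns) != row_checksum(tracked, columns):
            queue.append(("update", key, tracked))
        else:
            continue                      # nothing tracked has changed
        shadow_rows[key] = tracked        # write the new state back to the copy
    for key in list(shadow_rows.keys() - main_rows.keys()):
        queue.append(("delete", key, None))
        del shadow_rows[key]


main = {1: {"mobile": "123-456-789", "name": "A"}}
shadow = {}
api_requests = deque()   # FIFO: the sender pops from the left, one at a time

detect_changes(main, shadow, ["mobile"], api_requests)
main[1]["mobile"] = "123-456-123"
detect_changes(main, shadow, ["mobile"], api_requests)
main[1]["name"] = "B"    # untracked column: produces no request
detect_changes(main, shadow, ["mobile"], api_requests)
main[1]["mobile"] = "123-456-456"
detect_changes(main, shadow, ["mobile"], api_requests)

sent_in_order = [payload["mobile"] for _, _, payload in api_requests]
```

One limitation of any polling design like this: if a value changes twice between two runs of the job, only the latest value is seen, so intermediate states are not sent.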
We Explored
Triggers
CDC (Change Data Capture)
200+ Process Changes
Since our deadlines were strict, cumulative changes on those 20 tables were > 1000/sec, and our system was already at peak capacity, we went with our current design, which works.
You can try CDC and share your experience.
