Two Phase Commit vs Three Phase Commit - distributed-transactions

I am currently studying two-phase commit (2PC) and three-phase commit (3PC).
The 3PC protocol tries to eliminate the 2PC protocol's blocking problem by adding an extra phase, preCommit, as mentioned here.
According to this post, if the coordinator crashes at any point, a recovery node can take over the transaction and query the state of the remaining replicas. For example, if any remaining replica replies that it is in the pre-commit state, the recovery node knows that the failed coordinator had already sent the pre-commit message, and therefore that all replicas had agreed to commit.
My question is: why can't two-phase commit do the same thing? When the coordinator fails, why can't the recovery node query the remaining nodes and check whether any of them has already entered the commit phase?
I have read several posts, but I still don't understand what exact problem three-phase commit is trying to solve and how it solves it.
Please help!
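To make that recovery rule concrete, here is a small sketch (my own illustration, not taken from the post: the state names and the decide function are invented) of the decision a 3PC recovery node can make from the states of the surviving replicas, with a comment on why the same query is not enough in 2PC. It assumes crash failures without network partitions, which is also the assumption behind 3PC's non-blocking guarantee.

```java
import java.util.Collection;

/** Replica states as a recovery node might see them (illustrative only). */
enum State { INIT, PREPARED, PRE_COMMIT, COMMITTED, ABORTED }

public class ThreePhaseCommitRecovery {

    /** Simplified 3PC termination rule applied after the coordinator crashes. */
    static String decide(Collection<State> survivingReplicaStates) {
        if (survivingReplicaStates.contains(State.ABORTED)) {
            return "ABORT";   // someone already aborted, so nobody committed
        }
        if (survivingReplicaStates.contains(State.COMMITTED)
                || survivingReplicaStates.contains(State.PRE_COMMIT)) {
            // A replica only reaches PRE_COMMIT after the coordinator has heard
            // "yes" from EVERY replica, so committing is safe. (The full protocol
            // first brings the remaining replicas to PRE_COMMIT, then commits.)
            return "COMMIT";
        }
        // No surviving replica reached PRE_COMMIT, which means the coordinator
        // never finished collecting pre-commit acks, so no replica anywhere can
        // have committed yet: aborting is safe.
        return "ABORT";
    }

    /* Why 2PC cannot do the same: there is no PRE_COMMIT state. If every
     * surviving replica is merely PREPARED, the coordinator may have decided
     * either way, and a crashed replica may already have acted on that decision.
     * The survivors can neither commit nor abort safely, so they must block
     * until the crashed node comes back. That blocking window is what the
     * extra phase in 3PC removes. */
}
```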

Related

2 phase commit data consistency when one node crashes or is slow during commit phase

In two-phase commit, let's say one of the nodes crashes or is slow during the commit phase while the other nodes have already successfully committed their changes. Now, if a query is issued to one of those successful nodes, what data consistency will the client see?
Does the client see the latest committed changes on that successful node, or does the transaction coordinator pass the previous successful TransactionID along with the query to the node so that the client only sees the data/state prior to the current, still-ongoing transaction?

How does flink deal with the opened transaction when failure happens

I am using Flink 1.12.0, and I have a question about Flink's 2PC mechanism for its end-to-end consistency guarantee.
At the start of a checkpoint, a transaction is opened, and the transaction is committed after the checkpoint completes successfully.
What happens if a failure occurs? I assume the open transaction should be rolled back, but when is it rolled back? Thanks.
Because the operators and Task Managers are distributed across a cluster, Flink has to ensure that all components agree before it can claim that a commit is successful. Flink uses the two-phase commit protocol, as you said, with a pre-commit phase. The pre-commit is the key to dealing with failures during the checkpoint, as the documentation says:
The pre-commit phase finishes when the checkpoint barrier passes through all of the operators and the triggered snapshot callbacks complete.
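For reference, here is a minimal sketch of a sink built on Flink's TwoPhaseCommitSinkFunction, the base class Flink 1.12 provides for this pattern. The Txn class and the empty method bodies are placeholders of mine, not Flink code; the comments mark where a transaction left open by an unfinished checkpoint is aborted, which is the rollback you are asking about.

```java
import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.typeutils.base.VoidSerializer;
import org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer;
import org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction;

/** Sketch of a transactional sink; the external-system calls are placeholders. */
public class SketchTwoPhaseSink
        extends TwoPhaseCommitSinkFunction<String, SketchTwoPhaseSink.Txn, Void> {

    /** Handle for one open transaction against the external system (hypothetical). */
    public static class Txn {
        public String id;
    }

    public SketchTwoPhaseSink() {
        super(new KryoSerializer<Txn>(Txn.class, new ExecutionConfig()),
              VoidSerializer.INSTANCE);
    }

    @Override
    protected Txn beginTransaction() {
        // Called when a new checkpoint interval starts: open a transaction
        // in the external system and remember its handle.
        Txn txn = new Txn();
        txn.id = java.util.UUID.randomUUID().toString();
        return txn;
    }

    @Override
    protected void invoke(Txn txn, String value, Context context) {
        // Write the record into the currently open transaction.
    }

    @Override
    protected void preCommit(Txn txn) {
        // Called when the checkpoint barrier reaches the sink: flush everything
        // so that only a final "commit" is needed later.
    }

    @Override
    protected void commit(Txn txn) {
        // Called once the checkpoint that pre-committed this transaction has
        // completed on all operators (notifyCheckpointComplete).
    }

    @Override
    protected void abort(Txn txn) {
        // Called for a transaction that was still open (not yet pre-committed)
        // when the job failed and restarted: this is where the transaction from
        // the unfinished checkpoint is rolled back.
    }
}
```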

How to avoid executing database transactions on dirty data?

I am trying to find an architectural approach to handling the following concurrency issue:
Multiple users may simultaneously submit transactions against the same subset of a table in a relational database. Each transaction runs at the Serializable isolation level, ensuring the transactions are handled as if they occurred one after the other, but this does not solve my specific concurrency issue...
Transaction 1 originates from User 1, who has, through the application, made a batch of inserts, updates and deletes on a subset of a table.
User 2 may have started editing the data in the application any time prior to the commit of Transaction 1 and is therefore editing dirty data. Transaction 2 thus originates from User 2, who has (unknowingly) made a batch of inserts, updates and deletes on the same subset of the table as User 1, BUT these are likely to overwrite, and in some instances overlap with, the changes made in Transaction 1. Transaction 2 will not fail based on MVCC, but it does not make sense to perform it.
I need Transaction 2 to fail (ideally not even start) because the data in the database (after Transaction 1's commit) is no longer in the state it was in when User 2 received his/her initial data to work on.
There must be "standard" architectural patterns to achieve my objective - any pointers in the right direction will be much appreciated.
Transaction 2 will not fail based on MVCC
If the transactions are really run at the serializable isolation level, then Transaction 2 will fail, provided the transaction includes both the original read of the data (which has since changed) and the attempted changes based on it. True serializable isolation has been available since PostgreSQL 9.1.
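To illustrate that point, here is a sketch (assuming PostgreSQL and JDBC; the items table and its columns are invented) of running User 2's read and the dependent write inside one SERIALIZABLE transaction, so that a conflicting commit from Transaction 1 surfaces as a serialization failure instead of silently overwriting it. SQLSTATE 40001 is PostgreSQL's serialization_failure code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class SerializableUpdate {

    /** Returns false if the edit was rejected because the underlying data
     *  changed (serialization failure); the caller should re-read and retry
     *  or surface the conflict to the user. */
    static boolean applyEdit(String url, long rowId, String newValue) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url)) {
            conn.setAutoCommit(false);
            conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
            try {
                // The read the edit is based on MUST be part of this transaction.
                try (PreparedStatement read =
                         conn.prepareStatement("SELECT value FROM items WHERE id = ?")) {
                    read.setLong(1, rowId);
                    try (ResultSet rs = read.executeQuery()) {
                        rs.next();
                    }
                }
                try (PreparedStatement write =
                         conn.prepareStatement("UPDATE items SET value = ? WHERE id = ?")) {
                    write.setString(1, newValue);
                    write.setLong(2, rowId);
                    write.executeUpdate();
                }
                conn.commit();
                return true;
            } catch (SQLException e) {
                conn.rollback();
                if ("40001".equals(e.getSQLState())) {
                    // Transaction 1 committed a conflicting change after our read;
                    // do not overwrite it.
                    return false;
                }
                throw e;
            }
        }
    }
}
```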

How to clean a transaction log in integration flow of HCP PI?

I'm running an integration flow whose processing actions are on hold due to the following error:
com.sybase.jdbc4.jdbc.SybSQLWarning: The transaction log in database <database_name> is almost full. Your transaction is being suspended until space is made available in the log.
How can I erase the log or increase its size?
Thank you
From my understanding, this message relates to your Sybase SQL database, not to HCI, so you should clear the database log on the database side.
On the HCI side you cannot delete any logs or influence log sizes. I had a quite similar request a while ago and clarified with SAP Support that it is not possible to delete any log entries manually. Furthermore, I have since found that the log messages are deleted automatically after 6 months.
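If you do have direct SQL access to that Sybase ASE database (which, as noted above, you normally do not from HCI), the usual way to free the log is to dump it. A hedged sketch via the jConnect JDBC driver follows; the host, port, database name and credentials are placeholders, and truncate_only is only appropriate if you do not rely on transaction log dumps for recovery.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TruncateSybaseLog {
    public static void main(String[] args) throws Exception {
        Class.forName("com.sybase.jdbc4.jdbc.SybDriver");
        // Placeholder host/port/database/credentials -- adjust to your system.
        String url = "jdbc:sybase:Tds:dbhost:5000/my_database";
        try (Connection conn = DriverManager.getConnection(url, "sa", "secret");
             Statement stmt = conn.createStatement()) {
            // Frees the inactive part of the transaction log without backing it up.
            // If you DO take log dumps for recovery, use
            // "dump transaction my_database to '<dump device>'" instead.
            stmt.execute("dump transaction my_database with truncate_only");
        }
    }
}
```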

Cassandra Transaction with ZooKeeper - Does this work?

I am trying to implement a transaction system for Cassandra with the help of ZooKeeper. Since I don't have much experience in database implementation, I would like to know whether my idea would work in principle or whether it has any major flaw.
Here is the high level description of the steps:
1. Identify all the rows (keys) and columns to be edited. Let the keys be [K0..Kn].
2. Apply a write lock on all the rows involved (the locks are an in-memory ZooKeeper implementation); see the locking sketch after the answer below.
3. Copy the old values to separate locations in Cassandra that are uniquely identified by the keys [K'0..K'n].
4. Store [K'0..K'n] and their mapping to [K0..Kn] in ZooKeeper using persistent nodes.
5. Apply the update to the data.
6. Delete the entries in ZooKeeper.
7. Unlock the rows.
8. Delete the entries [K'0..K'n] lazily on a maintenance thread (Cassandra deletion uses timestamps, so K'0..K'n can be reused for another transaction with a newer timestamp).
Justification:
If the transaction fails on steps 1-4, no change has been applied; I can abort the transaction and delete whatever has been stored in ZooKeeper and backed up in Cassandra, if anything.
If the transaction fails on step 5, the information saved in step 3 is used to roll back any changes.
If the server fails, crashes, or is stolen by the cleaning man, then upon restart, before serving any requests, I check whether any keys from step 4 are persisted in ZooKeeper. If so, I use those keys to fetch the backed-up data stored in step 3 and put that data back where it was, thus rolling back any failed transactions.
One of my concerns is what would happen if some of the servers are partitioned from the cluster. I have no experience in this area: does my scheme work at all, and does it still work if a partition happens?
You should look into Cages: http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/
http://code.google.com/p/cages/
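To make step 2 of the question concrete, here is a sketch of the row-locking part using Apache Curator's InterProcessMutex recipe (Curator is my substitution; the Cages library linked above provides its own write-lock recipe for the same job). Taking the per-key locks in sorted order prevents two transactions with overlapping key sets from deadlocking.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

/** Sketch of step 2: write-locking all rows of a transaction via ZooKeeper. */
public class RowLocks implements AutoCloseable {

    private final CuratorFramework zk;
    private final List<InterProcessMutex> held = new ArrayList<>();

    public RowLocks(String zkConnectString) {
        zk = CuratorFrameworkFactory.newClient(
                zkConnectString, new ExponentialBackoffRetry(1000, 3));
        zk.start();
    }

    /** Lock every row key of the transaction, in sorted order so that two
     *  transactions touching overlapping key sets cannot deadlock. */
    public boolean lockAll(Collection<String> rowKeys, long timeoutMs) throws Exception {
        for (String key : new TreeSet<>(rowKeys)) {
            InterProcessMutex lock = new InterProcessMutex(zk, "/locks/" + key);
            if (!lock.acquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                unlockAll();               // could not get the full set: back off
                return false;
            }
            held.add(lock);
        }
        return true;
    }

    /** Step 7: release the row locks after the update (or on abort). */
    public void unlockAll() throws Exception {
        for (InterProcessMutex lock : held) {
            lock.release();
        }
        held.clear();
    }

    @Override
    public void close() {
        zk.close();
    }
}
```

One caveat relevant to the partition concern in the question: Curator's lock nodes are ephemeral, so if a lock holder is partitioned from ZooKeeper long enough for its session to expire, the lock is silently released while the holder may still believe it owns it. That is exactly the kind of scenario where the scheme needs extra care.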
