I have a question about three-phase commit, which I am trying to use in our application.
In our application, clients are load balanced across a frontend service that acts as a coordinator for transactions against our backend system. Since the frontend is going to write to multiple databases in the backend, I think three-phase commit would be a good way to handle it.
However, there is a potential issue:
Client A connects to frontend1. frontend1 starts a three-phase commit transaction with the backend.
At the same time, client A disconnects, reconnects to frontend2 and retries the transaction.
frontend1 and frontend2 are now racing against each other.
I could not find documentation that covers this kind of problem. Is there a common pattern that addresses it?
Thank you!
In my application, creating a new user record via the POST /users REST Web API endpoint involves the steps below:
Step 1: create an entry in Azure Active Directory (AAD).
Step 2: on success of step 1, send an email notification about the user creation in our application to the user's email address (implemented using an Azure Storage queue and SendGrid). At present, failure of step 2 won't roll back step 1.
Step 3: on success of step 2, create the user document in our Azure Cosmos DB. At present, failure of step 3 won't roll back step 2 or step 1.
All three steps have to be treated as one unit of work: they must all succeed or all fail together, like a database transaction with rollback. There should never be a situation where one step succeeds while the others fail, as that ends up creating dirty records.
What would be the most effective way to achieve this all-or-nothing behaviour for the three steps above, which involve three different Azure resources?
There is no mechanism I'm aware of for performing transactions across different Azure resources as you describe. True transactions only apply within the database services and perhaps a few others, not across disparate services.
Given this, your code would need to handle failure cases and cleanups appropriately.
Even Cosmos DB alone has only very limited transaction support. If your transactions span several distributed services, you will need to implement your own. Often, however, it's not worth the trouble; consider whether it would be sufficient to periodically scan for inconsistencies and clean them up then.
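As a rough sketch of that manual cleanup approach, something like the following could work. The helper methods here are made-up stand-ins for the real AAD, Storage queue/SendGrid and Cosmos DB calls, not actual Azure SDK APIs:

```java
// Compensation-style "all or nothing" flow for the three steps.
// The private helpers are hypothetical wrappers around the real SDK calls.
public class UserProvisioningService {

    public void createUser(String email) {
        String aadObjectId = null;
        boolean emailQueued = false;
        try {
            aadObjectId = createAadUser(email);           // step 1
            emailQueued = queueCreationEmail(email);      // step 2
            createCosmosUserDocument(email, aadObjectId); // step 3
        } catch (RuntimeException e) {
            // Compensate in reverse order for whatever already succeeded.
            if (emailQueued) {
                queueCorrectionEmail(email);              // can't unsend, so notify instead
            }
            if (aadObjectId != null) {
                deleteAadUser(aadObjectId);               // undo step 1
            }
            throw new IllegalStateException("User creation rolled back for " + email, e);
        }
    }

    // Hypothetical wrappers around the real Azure SDK calls.
    private String createAadUser(String email)                       { /* AAD/Graph call */ return "aad-object-id"; }
    private boolean queueCreationEmail(String email)                  { /* Storage queue + SendGrid */ return true; }
    private void createCosmosUserDocument(String email, String id)    { /* Cosmos DB upsert */ }
    private void queueCorrectionEmail(String email)                   { /* compensating notification */ }
    private void deleteAadUser(String objectId)                       { /* compensating delete */ }
}
```

The same structure also makes the periodic-scan option easier: any record that has an AAD entry but no Cosmos document is, by construction, a candidate for cleanup.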
If you do need transactions, you could implement a Resource Manager to use the .NET features from System.Transactions. See here for a starting point.
I'm working on a web application whose sections are separate projects, each acting as a module, and these modules contain WCF services that call each other in some cases.
The question is: if each module has its own connection to the database, does that affect performance, or should I find a way to share one connection per web request?
Imagine that in some scenarios four or five modules are involved; each request would then create five or more connections that are closed immediately after execution.
I think the best practice is to use a connection pool.
Think about what will happen if two modules participate in the same transaction.
If you use separate connections in one transaction, you will end up with a distributed transaction, which will of course hurt performance.
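For illustration, here is a minimal sketch of sharing one connection per request so that both modules commit in a single local transaction; the module, table and column names are invented:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

// Both "modules" receive the same Connection for the duration of the request,
// so their writes commit together in one local transaction instead of being
// promoted to a distributed one.
public class RequestUnitOfWork {

    private final DataSource dataSource; // pooled data source

    public RequestUnitOfWork(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void handleRequest(long orderId) throws SQLException {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false);
            try {
                new OrderModule().markShipped(con, orderId);
                new AuditModule().recordShipment(con, orderId);
                con.commit();          // single local transaction
            } catch (SQLException e) {
                con.rollback();
                throw e;
            }
        }                              // connection returns to the pool here
    }

    static class OrderModule {
        void markShipped(Connection con, long orderId) throws SQLException {
            try (PreparedStatement ps =
                     con.prepareStatement("UPDATE orders SET status = 'SHIPPED' WHERE id = ?")) {
                ps.setLong(1, orderId);
                ps.executeUpdate();
            }
        }
    }

    static class AuditModule {
        void recordShipment(Connection con, long orderId) throws SQLException {
            try (PreparedStatement ps =
                     con.prepareStatement("INSERT INTO audit_log(order_id, event) VALUES (?, 'SHIPPED')")) {
                ps.setLong(1, orderId);
                ps.executeUpdate();
            }
        }
    }
}
```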
Imagine a Java ecosystem where three separate Spring web applications are running in separate JVMs on separate machines (no application server involved, just plain servlet containers). Two of these applications use their own databases accessed via JPA. The third application (a coordinator) provides services to the outside world, and some of its service methods execute remote operations that require the other two apps to participate in a transactional manner: if one application fails to perform its data manipulation in the database, the other should be rolled back as well. The problem is: how can this be achieved using Spring?
Currently we are using REST to communicate between the applications. Clearly this cannot support transactions, even though there are efforts to make this happen.
I've found JTA, which is capable of organizing global transactions. JTA involves XAResource instances that participate in the globally managed transaction. If I understood correctly, these XAResource instances can reside in separate JVMs. Initialization, commit and rollback of resources happen via JMS communication, which means a message broker is required to transfer messages between participants. Various JTA implementations exist; I've found Atomikos, which seems to be the most widely used.
Now the thing I don't see is how this all fits together if I have a Spring application on each side. I haven't found any example projects doing JTA over a network. I also don't understand what the XAResources represent. If I use JPA and, say, I have an Account object in one application that stores a user's balance, and I have to decrease the balance from the coordinator, should I create an XAResource implementation that allows decreasing the balance? Or is XAResource implemented by something lower level, like the JDBC driver or Spring Data JPA? In the latter case, how can I provide high-level CRUD operations to the transaction coordinator?
XAResource is a lower-level API. You could write your own for the coordinator, but I don't think that's necessary. Instead, leverage JMS + JTA on the coordinator and JTA on the app servers.
In the normal case, you'd have this:
Coordinator receives request and starts JTA transaction
Coordinator calls app 1 over JMS
App 1 receives JMS message
App 1 calls DB 1 using JTA
Coordinator calls app 2 over JMS
App 2 receives JMS message
App 2 calls DB 2 using JTA
Coordinator commits tx
Note that JTA is used for all the transactions - this will be a global transaction shared across all the servers. If any of these steps fails, they will all be rolled back.
Spring should be able to make this transparent once you get it all set up. Just make sure your DAO & service calls are transactional. Atomikos will need to be configured so that each server uses the same JTA tx manager.
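As a rough illustration of the coordinator side (the queue names and payload format are made up, and Atomikos is assumed to be wired in behind Spring's JtaTransactionManager), the service method might look like this:

```java
import org.springframework.jms.core.JmsTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Coordinator-side sketch: one @Transactional method sends both JMS messages
// inside a single JTA transaction. If anything throws, the transaction rolls
// back and neither message is actually delivered to the apps.
@Service
public class TransferCoordinator {

    private final JmsTemplate jmsTemplate; // must be backed by an XA-capable ConnectionFactory

    public TransferCoordinator(JmsTemplate jmsTemplate) {
        this.jmsTemplate = jmsTemplate;
    }

    @Transactional // backed by Spring's JtaTransactionManager (e.g. Atomikos)
    public void decreaseBalances(String account1, String account2, long amount) {
        jmsTemplate.convertAndSend("app1.balance.requests", account1 + ":" + -amount);
        jmsTemplate.convertAndSend("app2.balance.requests", account2 + ":" + -amount);
        // commit happens when the method returns; on rollback both queues stay untouched
    }
}
```

On each app server, the JMS listener that consumes these messages and the JPA repository call it makes would likewise run under the local JTA transaction manager, so the receive and the database update commit or roll back together.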
REST does support transactions now, via the Atomikos TCC implementation available from www.atomikos.com - it is the actual implementation of the design in the talk you are referring to...
HTH
This answer is a summary of a more detailed post:
How would you tune Distributed ( XA ) transaction for performance?
This diagram depicts the communication flow between the transaction coordinator and the transaction participant.
In your particular case the transaction coordinator will be Atomikos, Bitronix or any other provider. Everything in the flow below end(XID) is completely invisible to the developer and is performed only by the transaction coordinator. The first points, start and end, are within the application scope.
Now, to your question: you cannot have a distributed transaction between applications. You can have a distributed transaction between pieces of infrastructure that support it. If you want transactions between application components separated by a network, you are better off using transaction compensation, which is a whole different topic.
What you can do with a distributed transaction is, from one application, service or component, enlist multiple databases or other XA-capable resources and then execute a transaction across them (see the sketch below).
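A minimal sketch of that single-application case, assuming both data sources are registered as XA data sources with the container or with a standalone transaction manager such as Atomikos or Bitronix (the JNDI names and SQL are made up):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.naming.InitialContext;
import javax.sql.DataSource;
import javax.transaction.UserTransaction;

// One component, two XA-capable databases, one global transaction.
public class TwoDatabaseTransfer {

    public void transfer(long amount) throws Exception {
        InitialContext ctx = new InitialContext();
        UserTransaction utx = (UserTransaction) ctx.lookup("java:comp/UserTransaction");
        DataSource ordersDb = (DataSource) ctx.lookup("jdbc/ordersXA");
        DataSource billingDb = (DataSource) ctx.lookup("jdbc/billingXA");

        utx.begin();
        try {
            try (Connection c1 = ordersDb.getConnection();
                 Connection c2 = billingDb.getConnection()) {
                // Both connections are enlisted as XA branches of the same global transaction.
                try (PreparedStatement ps = c1.prepareStatement(
                        "UPDATE orders SET status = 'PAID' WHERE amount = ?")) {
                    ps.setLong(1, amount);
                    ps.executeUpdate();
                }
                try (PreparedStatement ps = c2.prepareStatement(
                        "INSERT INTO invoices(amount) VALUES (?)")) {
                    ps.setLong(1, amount);
                    ps.executeUpdate();
                }
            }
            utx.commit();   // the transaction manager runs 2PC over both branches
        } catch (Exception e) {
            utx.rollback(); // both databases roll back together
            throw e;
        }
    }
}
```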
I see the post below stating that Atomikos has some infrastructure supporting XA for REST. In general, the classic algorithm for transaction compensation, such as the Try-Confirm/Cancel (TCC) pattern, is very close to a two-phase commit protocol. Without looking into the details, my guess is that they have implemented something along those lines.
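To make the parallel concrete, here is an illustrative TCC contract and coordinator loop (this is not the Atomikos API, just a sketch of the pattern); note how try/confirm/cancel line up with prepare/commit/rollback:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative TCC participant contract: "tryReserve" plays the role of 2PC
// prepare, "confirm" of commit, and "cancel" of rollback of the tentative work.
interface TccParticipant {
    String tryReserve(String businessKey); // tentative work, returns a reservation id
    void confirm(String reservationId);    // make permanent; must be idempotent
    void cancel(String reservationId);     // undo tentative work; must be idempotent
}

class TccCoordinator {
    void execute(String businessKey, List<TccParticipant> participants) {
        List<String> reservations = new ArrayList<>();
        try {
            for (TccParticipant p : participants) {
                reservations.add(p.tryReserve(businessKey));      // "prepare" phase
            }
        } catch (RuntimeException e) {
            for (int i = 0; i < reservations.size(); i++) {
                participants.get(i).cancel(reservations.get(i));  // compensate what was reserved
            }
            throw e;
        }
        // All reservations succeeded; a real implementation retries confirm until it sticks.
        for (int i = 0; i < participants.size(); i++) {
            participants.get(i).confirm(reservations.get(i));     // "commit" phase
        }
    }
}
```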
I have a simple web-based application scenario: send a request and get a response from the database. The response can contain a very large number of rows, say around 10,000 to 20,000 records at a time.
I have designed audit logging for all transactions, i.e. inserting a record into the database for each such response, again 10,000 to 20,000 rows at a time.
As inserting into that table is just for auditing purposes, is there a way to separate the auditing and logging from the normal response? Some way to differentiate them?
Any help on design would be highly appreciable.
Thanks in Advance.
In general, it's a bad idea for a web application to do too much work in a synchronous web request. Web servers (and web application servers) are designed to serve lots of concurrent requests, but on the assumption that each request will take just milliseconds to execute. Different servers have different threading strategies, but as soon as you have long-running requests, you're likely to encounter an overhead due to thread management, and you can then very quickly find your web server slowing down to the point of appearing broken.
Reading or writing 10s of thousands of rows in a single web request is almost certainly a bad idea. You probably want to design your application to use asynchronous worker queues. There are several solutions for this; in the Java ecosystem, you could check out vert.x
In these asynchronous models, auditing is straightforward - your auditor subscribes to the same message queue as the "write to database" listener.
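A rough sketch of that wiring with plain JMS (the topic name, payload and listeners are illustrative): the web request only publishes an event, and the database writer and the auditor are two independent subscribers on the same topic.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageListener;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

public class ResponseEventWiring {

    public void wire(ConnectionFactory factory) throws Exception {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("response.events");

        MessageListener dbWriter = message -> {
            // persist the response rows in the background, outside the web request
        };
        MessageListener auditor = message -> {
            // write the audit entry from the very same event
        };

        session.createConsumer(topic).setMessageListener(dbWriter);
        session.createConsumer(topic).setMessageListener(auditor);
        connection.start();
    }

    // Called from the web request: publish the event and return immediately.
    public void publish(Session session, Topic topic, String payload) throws Exception {
        TextMessage msg = session.createTextMessage(payload);
        session.createProducer(topic).send(msg);
    }
}
```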
Check out log4j2 for separating auditing and logging.
This is easily done by having two appenders in the log4j2.xml itself.
For reference visit:
https://logging.apache.org/log4j/2.x/manual/appenders.html
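For example, the application code can write audit entries through a dedicated named logger and let log4j2.xml route that logger to its own appender; the logger name "AUDIT" and the messages below are just examples:

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Two separate loggers: log4j2.xml maps the "AUDIT" logger to its own appender
// (e.g. a dedicated audit file), while the class logger keeps using the normal
// application appender.
public class ResponseHandler {

    private static final Logger LOG = LogManager.getLogger(ResponseHandler.class);
    private static final Logger AUDIT = LogManager.getLogger("AUDIT");

    public void handle(String requestId, int rowCount) {
        LOG.info("Returning {} rows for request {}", rowCount, requestId); // normal log
        AUDIT.info("request={} rows={}", requestId, rowCount);             // audit trail
    }
}
```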
I have a very simple scenario involving a database and a JMS in an application server (Glassfish). The scenario is dead simple:
1. an EJB inserts a row in the database and sends a message.
2. when the message is delivered to an MDB, the row is read and updated.
The problem is that sometimes the message is delivered before the insert has been committed in the database. This is actually understandable if we consider the 2 phase commit protocol:
1. prepare JMS
2. prepare database
3. commit JMS
4. (tiny gap where the message can be delivered before the insert has been committed)
5. commit database
I've discussed this problem with others, but the answer was always: "Strange, it should work out of the box".
My questions are then:
How could it work out-of-the box?
My scenario sounds fairly simple, so why aren't there more people with similar troubles?
Am I doing something wrong? Is there a way to solve this issue correctly?
Here are a bit more details about my understanding of the problem:
This timing issue exists only if the participants are treated in this order. If the 2PC handled the participants in the reverse order (database first, then message broker), it would be fine. The problem happens randomly but is completely reproducible.
I found no way to control the order of the participants in a distributed transaction in the JTA, JCA and JPA specifications, nor in the Glassfish documentation. We could assume they are enlisted in the distributed transaction in the order in which they are used, but with an ORM such as JPA it's difficult to know when the data is flushed and when the database connection is actually used. Any ideas?
You are experiencing the classic XA 2-PC race condition. It does happen in production environments.
There are 3 things coming to my mind.
Last agent optimization, where JDBC is the non-XA resource (loses recovery semantics).
Set a JMS time-to-deliver delay (deliberately loses real-time delivery).
Build retries into the JDBC code (least effect on functionality), as sketched below.
WebLogic has an LLR (Logging Last Resource) optimization that avoids this problem and gives you all the XA guarantees.
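A rough sketch of the retry option (table, column and timing values are made up): if the row is still not visible after the retries, throwing lets the MDB's transaction roll back so the broker redelivers the message.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

// Called from the MDB: if the row inserted by the EJB is not visible yet
// (its branch of the 2PC has not committed), wait briefly and look again
// instead of failing immediately.
public class RowLookupWithRetry {

    private static final int MAX_ATTEMPTS = 5;
    private static final long BACKOFF_MS = 200;

    public long findPendingRow(DataSource ds, String businessKey) throws Exception {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try (Connection con = ds.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT id FROM pending_work WHERE business_key = ?")) {
                ps.setString(1, businessKey);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        return rs.getLong("id");   // the insert is now visible
                    }
                }
            }
            Thread.sleep(BACKOFF_MS * attempt);    // small backoff before the next look
        }
        // Still not visible: roll back the MDB transaction so the message is redelivered.
        throw new IllegalStateException("Row for " + businessKey + " not visible yet");
    }
}
```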