I have a very simple scenario involving a database and JMS in an application server (Glassfish):
1. an EJB inserts a row in the database and sends a message.
2. when the message is delivered to an MDB, the row is read and updated.
The problem is that sometimes the message is delivered before the insert has been committed in the database. This is actually understandable if we consider the two-phase commit protocol:
1. prepare JMS
2. prepare database
3. commit JMS
4. (tiny little gap where the message can be delivered before the insert has been committed)
5. commit database
I've discussed this problem with others, but the answer was always: "Strange, it should work out of the box".
My questions are then:
How could it work out-of-the box?
My scenario sounds fairly simple; why aren't more people running into similar trouble?
Am I doing something wrong? Is there a way to solve this issue correctly?
Here are a bit more details about my understanding of the problem:
This timing issue exists only if the participants are treated in this order. If the 2PC treated the participants in the reverse order (database first, then message broker), everything would be fine. The problem occurs at random, but it is completely reproducible.
I found no way to control the order of the participants in a distributed transaction in the JTA, JCA, or JPA specifications, nor in the Glassfish documentation. We could assume they are enlisted in the distributed transaction in the order in which they are used, but with an ORM such as JPA it's difficult to know when the data is flushed and when the database connection is really used. Any idea?
You are experiencing the classic XA 2PC race condition. It does happen in production environments.
Three things come to mind:
1. Last agent optimization, where JDBC is the non-XA resource (you lose recovery semantics).
2. A JMS time-to-deliver delay on the message (you deliberately give up real-time delivery).
3. Retries built into the JDBC/MDB code (least effect on functionality); see the sketch below.
WebLogic's Logging Last Resource (LLR) optimization avoids this problem and still gives you all the XA guarantees.
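As a minimal sketch of the third option, an MDB can retry the lookup a few times before forcing redelivery. The entity, property name, retry count, and backoff below are invented for illustration and are not specific to Glassfish:

import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.PersistenceContext;

// Hypothetical entity standing in for the row from the question.
@Entity
class ProcessedOrder {
    @Id Long id;
    String status;
}

@MessageDriven
public class OrderProcessorMDB implements MessageListener {

    @PersistenceContext
    private EntityManager em;

    @Override
    public void onMessage(Message message) {
        try {
            long orderId = message.getLongProperty("orderId");
            ProcessedOrder row = findWithRetry(orderId, 5, 100);
            if (row == null) {
                // Still not visible: throw so the container rolls back and redelivers.
                throw new IllegalStateException("Row " + orderId + " not yet committed");
            }
            row.status = "PROCESSED"; // the update from the question
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }

    private ProcessedOrder findWithRetry(long id, int attempts, long backoffMillis) {
        for (int i = 0; i < attempts; i++) {
            ProcessedOrder row = em.find(ProcessedOrder.class, id);
            if (row != null) {
                return row;
            }
            try {
                Thread.sleep(backoffMillis); // crude backoff, acceptable for a sketch
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }
}

Sleeping on the MDB's thread is not elegant, but the point is only that the consumer must tolerate the tiny window between the JMS commit and the database commit.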
Related
I understand, in a fuzzy sort of way, how regular ACID transactions work. You perform some work on a database in such a way that the work is not confirmed until some kind of commit flag is set. The commit part is based on some underlying assumption (like a single disk block write is atomic). In the event of a catastrophic error, you can just clear out the uncommitted data in the recovery phase.
How do distributed transactions work? In some of the MS documentation I have read that you can somehow perform a transaction across databases and filesystems (among other things).
This technology could be (and probably is) used for installers, where you want the program to be fully installed or fully absent. You simply begin a transaction at the start of the installer. Next you could connect to the registry and filesystem, making the changes that define the installation. When the job is done, simply commit, or rollback if the installation fails for some reason. The registry and filesystem are automatically cleaned for you by this magical distributed transaction coordinator.
How is it possible that two disparate systems can be transacted upon in this fashion? It seems to me that it is always possible to leave the system in an inconsistent state, where the filesystem has committed its changes and the registry has not. I think in MSDTC it is even possible to perform a transaction across the network.
I have read http://blogs.msdn.com/florinlazar/archive/2004/03/04/84199.aspx, but it feels like only the beginning of the explanation, and that step 4 should be expanded considerably.
Edit: From what I gather on http://en.wikipedia.org/wiki/Distributed_transaction, it can be accomplished by a two-phase commit (http://en.wikipedia.org/wiki/Two-phase_commit). After reading this, I still don't understand the method 100%; it seems like there is a lot of room for error between the steps.
About "step 4":
The transaction manager coordinates with the resource managers to ensure that either all of them succeed in doing the requested work or none of the work is done, thus maintaining the ACID properties.
This of course requires all participants to provide the proper interfaces and (error-free) implementations. The interface looks vaguely like this:
public interface ITransactionParticipant {
    // Phase 1 (prepare): vote yes only if Commit() is guaranteed to succeed.
    bool WouldCommitWork();
    // Phase 2: make the prepared work permanent.
    void Commit();
    // Phase 2 (abort path): undo the prepared work.
    void Rollback();
}
At commit time, the transaction manager queries all participants as to whether they are willing to commit the transaction. A participant may only assert this if it is able to commit the transaction under all allowable error conditions (validation, system errors, etc.). After all participants have asserted their ability to commit, the manager sends the Commit() message to all of them. If any participant instead raises an error or times out, the whole transaction aborts and the individual members are rolled back.
This protocol requires participants to have recorded their whole transaction content before asserting their ability to commit. Of course, this has to be kept in a special local transaction log structure so that a participant can recover from various kinds of failures.
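For illustration, here is a minimal coordinator-side sketch of this protocol in Java; the interface mirrors the one above, the names are assumptions, and the durable decision log that a real coordinator writes between the two phases is deliberately omitted:

import java.util.List;

// Minimal 2PC coordinator sketch. Real implementations must write a
// durable decision record between the vote and commit phases so they
// can recover after a crash; that part is omitted here.
interface TransactionParticipant {
    boolean wouldCommitWork(); // phase 1: prepare/vote
    void commit();             // phase 2: commit
    void rollback();           // phase 2: abort
}

class TwoPhaseCommitCoordinator {
    boolean execute(List<TransactionParticipant> participants) {
        // Phase 1: ask every participant to prepare.
        for (TransactionParticipant p : participants) {
            boolean prepared;
            try {
                prepared = p.wouldCommitWork();
            } catch (RuntimeException e) {
                prepared = false; // an error counts as a "no" vote
            }
            if (!prepared) {
                // Any "no" vote aborts the whole transaction.
                participants.forEach(TransactionParticipant::rollback);
                return false;
            }
        }
        // Phase 2: every participant voted yes, so commit all of them.
        // (This is exactly where the durable decision log is needed.)
        participants.forEach(TransactionParticipant::commit);
        return true;
    }
}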
I have an application where I need to store some data in a database (MySQL, for instance) and then publish some data to a message queue. My problem is: if the application crashes after the write to the database, the data will never be written to the message queue and will be lost (so the eventual consistency of my system is not guaranteed).
How can I solve this problem?
In this particular case, the answer is to load the queue data from the database.
That is, you write the messages that need to be queued to the database, in the same transaction that you use to write the data. Then, asynchronously, you read that data from the database, and write it to the queue.
See Reliable Messaging without Distributed Transactions, by Udi Dahan.
If the application crashes, recovery is simple -- during restart, you query the database for all unacknowledged messages, and send them again.
Note that this design really expects the consumers of the messages to be designed for at least once delivery.
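A minimal sketch of this idea (often called the transactional outbox pattern) with plain JDBC follows; the table names, columns, and the queue client are assumptions for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Sketch of the outbox pattern: the message is stored in the same
// local transaction as the business data, then relayed asynchronously.
// Table/column names and the queue client are illustrative assumptions.
class OutboxExample {

    interface MessageQueueClient { void send(String payload); }

    void writeDataAndMessage(Connection conn, String payload) throws Exception {
        conn.setAutoCommit(false);
        try (PreparedStatement data = conn.prepareStatement(
                 "INSERT INTO business_data (payload) VALUES (?)");
             PreparedStatement outbox = conn.prepareStatement(
                 "INSERT INTO outbox (payload, sent) VALUES (?, FALSE)")) {
            data.setString(1, payload);
            data.executeUpdate();
            outbox.setString(1, payload);
            outbox.executeUpdate();
            conn.commit(); // one local transaction covers both inserts
        } catch (Exception e) {
            conn.rollback();
            throw e;
        }
    }

    // Run periodically (and on restart): publish unsent rows, then mark them.
    void relayOutbox(Connection conn, MessageQueueClient queue) throws Exception {
        try (PreparedStatement select = conn.prepareStatement(
                 "SELECT id, payload FROM outbox WHERE sent = FALSE");
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                queue.send(rs.getString("payload")); // may duplicate after a crash
                try (PreparedStatement mark = conn.prepareStatement(
                         "UPDATE outbox SET sent = TRUE WHERE id = ?")) {
                    mark.setLong(1, rs.getLong("id"));
                    mark.executeUpdate();
                }
            }
        }
    }
}

Note how a crash between send() and the UPDATE causes a re-send on restart, which is exactly why the consumers must accept at-least-once delivery.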
I am assuming that you have a lossless message queue, where once you get a confirmation of the write, the queue is guaranteed to have the record.
Basically, you need a loop with a transaction that can roll back, or a status column in the database. The pseudo-code for the transactional version is:
1. Begin a transaction.
2. Insert into the database.
3. Write to the message queue.
4. When the message queue confirms, commit the transaction.
Personally, I would probably do this with a status:
1. Insert into the database with a status of "pending" (or something like that).
2. Write to the message queue.
3. When the message confirms, change the status to "committed" (or something like that).
In the case of recovery from failure, you may need to check the message queue to see whether any "pending" records were actually written to the queue.
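A hedged sketch of that recovery step for the status-column variant; the table, columns, and queue client are invented for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Recovery sketch for the status-column approach: on restart, a row still
// marked 'pending' may or may not have reached the queue, so it is sent
// (possibly again), relying on idempotent consumers. Names are assumptions.
class PendingRecovery {

    interface MessageQueueClient { void send(String payload); }

    void recoverPending(Connection conn, MessageQueueClient queue) throws Exception {
        try (PreparedStatement select = conn.prepareStatement(
                 "SELECT id, payload FROM events WHERE status = 'pending'");
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                queue.send(rs.getString("payload")); // possible duplicate delivery
                try (PreparedStatement mark = conn.prepareStatement(
                         "UPDATE events SET status = 'committed' WHERE id = ?")) {
                    mark.setLong(1, rs.getLong("id"));
                    mark.executeUpdate();
                }
            }
        }
    }
}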
I'm afraid those answers (VoiceOfUnreason, Udi Dahan) just sweep the problem under the carpet. The problem under the carpet is: how should the movement of data from the database to the queue be designed so that each message is posted exactly once (without XA)? If you solve that, you can easily extend the concept with any additional business logic.
The CAP theorem tells you the limits clearly.
XA transactions are not a 100% bulletproof solution, but they seem to me the best of all the alternatives I have seen.
Adding to what @Gordon Linoff said, assuming durable messaging (something like MSMQ?), the method/handler is going to be transactional, so if it all succeeds, the message will be written to the queue and the data to your view model; if it fails, everything fails...
To mitigate the ID issue you will need to use GUIDs instead of DB-generated keys (if you are using messaging you will need to remove your referential integrity anyway and introduce GUIDs as keys).
One more suggestion: don't update the database, but insert only / upsert (the pending row and then the completed row) and have the reader do the projection of the data based on the latest row (for example).
Writing the message as part of the transaction is a good idea, but it has multiple drawbacks:
a. Your database or language may not support transactions.
b. Transactions are time-consuming operations.
c. You may not be able to afford to wait for the queue's response while responding to your service call.
d. If your database is already under stress, writing the messages will exacerbate the impact of the higher workload.
In those cases, the best practice is to use database streams. Most modern databases support streams (DynamoDB, MongoDB, Oracle, etc.). You run a consumer of the database stream that reads from the stream and writes to the queue, invalidates the cache, adds to a search indexer, and so on. Once all of these have succeeded, you mark the stream item as processed.
Pros of this approach:
It works in the case of a multi-region deployment with a regional failure (you read from the regional stream and hydrate all the regional data stores).
No overhead of writing extra records, and no performance bottlenecks from the queues.
You can use this pattern for other data sinks as well, such as caching, queuing, and searching.
Cons:
You may need to call multiple services to construct the appropriate message.
One database stream might not be sufficient to construct the appropriate message.
You must ensure the reliability of your streams; Redis Streams, for example, are not reliable.
NOTE: this approach also does not guarantee exactly-once semantics. The consumer logic should be idempotent and able to handle duplicate messages.
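A rough sketch of such an idempotent stream consumer follows. StreamRecord, DatabaseStream, and MessageQueueClient are placeholder interfaces invented for this illustration; real stream APIs (DynamoDB Streams, MongoDB change streams, etc.) differ in their details:

import java.util.HashSet;
import java.util.Set;

// Sketch of an idempotent stream consumer. All interfaces are
// placeholders invented for this illustration.
class StreamRelay {

    interface StreamRecord { String id(); String payload(); }
    interface DatabaseStream { StreamRecord poll(); void markProcessed(String id); }
    interface MessageQueueClient { void send(String payload); }

    private final Set<String> seen = new HashSet<>(); // or a durable dedup store

    void run(DatabaseStream stream, MessageQueueClient queue) {
        StreamRecord record;
        while ((record = stream.poll()) != null) {
            if (!seen.add(record.id())) {
                continue; // duplicate delivery from the stream: skip it
            }
            queue.send(record.payload());       // may still duplicate on a crash,
            stream.markProcessed(record.id());  // so downstream must be idempotent
        }
    }
}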
All transaction managers (Atomikos, Bitronix, IBM WebSphere TM, etc.) save "transaction logs" into a 'tranlogs' folder on the file system.
When something terrible happens and the server goes down, the tranlogs sometimes end up broken.
They then require a manual recovery procedure.
I've been told that by simply clearing a broken tranlogs folder I risk leaving the resources that participated in the transactions in an inconsistent state.
As a "dumb" developer I feel more comfortable with simple concepts. I want to think that distributed transaction management should be alike the regular transaction management:
If something went wrong at any party (network, app error, timeout) - I expect the whole multi-resource transaction not to be committed in any part of it. All leftovers should be cleaned up sooner or later automatically.
If transaction managers fails (file system fault, power supply fault) - I expect all the transactions under this TM to be rollbacked (apparently, at DB timeout level).
File storage for tranlogs is optional if I don't want to have any automatic TX recovery (whatever it would mean).
Questions
Why can't I think like this? What's so complicated about 2PC?
What are the exact risks when I clear broken tranlogs?
If I am wrong and I really do need all this mess with 2PC file-system state: don't you feel sick about the fact that a TX manager can actually break storage state in such an easy and ugly manner?
When I was first confronted with two-phase commit in real life in 1994 (initially on a larger Oracle7 environment), I had a similar initial reaction. What a bloody shame that it is not generally possible to make it simple. But looking back at my university algorithm books, it became clear that there is no general solution for 2PC.
See for instance how to come to consensus in a distributed environment
Of course, there are many specific cases where a 2PC transaction can be resolved more easily, either completing or rolling back completely, and with less impact. But the general problem remains and cannot be solved.
In this case, a transaction manager has to decide at some point what to do; a transaction cannot remain open forever. Therefore, as an ultimate fallback, it will always need to go back to its own transaction logs, since one or more of the other parties may not be able to reliably communicate their status now or in the near future. Some transaction managers might be more advanced and know how to resolve some cases more easily, but the need for an ultimate fallback remains.
I am sorry for you. Fixing this in general seems to be akin to "falsity implies anything" in binary logic.
Summarizing
On Why can't I think like this? and What's so complicated about 2PC?: see above. This algorithmic problem can't be solved universally.
On What are the exact risks when I clear broken tranlogs?: the transaction manager has some database backing it. Deleting the tranlogs poses the same problem as in general relational database software: you lose information about the transactions in progress. Some database platforms may still leave the files somewhat or largely intact. For background and some database theory, see Wikipedia.
On Don't you feel sick about the fact that a TX manager can actually break storage state in an easy and ugly manner?: yes, sometimes, when I have to get a lot of work done by the team, I really hate it. But well, it keeps me in a job :-)
Addition: to 2PC or not
From your addition I understand that you are thinking whether or not to include 2PC in your projects.
In my opinion, your mileage may vary. Our company's policy on 2PC is: avoid it whenever possible. However, in some environments, and especially with legacy systems and complex environments such as those found in banking, you cannot work around it. The customer requires it, and they may not be willing to allow you to make a major change to other infrastructural components.
When you must do 2PC: do it well. I like a clean architecture of the software and infrastructure, and something that is so simple that even 5 years from now it is clear how it works.
For all other cases, we stay away from two-phase commit. We have our own framework (Invantive Producer) spanning client, application server, and database backend. In this framework we have chosen to sacrifice elements of ACID when normally working in a distributed environment. The application developer must, for instance, take care of atomicity himself. Often that is possible with little effort, or doesn't even require thinking about. For instance, all software must be safe to restart. Even with atomic transactions, this requires some thought to get right in a massive multi-user environment (locking issues, for instance).
In general this "stupid" approach is very easy to understand and maintain. In the cases where we have been required to do two-phase commit, we have been able to just replace some plug-ins in the framework and make some changes to client-side code.
So my advice would be:
Try to avoid 2PC.
But encapsulate your transaction logic nicely,
allowing you to introduce 2PC later without a complete rebuild, changing things only where needed.
I hope this helps you. If you can tell me more about your typical environments (size in #tables, size in GB of persistent data, number of concurrent users, typical transaction management software and platform), maybe I can make some additions or improvements.
Addition: Email and avoiding message loss in 2PC
Regarding the suggestion of combining the DB with JMS: no, combining the DB with JMS is normally of little use; JMS will itself already have some database behind it, hence the original question about transaction logs.
Regarding your business case: I understand that per event an email is sent from a template and that the outgoing mail is registered as an event in the database.
This is a hard nut to crack; I've been enjoying doing security audits and one of the easiest security issues to score was checking use of email.
Email, besides not being confidential or tamper-proof in most situations (it is much like a postcard), has no guarantees of delivery and/or reading without additional measures. For instance, even when email is delivered directly between your mail transfer agent and the recipient, data loss can occur without the transaction monitor being informed. That gets even worse when multiple hops are involved. For instance, each MTA has its own queueing mechanism on which a "bomb can be dropped", leading to data loss. But you can also think of spam measures, bad configuration, mail loops, deleting a file by accident, etc. Even when you can register the sending of the email without any loss of transaction information using 2PC, this gives absolutely no clue as to whether the email will arrive at all, or even make it across the first hop.
The company I work for sells a large software package for project-driven businesses. This package has an integrated queueing mechanism, which also handles email events, typically combined with Exchange in most implementations nowadays. A few months ago we had a nice problem: a transaction started, opened the mail channel, delivered the mail to Exchange as the MTA, and registered that the mail was handled... then the transaction aborted because an Oracle tablespace was full. On the next run, the mail was delivered to Exchange again, again an abort, and so on. The algorithm has since been enhanced, but this simple example shows that you need all endpoints to cooperate in your 2PC, even when some of the endpoints are far away in an organisation receiving and displaying your email.
If you need measures to ensure that an email is delivered or read, you will need to supplement it with additional measures. Pick from the application controls, user controls, and process controls in the literature.
In the App Engine transactions documentation I found the following note:
Note: If your app receives an exception when submitting a transaction, it does not always mean that the transaction failed. You can receive Timeout, TransactionFailedError, or InternalError exceptions in cases where transactions have been committed and eventually will be applied successfully. Whenever possible, make your Datastore transactions idempotent so that if you repeat a transaction, the end result will be the same.
This is quite general information and I wasn't able to find more details. I have the following questions regarding this issue:
Does it affect NDB transactions? The NDB documentation doesn't mention it, but I suppose this behavior is inherited.
What can cause this type of situation?
How often can it happen?
Can I prevent it, or decrease probability?
Are transactional tasks enqueued in this situation?
Is this situation a bug, which will be fixed in the future, or a feature, which I should just get used to?
Yes, it affects NDB too.
Potential causes include network partitions where the datastore server commits successfully but cannot communicate the result to the app.
It is rare, but cannot be prevented, and will never be fixed. It is inherent to all distributed systems.
Task queue adds are committed with the transaction by the datastore server.
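As a generic illustration of the idempotency advice (in Java rather than Python, with invented names; KeyedStore and AmbiguousCommitException are placeholders, not Datastore/NDB APIs), the trick is to key each logical operation on a client-generated ID so that retrying after an ambiguous error has no additional effect:

// Generic sketch: the caller retries on ambiguous errors, and a
// client-generated operation ID makes the retry a no-op if the first
// attempt actually committed. All names are illustrative assumptions.
class IdempotentWriter {

    static class AmbiguousCommitException extends Exception {}

    interface KeyedStore {
        // Writes value under id only if id is not already present.
        void putIfAbsent(String id, String value) throws AmbiguousCommitException;
    }

    void writeWithRetry(KeyedStore store, String payload, int maxAttempts)
            throws AmbiguousCommitException {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        // One operation ID for all attempts: a retried write becomes a
        // no-op if the first attempt actually committed.
        String opId = java.util.UUID.randomUUID().toString();
        AmbiguousCommitException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                store.putIfAbsent(opId, payload);
                return; // applied exactly once
            } catch (AmbiguousCommitException e) {
                last = e; // commit state unknown; safe to retry with the same opId
            }
        }
        throw last;
    }
}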
After performing an insert/update/delete, is it necessary to query the database to check if the action was performed correctly?
Edit:
I accepted an answer and would like to use it to convince management.
However, management insists that there is a possibility that an insert/update/delete request could be corrupted in transmission (but wouldn't the network checksum fail?), and that I'm supposed to check whether each transaction was performed correctly. Perhaps they're hinging on the fact that the checksum of a damaged packet can collide with the original packet's checksum. I think they're stretching it too far, and in all likelihood I wouldn't do this in my own projects. Nonetheless, I am just a junior programmer and have no say.
Shouldn't be. Commercial database inserts/updates/deletes (and all DB transactions) follow the ACID principles.
Wiki Quote:
In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee database transactions are processed reliably.
If you have the feeling that you need to double check the success of your transactions then the problem most likely lies elsewhere in your architecture.
This isn't necessary: if the query completes successfully, the modification has been performed; if the query fails for whatever reason, the action is rolled back, and when multiple queries are executed in a batch, the entire batch is rolled back on the query that failed.
Depending on the isolation level in use, it's wholly possible that your modification is superseded by modifications made by another query running 'at the same time'; whether this matters depends on what you expect to happen in that circumstance.
You shouldn't.
You can use SQL's (or your programming platform's) built-in error-handling mechanism to see whether there were errors, so you can notify the user that something bad happened; otherwise, all DB transactions follow ACID (as mentioned by Paul), which means that if something in a batch fails, the whole batch is rolled back.
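For instance, here is a minimal JDBC sketch that relies on the driver's error reporting instead of a verification query; the connection URL, table, and column names are made-up examples:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch: trust the driver's success/failure signal rather than
// re-querying after the write. URL and names are made-up examples.
class ErrorHandlingExample {
    public static void main(String[] args) {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost/test", "user", "password");
             PreparedStatement stmt = conn.prepareStatement(
                 "UPDATE accounts SET balance = balance - ? WHERE id = ?")) {
            stmt.setInt(1, 100);
            stmt.setLong(2, 42L);
            int rows = stmt.executeUpdate();
            // No exception: the database applied the statement. The row
            // count tells you whether anything matched; no SELECT needed.
            System.out.println("Updated " + rows + " row(s)");
        } catch (SQLException e) {
            // An exception means the statement did NOT take effect
            // (and in a transaction, the whole unit can be rolled back).
            System.err.println("Update failed: " + e.getMessage());
        }
    }
}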