Eventual consistency with both database and message queue records - database

I have an application where I need to store some data in a database (mysql for instance) and then publish some data in a message queue. My problem is: If the application crashes after the storage in the database, my data will never be written in the message queue and then be lost (thus eventual consistency of my system will not be guaranted).
How can I solve this problem ?

I have an application where I need to store some data in a database (mysql for instance) and then publish some data in a message queue. My problem is: If the application crashes after the storage in the database, my data will never be written in the message queue and then be lost (thus eventual consistency of my system will not be guaranted). How can I solve this problem ?
In this particular case, the answer is to load the queue data from the database.
That is, you write the messages that need to be queued to the database, in the same transaction that you use to write the data. Then, asynchronously, you read that data from the database, and write it to the queue.
See Reliable Messaging without Distributed Transactions, by Udi Dahan.
If the application crashes, recovery is simple -- during restart, you query the database for all unacknowledged messages, and send them again.
Note that this design really expects the consumers of the messages to be designed for at least once delivery.

I am assuming that you have a loss-less message queue, where once you get a confirmation for writing data, the queue is guaranteed to have the record.
Basically, you need a loop with a transaction that can roll back or a status in the database. The pseudo code for a transaction is:
Begin transaction
Insert into database
Write to message queue
When message queue confirms, commit transaction
Personally, I would probably do this with a status:
Insert into database with a status of "pending" (or something like that)
Write to message queue
When message confirms, change status to "committed" (or something like that)
In the case of recovery from failure, you may need to check the message queue to see if any "pending" records were actually written to the queue.

I'm afraid that answers (VoiceOfUnreason, Udi Dahan) just sweep the problem under the carpet. The problem under carpet is: How the movement of data from database to queue should be designed so that the message will be posted just once (without XA). If you solve this, then you can easily extend that concept by any additional business logic.
CAP theorem tells you the limits clearly.
XA transactions is not 100% bullet proof solution, but seems to me best of all others that I have seen.

Adding to what #Gordon Linoff said, assuming durable messaging (something like MSMQ?) the method/handler is going to be transactional, so if it's all successful, the message will be written to the queue and the data to your view model, if it fails, all will fail...
To mitigate the ID issue you will need to use GUIDs instead of DB generated keys (if you are using messaging you will need to remove your referential integrity anyway and introduce GUIDS as keys).
One more suggestion, don't update the database, but inset only/upsert (the pending row and then the completed row) and have the reader do the projection of the data based on the latest row (for example)

Writing message as part of transaction is a good idea but it has multiple drawbacks like
If your
a. database/language does not support transaction
b. transaction are time taking operation
c. you can not afford to wait for queue response while responding to your service call.
d. If your database is already under stress, writing message will exacerbate the impact of higher workload.
the best practice is to use Database Streams. Most of the modern databases support streams(Dynamodb, mongodb, orcale etc.). You have consumer of database stream running which reads from database stream and write to queue or invalidate cache, add to search indexer etc. Once all of them are successful you mark the stream item as processed.
Pros of this approach
it will work in the case of multi-region deployment where there is a regional failure. (you should read from regional stream and hydrate all the regional data stores.)
No Overhead of writing more records or performance bottle necks of queues.
You can use this pattern for other data sources as well like caching, queuing, searching.
Cons
You may need to call multiple services to construct appropriate message.
One database stream might not be sufficient to construct appropriate message.
ensure the reliability of your streams, like redis stream is not reliable
NOTE this approach also does not guarantee exactly once semantics. The consumer logic should be idempotent and should be able to handle duplicate message

Related

Data consistency across multiple microservices, which duplicate data

I am currently trying to get into microservices architecture, and I came across Data consistency issue. I've read, that duplicating data between several microservices considered a good idea, because it makes each service more independent.
However, I can't figure out what to do in the following case to provide consistency:
I have a Customer service which has a RegisterCustomer method.
When I register a customer, I want to send a message via RabbitMQ, so other services can pick up this information and store in its DB.
My code looks something like this:
...
_dbContext.Add(customer);
CustomerRegistered e = Mapper.Map<CustomerRegistered>(customer);
await _messagePublisher.PublishMessageAsync(e.MessageType, e, "");
//!!app crashes
_dbContext.SaveChanges();
...
So I would like to know, how can I handle such case, when application sends the message, but is unable to save data itself? Of course, I could swap DbContextSave and PublishMessage methods, but trouble is still there. Is there something wrong with my data storing approach?
Yes. You are doing dual persistence - persistence in DB and durable queue. If one succeeds and other fails, you'd always be in trouble. There are a few ways to handle this:
Persist in DB and then do Change Data Capture (CDC) such that the data from the DB Write Ahead Log (WAL) is used to create a materialized view in the second service DB using real time streaming
Persist in a durable queue and a cache. Using real time streaming persist the data in both the services. Read data from cache if the data is available in cache, otherwise read from DB. This will allow read after write. Even if write to cache fails in worst case, within seconds the data will be in DB through streaming
NServiceBus does support durable distributed transaction in many scenarios vs. RMQ.Maybe you can look into using that feature to ensure that both the contexts are saved or rolled back together in case of failures if you can use NServiceBus instead of RMQ.
I think the solution you're looking for is outbox pattern,
there is an event related database table in the same database as your business data,
this allows them to be committed in the same database transaction,
and then a background worker loop push the event to mq

Akka Actors: Handling DB Failures Without Losing Data

Scenario
The DB for an application has gone down. This results in any actor responsible for committing important data to the DB failing to get a connection
Preferred Behaviour
The important data is written to the db when it comes back up sometime in the future.
Current Implementation
The actor catches the DBException, wraps the data in a DBWriteFailed case class, and sends the message to its supervisor. The supervisor then schedules another write for sometime in the future (e.g. 1 minute) using system.scheduler.scheduleOnce(...) so that we don't spin in circles too much while waiting for the DB to come back up.
This implementation certainly works but I feel there might be a better way.
The protocol gets a bit messier when the committing actor has to respond to the original sender after a successful commit.
The regular flow of messages to the committing actor is not throttled in any way and the actor will happily process the new messages, likely failing to connect to the DB for each and every one of them.
If messages get caught in this retry loop for too long, the mailboxes of the committing actors will start to balloon. It is important that this data be committed, but none of it matters if the application crawls to a halt or crashes due to excessive memory usage.
I am an akka novice and I am largely inexperienced when it comes to supervisor strategies, but I feel as though I may be able to leverage one of those to handle some of this retry logic.
Is there a common approach in akka for solving a problem like this? Am I on the right track or should I be heading in a different direction?
Any help is appreciated.
You can use Akka Circuit Breaker to reduce connection attempts. Instead of using the scheduler as retry queue I would use a buffer (with max size limit) inside the actor and retry those when circuit breaker becomes closed again (onClose callback should send message to self actor). An alternative could be to combine the circuit breaker with a stashing mailbox.
If you're planning to implement full failover in your app
Don't.
Do not bubble database failover responsibility up into the app layer. As far as your app is concerned, the database should just be up and ready to accept reads and writes.
If your database goes down often, spend time making your database more robust (there's a multitude of resources on the web already for this: search the web for terms like 'replication', 'high availability', 'load-balancing' and 'clustering', and learn from the war stories of others at highscalability.com). It all really depends on what the cause of your DB outages are (e.g. I once maxed out the NIC on the DB master, and "fixed" the problem intermittently by enabling GZIP on the wire).
You'll be glad you adhered to a separation of concerns if you go down this route.
If you're planning to implement the odd sprinkling of retry logic and handling DB brown-outs
If you're not expecting your app to become a replacement database, then Patrik's answer is the best way to go.

In AWS or Azure cloud architecture, why would I choose message queues over polling a database?

In the past I have used message queues to handle spikes in demand. This system works fine, except for logging purposes. I write successfully processed messages to a database for reporting and logging. This makes me wonder why I don't just write the message into a database from the beginning, and have my "worker roles" poll the database, rather than the message queue.
I'm guessing this is not the best design because as the database grows, polling a huge database just to look for one "unchecked" record to process will become very slow, whereas a message queue just gives me one if I ask for it instantaneously.
Am I missing something? Are there other reasons to choose a message queue over polling a database? I would love to offer users the ability to see what has yet to be processed (floating in the queue) but that operation takes much longer than running a query on the database, so it seems to be a tradeoff.
Thanks for any input.
One other reason that springs to mind is blocking/locking. Typically, if you just poll a database looking for work, it'll work reasonably well as long as you have only one worker digesting the messages. However, if you want to horizontally scale out, and throw more workers at the problem, you'll typically end up causing lock escalations as you change the work messages in your database based "queue" from "needs to get run" to "ran successfully" or whatever.
Using the message queue takes care of this trickiness for you, as all the thread safety and locking/blocking is out of the way.

Message Queue or DataBase insert and select

I am designing an application and I have two ideas in mind (below). I have a process that collects data appx. 30 KB and this data will be collected every 5 minutes and needs to be updated on client (web side-- 100 users at any given time). Information collected does not need to be stored for future usage.
Options:
I can get data and insert into database every 5 minutes. And then client call will be made to DB and retrieve data and update UI.
Collect data and put it into Topic or Queue. Now multiple clients (consumers) can go to Queue and obtain data.
I am looking for option 2 as better solution because it is faster (no DB calls) and no redundancy of storage.
Can anyone suggest which would be ideal solution and why ?
I don't really understand the difference. The data has to be temporarily stored somewhere until the next update, right.
But all users can see it, not just the first person to get there, right? So a queue is not really an appropriate data structure from my interpretation of your system.
Whether the data is written to something persistent like a database or something less persistent like part of the web server or application server may be relevant here.
Also, you have tagged this as real-time, but I don't see how the web-clients are getting updates real-time without some kind of push/long-pull or whatever.
Seems to me that you need to use a queue and publisher/subscriber pattern.
This is an article about RabitMQ and Publish/Subscribe pattern.
I can get data and insert into database every 5 minutes. And then client call will be made to DB and retrieve data and update UI.
You can program your application to be event oriented. For ie, raise domain events and publish your message for your subscribers.
When you use a queue, the subscriber will dequeue the message addressed to him and, ofc, obeying the order (FIFO). In addition, there will be a guarantee of delivery, different from a database where the record can be delete, and yet not every 'subscriber' have gotten the message.
The pitfalls of using the database to accomplish this is:
Creation of indexes makes querying faster, but inserts slower;
Will have to control the delivery guarantee for every subscriber;
You'll need TTL (Time to Live) strategy for the records purge (considering delivery guarantee);

sql service broker functionality question

I'm a beginning web developer sitting on an ambitious web application project.
So after having done some research, I found out about SQL Service Broker. It seems like something I could use, but I'm not sure. Since learning it requires someone to put in lots of time, I wanted to be sure that it would fit my needs.
I need to implement a system where website users can submit text to the website. This stream of messages has to be redundant and dealt with in a FIFO way, with on the other end of the stream another group of users dealing with the messages.
Now, a message that is being read by one of this last group of users, should be locked so that no-one else can read it at the same time. The user can then decide to handle the message or not. Only if he decides to deal with the message can it be deleted from the queue. If he decides that he doesn't want to deal with the message, the message should be put back in the queue (at the end of the queue, or at least with the highest priority), so that another user can read it and decide.
Is this something I would be able to implement with SQL Service Broker? Am I on the wrong track?
Thank you!
IMO, the best use of Service Broker is for connecting to independent Application in a loosely coupled way. What I mean by that is that systems tied in this way can communicate through a set of mutually agreed message types. This in contrast to one application manipulating directly the other's database, for example.
From what you've said, I would implement it as a simple table, for example: Create a message table with an identity PK, an Allocation flag and your custom columns. Whenever an operator wants to fetch the last message, get the lowest PK value for which Allocation = 'N' and update Allocation to 'Y'. This in a single transaction.
When/if the operation decides to return the message to queue, just set its AllocationFlag to 'N' and its back.
This is just an example. The database in this case is providing you consistency, heavy load performance, etc.
Behind the screens all data you submit to SSB is stored and manipulated as tables, so there is no reason for it to be necessarily faster than a database solution .

Resources