What databases support event subscriptions? - database

Which data storage systems can I use to implement real-time notifications through subscriptions? These could be notifications about data updates, or custom messages with a payload.
Surprisingly, there is no answer to this question, at least not in one place.

Redis: an in-memory key/value store. It can do a blocking pop from a list, and also supports Pub/Sub. This enables you to build a queue of items, e.g. serialized strings.
Postgres: a relational database. NOTIFY supports sending named events with a payload. These can be sent from triggers that react to insert/update/delete actions.
MongoDB: a NoSQL document database. Supports change streams, which let you subscribe to a collection (≈ table), or to a whole database, and receive updates about changed documents.
EventStore: a store for event sourcing. Supports subscriptions to event streams.
RethinkDB supports realtime queries: i.e. a long-running query that sends results as they arrive. Unfortunately, the project seems dead as of 2021.
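As a rough illustration of the Postgres entry above, here is a minimal sketch of a trigger that calls pg_notify, plus a Python listener. It assumes the third-party psycopg2 driver and PostgreSQL 11 or later (for EXECUTE FUNCTION). The table name "items", the channel "item_changes", and the function names are hypothetical and chosen only for illustration.

```python
import json
import select

# Hypothetical schema: a table "items" with an "id" column,
# and a notification channel called "item_changes".
TRIGGER_SQL = """
CREATE OR REPLACE FUNCTION notify_item_change() RETURNS trigger AS $$
DECLARE
    rec RECORD;
BEGIN
    IF TG_OP = 'DELETE' THEN rec := OLD; ELSE rec := NEW; END IF;
    PERFORM pg_notify('item_changes',
                      json_build_object('op', TG_OP, 'id', rec.id)::text);
    RETURN rec;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_notify
AFTER INSERT OR UPDATE OR DELETE ON items
FOR EACH ROW EXECUTE FUNCTION notify_item_change();
"""


def parse_payload(payload: str) -> tuple:
    """Decode the JSON payload produced by the trigger above."""
    data = json.loads(payload)
    return data["op"], data["id"]


def listen(dsn: str, channel: str = "item_changes", timeout: float = 5.0) -> None:
    """Block forever and print notifications as they arrive."""
    # Third-party driver, imported lazily so the helper above works without it.
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect(dsn)
    # Autocommit, so that LISTEN takes effect right away.
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    with conn.cursor() as cur:
        # The channel is an identifier and cannot be passed as a query
        # parameter, so only use trusted channel names here.
        cur.execute(f"LISTEN {channel};")
    while True:
        # Wait until the connection socket is readable, or time out and retry.
        if select.select([conn], [], [], timeout) == ([], [], []):
            continue
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            print(note.channel, parse_payload(note.payload))
```

TRIGGER_SQL would be run once against the database; after that, calling listen() with a connection string prints one line per inserted, updated, or deleted row. Note that NOTIFY is not durable: a listener that is disconnected misses whatever was sent in the meantime.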

Related

Is there an open source component that will subscribe to various database activity feeds and invalidate out of process caches like redis?

We are looking to implement a redis based cache for read heavy data for fronting our database as a read through cache. I would like to implement a better invalidation mechanism than just TTL or LRU based eviction to prevent stale reads as much as possible.
Several databases provide a notification mechanism for database objects such as tables. For example, Oracle has Change Notifications and PostgreSQL has NOTIFY for this purpose. Is there any existing open source project/component that listens to these notifications and uses them to invalidate out-of-process caches like Redis or memcached? I have seen several projects that do this for in-process caches, but none so far for out-of-process (either clustered or unclustered) caches.
Redis Labs announced their new "RedisCDC" solution at RedisConf 2021, which seamlessly migrates data from heterogeneous data sources to Redis and Redis Modules. It's configurable and extendable, so you can easily create a custom stage that invalidates Redis keys when there is an update or delete on the source side.
Debezium is a component that implements the whole pipeline, from capturing changes in the database via CDC to publishing those changes in a format you prefer.
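To make the invalidation step concrete, here is a minimal sketch of a handler that turns a change event into a cache eviction. The event shape is loosely modeled on the Debezium change-event envelope ("op", "before", "after", "source"). The key scheme "<table>:<id>", the assumption that the primary key column is called "id", and the plain dict standing in for Redis are all assumptions made for illustration.

```python
from typing import Any, Optional


def cache_key(table: str, row_id: Any) -> str:
    # Hypothetical key scheme: "<table>:<primary key>".
    return f"{table}:{row_id}"


def key_to_invalidate(event: dict) -> Optional[str]:
    """Return the cache key made stale by a change event, or None."""
    # Only updates ("u") and deletes ("d") can make a cached row stale.
    if event.get("op") not in ("u", "d"):
        return None
    # Deletes carry no "after" image, so fall back to "before".
    row = event.get("after") or event.get("before")
    if not row or "id" not in row:
        return None
    return cache_key(event["source"]["table"], row["id"])


def apply_event(cache: dict, event: dict) -> bool:
    """Evict the stale entry; return True if something was removed."""
    key = key_to_invalidate(event)
    if key is None:
        return False
    # With a real Redis client this would be a DEL on the key.
    return cache.pop(key, None) is not None


cache = {"users:1": "alice", "users:2": "bob"}
update = {
    "op": "u",
    "source": {"table": "users"},
    "before": {"id": 1, "name": "alice"},
    "after": {"id": 1, "name": "alicia"},
}
evicted = apply_event(cache, update)
```

After the update event is applied, the entry for user 1 is gone and the next read falls through to the database.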

CQRS pattern with Event Sourcing having single database for read/write

I have a single SQL Server database for both read and write operations, and I am implementing the CQRS pattern for code segregation and maintainability, so that I can assign read operations to some people on my team and write operations to others. CQRS seems to be a clean approach for this.
Now, whenever inserts, updates, or deletes happen on tables in my database, I need to send messages to other systems that need to know about the changes. My database holds the master data, so any change here needs to be propagated to downstream systems so that they get the latest data and maintain it in their own systems. For this purpose I may use MQ or Kafka: whenever there is a change, I can generate a message and put it on MQ, or use Kafka for messaging.
Until now I haven't used Event Sourcing, because I assumed that without separate databases for read and write I don't need it. Is my assumption right that if we have a single database we don't need Event Sourcing? Or can Event Sourcing play a role in using MQ or Kafka? That is, if I use the Event Sourcing pattern, can I save the data in the master database first and then use Event Sourcing to write the changes to MQ or Kafka? I am unsure whether the Event Sourcing pattern applies to MQ or Kafka at all.
Do I need Event Sourcing to write messages to MQ or Kafka? Or is it not needed at all in my case, since I have only one database and don't need to know the series of updates that happened to the system of record? All I care about is the final state of the record in my master database; I would then use MQ or Kafka to tell downstream systems about changes whenever there are CRUD operations, so that they have the latest data.
is my assumption right that if we have single database we don't need Event Sourcing ?
No, that assumption is wrong.
In the CQRS "tradition", event sourcing refers to maintaining information in a persistent data structure; when new information arrives, we append that information to what we already know. In other words, it describes the pattern we use to remember information. See Fowler 2005.
When you start talking about messaging solutions like Kafka or Rabbit, you are in a different problem space: how do you share information between systems? Well, we put the information into a message, and deliver that message from the producer to the consumer(s). And for reasons of history, that message is called an Event (see Hohpe et al).
Two different ideas, easily confused. When CQRS people talk about event sourcing, they don't mean the same thing as Kafka people talking about event sourcing.
Do I need Event Sourcing for writing message to MQ or for using Kafka ?
No - messaging still works even when you choose other remembering strategies than event sourcing.
It's perfectly reasonable to modify your model "in place", as it were, and then to also publish a message announcing that things have changed.
why do I need Event Sourcing when I have only one database and I don't care about the series of changes to the records but care only about the final state of the data,
You don't.
The two most common motivations for event sourcing are (a) an event history that aligns naturally with the domain you are working in (for instance: accounting) or (b) a need to support temporal queries.
If you don't have those problems, then you don't need it.
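The difference between the two remembering strategies can be sketched as follows. This is an illustration only: the class names, the event shapes, and the "outbox" list standing in for messages handed to MQ or Kafka are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InPlaceAccount:
    """Keeps only the current state, and announces each change."""
    balance: int = 0
    outbox: list = field(default_factory=list)  # stand-in for MQ/Kafka

    def deposit(self, amount: int) -> None:
        self.balance += amount  # overwrite the state in place
        self.outbox.append({"type": "BalanceChanged", "balance": self.balance})


@dataclass
class EventSourcedAccount:
    """Keeps the history; the current state is derived from it."""
    events: list = field(default_factory=list)

    def deposit(self, amount: int) -> None:
        self.events.append({"type": "Deposited", "amount": amount})

    def balance(self, upto: Optional[int] = None) -> int:
        # Fold over the first `upto` events (all of them when None).
        # This is what makes temporal queries possible.
        return sum(e["amount"] for e in self.events[:upto])


in_place = InPlaceAccount()
sourced = EventSourcedAccount()
for amount in (100, 50):
    in_place.deposit(amount)
    sourced.deposit(amount)
```

Both objects end up with the same balance, and the in-place version still publishes messages. Only the event-sourced one can answer what the balance was after the first deposit, via sourced.balance(upto=1).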

SalesForce Notifications - Reliable Integration

I need to develop a system that listens for changes to SalesForce objects and transfers them to my end.
Initially I considered the SalesForce Streaming API, which allows exactly that: create a push topic that subscribes to object notifications, then have a set of clients read them using long polling.
However, this approach doesn't guarantee durability or reliable delivery of notifications, which I need.
What architecture would implement the same functionality in a reliable way?
One approach I have in mind is to create a Force.com application that uses SalesForce triggers to subscribe to notifications and then sends them over HTTPS to the cloud or my data server. Would this be a valid option, or are there better ones?
I found two very good questions on salesforce.stackexchange.com covering this very topic in detail:
https://salesforce.stackexchange.com/questions/16587/integrating-a-real-time-notification-application-with-salesforce
https://salesforce.stackexchange.com/questions/20600/best-approach-for-a-package-to-respond-to-dml-events-dynamically-without-object

How to keep Firebase in sync with another database

We need to keep our Firebase data in sync with other databases for full-text search (in ElasticSearch) and other kinds of queries that Firebase doesn't easily support.
This needs to be as close to real-time as possible; we can't just export a nightly dump of the Firebase JSON or anything like that, aside from the fact that this will get rather large.
My initial thought was to run a Node.js client which listens to the child_changed, child_added, child_removed, etc. events of all the main lists, but this could get a bit unwieldy, and would it be a reliable way of syncing if the client reconnects after a period of time?
My next thought was to maintain a list of "items changed" events and write to that every time an item is created/updated, similar to the Firebase work queue example. The queue could contain the full path to the data which has changed and the worker just consumes that and updates the local database accordingly.
The problem here is that every bit of code which makes updates has to remember to write to this queue, otherwise the two systems will get out of sync. Some proxy code shouldn't be too hard to write though.
Has anyone else done anything similar with any success?
For search queries, you can integrate directly with ElasticSearch; there is no need to sync with a secondary database. Firebase has a blog post about integrating and a lib, Flashlight, to make this quick and painless.
Another option is to use the logstash-input-firebase Logstash plugin in order to listen to changes in your Firebase real-time database(s) and forward the data in real-time to Elasticsearch using an elasticsearch output.
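The "items changed" queue with proxy code described in the question can be sketched roughly as below. This is an in-memory illustration that does not use the Firebase SDK: the class TrackedStore, its methods, and the dict used as the secondary index are hypothetical, and the sketch ignores the fact that in a real setup the data write and the queue write would need to succeed or fail together.

```python
from collections import deque
from typing import Any


class TrackedStore:
    """Hypothetical proxy: every write also records the changed path."""

    def __init__(self) -> None:
        self.data: dict = {}
        self.changes: deque = deque()

    def put(self, path: str, value: Any) -> None:
        self.data[path] = value
        self.changes.append(path)  # callers cannot forget the queue entry

    def remove(self, path: str) -> None:
        self.data.pop(path, None)
        self.changes.append(path)


def drain(store: TrackedStore, index: dict) -> int:
    """Worker: apply queued changes to the secondary index; return the count."""
    processed = 0
    while store.changes:
        path = store.changes.popleft()
        if path in store.data:
            index[path] = store.data[path]  # created or updated
        else:
            index.pop(path, None)  # removed
        processed += 1
    return processed


store = TrackedStore()
store.put("items/1", {"title": "a"})
store.put("items/2", {"title": "b"})
store.remove("items/1")

index: dict = {}
processed = drain(store, index)
```

Because the queue holds only paths and the worker reads the current value when it processes them, replaying an entry twice or processing entries late still converges on the latest state.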

Real-time synchronization of database data across all the clients

What's the best strategy to keep all the clients of a database server synchronized?
The scenario involves a database server and a dynamic number of clients that connect to it, viewing and modifying the data.
I need real-time synchronization of the data across all the clients - if data is added, deleted, or updated, I want all the clients to see the changes in real-time without putting too much strain on the database engine by continuous polling for changes in tables with a couple of million rows.
Now I am using a Firebird database server, but I'm willing to adopt the best technology for the job, so I want to know if there is any kind of already existing framework for this kind of scenario, what database engine does it use and what does it involve?
Firebird has a feature called EVENT that you may be able to use to notify clients of changes to the database. The idea is that when data in a table is changed, a trigger posts an event. Firebird takes care of notifying all clients who have registered an interest in the event by name. Once notified, each client is responsible for refreshing its own data by querying the database.
The client can't get info from the event about the new or old values. This is by design, because there's no way to resolve this with transaction isolation. Nor can your client register for events using wildcards. So you have to design your server-to-client notification pretty broadly, and let the client update to see what exactly changed.
See http://www.firebirdsql.org/doc/whitepapers/events_paper.pdf
You don't mention what client platform or language you're using, so I can't advise on the specific API you would use. I suggest you google, for instance, "firebird event java" or "firebird event php" or similar, based on the language you're using.
Since you say in a comment that you're using WPF, here's a link to a code sample of some .NET application code registering for notification of an event:
http://www.firebirdsql.org/index.php?op=devel&sub=netprovider&id=examples#3
Re your comment: Yes, the Firebird event mechanism is limited in its ability to carry information. This is necessary because any information it might carry could be canceled or rolled back. For instance, a trigger might post an event, and then the operation that spawned the trigger violates a constraint, which cancels the operation but not the event. So events can only be a kind of "hint" that something of interest may have happened. The other clients need to refresh their data at that time, but they aren't told what to look for. This is at least better than polling.
So you're basically describing a publish/subscribe mechanism -- a message queue. I'm not sure I'd use an RDBMS to implement a message queue. It can be done, but you're basically reinventing the wheel.
Here are a few message queue products that are well-regarded:
Microsoft MSMQ (seems to be part of Windows Professional and Server editions)
RabbitMQ (free open-source)
Apache ActiveMQ (free open-source)
IBM WebSphere MQ (probably overkill in your case)
This means that when one client modifies data in a way that others may need to know about, that client also has to post a message to the message queue. When consumer clients see the message they're interested in, they know to refresh their copy of some data.
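The notify-then-refresh flow described above can be sketched as follows. This is an in-process illustration using the standard library's queue module in place of a real message queue product; the Broker and Client classes and the dict standing in for the database are hypothetical.

```python
import queue


class Broker:
    """Stand-in for a message queue: fans each message out to every inbox."""

    def __init__(self) -> None:
        self._inboxes: list = []

    def register(self) -> queue.Queue:
        inbox: queue.Queue = queue.Queue()
        self._inboxes.append(inbox)
        return inbox

    def publish(self, event_name: str) -> None:
        for inbox in self._inboxes:
            inbox.put(event_name)


class Client:
    def __init__(self, db: dict, broker: Broker, interests: set) -> None:
        self.db = db
        self.inbox = broker.register()
        self.interests = interests
        self.view: dict = {}  # this client's local copy of the data

    def process_pending(self) -> set:
        """Refresh every table named in a pending message we care about."""
        refreshed: set = set()
        while True:
            try:
                name = self.inbox.get_nowait()
            except queue.Empty:
                return refreshed
            # The message is only a hint, so re-query to see what changed.
            if name in self.interests and name not in refreshed:
                self.view[name] = list(self.db[name])
                refreshed.add(name)


db = {"orders": [], "customers": []}
broker = Broker()
orders_ui = Client(db, broker, {"orders"})
customers_ui = Client(db, broker, {"customers"})

# A writer modifies the data and then posts a message about it.
db["orders"].append({"id": 1})
broker.publish("orders")

refreshed = orders_ui.process_pending()
customers_ui.process_pending()
```

Only the client that registered an interest in "orders" re-queries; the other one drains the message and does nothing.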
SQL Server 2005 and higher support notification-based cache expiry for data sources.
