Multiple Topics Versus Single Topic in Google Pub/Sub - google-cloud-pubsub

While designing an event delivery system with 10 different event types, I see two options for modelling topics on Google Pub/Sub: either each event type gets its own topic, or all 10 event types are queued in the same topic. Each of the events will have different subscribers, and it's possible to queue the events in the same topic but filter messages on the subscriber side. I am looking for very high throughput with minimal latency; should I go for a single topic or multiple topics?

Filtering on the subscriber side isn't a good idea: you will use more resources (and spend more money) and you will add latency (if a message is delivered to the wrong subscriber, it has to be rejected, ...).
So, there are two solutions:
Create as many topics as you have event types. But then it is up to the publisher to perform the filtering and to publish to the correct topic. In addition, if you add a new event type, you need to create a new topic and a condition to route to it. Not ideal.
Create only one topic and filter at the subscription level (not the subscriber level, as proposed in your question). For this, add the event type as a Pub/Sub message attribute; you can then create whatever subscriptions you want with the appropriate attribute filter. The publishers don't know how the messages are consumed, and when a new event type appears you simply create a new subscription with the correct filter.
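For illustration, a minimal sketch of that setup with the Python client library (google-cloud-pubsub); the project, topic, subscription, and attribute names are placeholders:

    from google.cloud import pubsub_v1

    project_id = "my-project"            # placeholder
    topic_id = "events"                  # the single shared topic
    subscription_id = "order-created"    # one subscription per event type

    publisher = pubsub_v1.PublisherClient()
    subscriber = pubsub_v1.SubscriberClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    subscription_path = subscriber.subscription_path(project_id, subscription_id)

    # Subscription that only receives messages tagged event_type="order_created".
    subscriber.create_subscription(
        request={
            "name": subscription_path,
            "topic": topic_path,
            "filter": 'attributes.event_type = "order_created"',
        }
    )

    # The publisher just tags every message with its event type as an attribute.
    future = publisher.publish(topic_path, b'{"order_id": 42}', event_type="order_created")
    print(future.result())  # message ID

Filters are set when the subscription is created and cannot be changed afterwards, so plan the attribute scheme up front.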

Related

CQRS pattern with Event Sourcing having single database for read/write

I have a single SQL Server database for both read and write operations, and I am implementing the CQRS pattern for code segregation and maintainability, so that I can assign read operations to a few resources in my team and write operations to others. Using CQRS seems to be a clean approach.
Now, whenever there are inserts/updates/deletes happening on tables in my database, I need to send messages to other systems that need to know about the changes. My database holds the master data, so any changes happening here need to be projected to downstream systems so that they get the latest data and maintain it in their systems. For this purpose I may use MQ or Kafka, so whenever there is a change I can generate a message describing it and put it on MQ, or use Kafka for the messaging.
Until now I haven't used Event Sourcing, because I thought that since I don't have multiple databases for read/write, I might not need it. Is my assumption right that if we have a single database we don't need Event Sourcing? Or can Event Sourcing play a role in using MQ or Kafka? I mean, if I use the Event Sourcing pattern, can I save data first in the master database and then use Event Sourcing to write the changes to MQ or Kafka? I am clueless here about whether the Event Sourcing pattern applies to MQ or Kafka at all.
Do I need Event Sourcing for writing messages to MQ or for using Kafka? Or is it not needed at all in my case, since I have only one database, I don't need to know the series of updates that happened to the system of record, and all I care about is the final state of the record in my master database, which I then propagate via MQ or Kafka to downstream systems whenever there are CRUD operations, so that they have the latest changes.
Is my assumption right that if we have a single database we don't need Event Sourcing?
No, that assumption is wrong.
In the CQRS "tradition", event sourcing refers to maintaining information in a persistent data structure; when new information arrives, we append that information to what we already know. In other words, it describes the pattern we use to remember information. See Fowler 2005.
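A toy illustration of that "remembering" pattern, just to make the distinction concrete (the event type and log are made up for the example):

    from dataclasses import dataclass

    @dataclass
    class PriceChanged:          # a domain event, not a message on a broker
        sku: str
        new_price: float

    event_log = []               # the persistent, append-only structure

    def record(event):
        event_log.append(event)  # never update in place, only append

    def current_price(sku):
        # current state is derived by folding over the whole history
        price = None
        for e in event_log:
            if isinstance(e, PriceChanged) and e.sku == sku:
                price = e.new_price
        return price

    record(PriceChanged("A-1", 39.99))
    record(PriceChanged("A-1", 59.99))
    print(current_price("A-1"))  # 59.99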
When you start talking about messaging solutions like Kafka or Rabbit, you are in a different problem space: how do you share information between systems? Well, we put the information into a message, and deliver that message from the producer to the consumer(s). And for reasons of history, that message is called an Event (see Hohpe et al.).
Two different ideas, easily confused. When CQRS people talk about event sourcing, they don't mean the same thing as Kafka people talking about event sourcing.
Do I need Event Sourcing for writing messages to MQ or for using Kafka?
No: messaging still works even when you choose remembering strategies other than event sourcing.
It's perfectly reasonable to modify your model "in place", as it were, and then to also publish a message announcing that things have changed.
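As a hedged sketch of that "update in place, then announce" shape, here using the kafka-python client (the topic name, table, and payload are assumptions, and db stands in for any DB-API connection):

    import json
    from kafka import KafkaProducer   # pip install kafka-python

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    def update_customer_email(db, customer_id, new_email):
        # 1. Modify the model "in place" in the master database.
        with db.cursor() as cur:
            cur.execute(
                "UPDATE customers SET email = %s WHERE id = %s",
                (new_email, customer_id),
            )
        db.commit()
        # 2. Announce that something changed; downstream systems react to this.
        producer.send("customer-changed", {"id": customer_id, "email": new_email})
        producer.flush()

Note that no event history is kept here; the message is just a notification, which is exactly the distinction the answer is drawing.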
Why do I need Event Sourcing when I have only one database and I don't care about the series of changes to the records but care only about the final state of the data?
You don't.
The two most common motivations for event sourcing are that either (a) an event history has a natural alignment with the domain you are working in (for instance, accounting) or (b) you need to support temporal queries.
If you don't have those problems, then you don't need it.

How do I get notified of platform events in SFCC?

What is the typical way of handling events such as "new customer registered", "cart updated", or "order posted" from a cartridge in SFCC B2C?
Here I can see that some resources "support server-side customization" by providing hooks, but the number of resources that support customization is small, and there are no hooks for "customer registered" or "cart updated" events there.
Vitaly, I assume you are speaking specifically about the OCAPI interface and not the traditional Demandware Script storefront interface. However, I will answer for both contexts. There is not a single interface that would grant you the ability to know when an event occurs. Furthermore, there are multiple interfaces that can trigger such events:
Open Commerce API (OCAPI)
If you wish to listen to and/or notify an external service of an event that is triggered using this interface, you must use the appropriate hook for the resource whose creation or modification you want to track. This hook is written in Demandware Script (ECMAScript 5 plus custom extensions).
Storefront Interface
Within the storefront interface lies an MVC architecture, which is the most prevalent use case for Commerce Cloud B2C. There are a few versions of this MVC architecture, but all of them sport several Controllers that handle various user interactions on the server side. To track all the various mutations and creations of data objects you would need to add code to each of those Controllers, or perhaps more appropriately to the Models that the Controllers use to create and mutate those data objects.
Imports
There are two ways to import data into the platform:
XML File Import
OCAPI Data API
Both of these import data with no way to trigger a custom behavior based on the result of their actions. You will be effectively blind to when the data was created or modified in many cases.
An approach to remediating this could be a job that looks for objects missing a custom attribute (one that this job, or customizations to the other two interfaces, would set) and adds the custom attribute and/or updates another attribute with a timestamp. In addition to that activity, this job may need to loop over all objects to determine whether an import activity changed anything since it last set the aforementioned custom attributes; this could be achieved with yet another custom attribute containing some sort of hash or checksum. This job would need to run constantly and would probably have to be split into two parts that run at different intervals. It is neither a performant nor a scalable solution.
Instead, and ideally, all systems sending data through these import mechanisms would pre-set the custom attributes so that those fields are updated upon import.
Getting Data Out
In Salesforce Commerce Cloud you can export data either via synchronous external API calls within the storefront request & response context or via asynchronous batch jobs that run in the background. These jobs can write files, transfer them via SFTP, HTTPS, or make external API calls. There is also the OCAPI Data API which could allow you to know when something is added/modified based on polling the API for new data.
In many cases, you are limited by quotas that are in place to help maintain the overall performance of the system.
Approaches
There are a couple of different approaches you can use to capture and transmit the data necessary to represent these sorts of events. They are summarized below.
An Export Queue
Probably the most performant option is an export queue. Rather than immediately notifying an external system when an event occurs, you queue up a list of events that have happened and then transmit them to the third-party system in a job that runs in the background. The queue is typically constructed using the system's Custom Object concept: as an event occurs, you create a new Custom Object containing all the necessary information about the event and how to handle it in the queue. You then craft a job component that is added to a job flow that runs periodically, every 15 minutes for example. This job component iterates over the queue and performs whatever actions are necessary to transmit each event to the third-party system. Once transmitted, the item is removed from the queue.
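The SFCC implementation would use Custom Objects and a job step, but the shape of the pattern is the same everywhere; here is a platform-agnostic sketch in Python (the queue storage and the third-party endpoint are placeholders):

    import json, sqlite3, urllib.request

    db = sqlite3.connect("event_queue.db")
    db.execute("CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, payload TEXT)")

    def enqueue(event):
        # Called wherever the event occurs; cheap and local, no external call.
        db.execute("INSERT INTO queue (payload) VALUES (?)", (json.dumps(event),))
        db.commit()

    def drain_queue():
        # Run periodically (every 15 minutes, say) by a scheduled job.
        for row_id, payload in db.execute("SELECT id, payload FROM queue").fetchall():
            req = urllib.request.Request(
                "https://third-party.example/events",    # placeholder endpoint
                data=payload.encode("utf-8"),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req, timeout=5)       # transmit the event
            db.execute("DELETE FROM queue WHERE id = ?", (row_id,))  # remove once sent
        db.commit()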
Just in Time Transmission
You must be careful with this approach, as it has the greatest potential to degrade the performance of a merchant's storefront and/or OCAPI interface. As the event occurs, you perform a web service call to the third-party system that collects the event notifications. You must set a pretty aggressive timeout on this request to avoid impacting storefront or API performance too much should the third-party system become unavailable. I would even recommend combining this approach with the queue approach described above, so that failed API calls can be added to the queue for resending later.
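Again platform-agnostic, a sketch of the "aggressive timeout, fall back to the queue" idea, reusing the enqueue() helper from the previous sketch (the endpoint and the one-second timeout are assumptions):

    import json, urllib.error, urllib.request

    def notify_now(event):
        req = urllib.request.Request(
            "https://third-party.example/events",        # placeholder endpoint
            data=json.dumps(event).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        try:
            # Keep the timeout short so a slow third party cannot drag
            # storefront or API response times down with it.
            urllib.request.urlopen(req, timeout=1)
        except (urllib.error.URLError, TimeoutError):
            # Could not deliver in time: park it in the export queue and let
            # the background job retry later.
            enqueue(event)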
OCAPI Polling
In order to know when something is actually modified or created, you need to implement a custom attribute to track such timestamps. Unfortunately, while there are creationDate and lastModified DateTime stamps on almost every object, they are often not accessible from either OCAPI or the DW Script APIs. Your custom attributes would require modifications to both the OCAPI hooks and the storefront Controllers/Models to set those attributes appropriately. Once set, you can query for objects based on those custom attributes using the OCAPI Data API: a third-party system would connect periodically and query for data objects that are new since it last checked. Note that not all data objects are accessible via the OCAPI Data API, and you may be limited in how you can query certain objects, so this is by no means a silver-bullet approach.
I wish you the best of luck, and should you need any support in making an appropriate solution, there are a number of System Integrator Partners available in the market. You can find them listed on AppExchange. Filter the Consultants by Salesforce B2C Commerce for a tiered list of partners.
Full disclosure: I work for one such partner: Astound Commerce

SalesForce Notifications - Reliable Integration

I need to develop a system that listens to the changes happening to Salesforce objects and transfers them to my end.
Initially I considered the Salesforce Streaming API, which allows exactly that: create a push topic that subscribes to object notifications, and later have a set of clients that read them using long polling.
However, such an approach doesn't guarantee durability and reliable delivery of notifications, which I need.
What will be the architecture allowing to implement the same functionality in reliable way?
One approach I have in mind is to create a Force.com application that uses Salesforce triggers to subscribe to notifications and then simply sends them over HTTPS to the cloud or my data server. Will this be a valid option, or are there better ones?
There are two very good questions on salesforce.stackexchange.com covering this very topic in detail:
https://salesforce.stackexchange.com/questions/16587/integrating-a-real-time-notification-application-with-salesforce
https://salesforce.stackexchange.com/questions/20600/best-approach-for-a-package-to-respond-to-dml-events-dynamically-without-object

How can I model both data and logic in a database?

I'm working on a web app for a magazine that will allow users to log in and renew their subscriptions online. These subscriptions are renewed based on a set of rules, and I'd like to get some ideas/recommendations on how to set up these rules.
This web app interfaces with an external (third-party) system that has data for the subscriber. When the subscriber logs in, the web app grabs a bunch of information from this third-party system, including a number called a "subscription definition ID" which (ostensibly) denotes the type of subscription that a subscriber has. This subscription type may be several years out of date, so the web app contains a set of "order specifications" (stored in a database) that consists of the current subscribe options, along with information like the current rate (so the price can be shown to the user on the order form).
My current idea is to create a table of subscription definition IDs that map to the order specification to which a given subscription definition ID renews. For example, a subscription definition ID might denote a 1-year subscription from ten years ago, which cost $39.99 back then; in the database, this would map to the current order specification, which would have the current price of $59.99.
This works pretty well in theory, but as usual, there's a catch. When the subscription definition IDs were set up back in the day, they weren't always unique. In particular, one subscription definition ID has wildly different behaviors, depending on context. This subscription definition ID is used for both 1-year subscriptions and 1-year discounted gift subscriptions. Therefore, given this subscription definition ID, a number of things can happen:
If it's a 1-year subscription, it'll renew as the (current) 1-year subscription.
If it's a 1-year discounted gift subscription and the subscriber is not renewing any other subscriptions, it'll renew as a (current) 1-year full-price gift subscription.
If it's a 1-year discounted gift subscription and the subscriber is renewing other subscriptions, it'll renew as a (current) 1-year discounted gift subscription.
I'm not sure how to generalize this in the database, especially since this complication only occurs with one record. I basically need a way to model that above logic which could also work with the records that aren't special cases. I could always do this in code, but I'm reluctant to put all this business-y logic in the code itself (especially in case the problem occurs in the future, with other subscription definition IDs).
What's the best way to model this combination of data and logical rules?
The trick here is to parameterize the business logic, and that means creating a parameters table. The general case is that any kind of subscription is eligible for some other kind of renewal, so you have a table that maps Original Subscription to Eligible Renewal(s). You then have general code that examines a user's subscriptions and shows the one option, or a list of options, for renewal.
For most of your cases, if I understand what you are saying, the original subscription just maps to itself. You just have this one case where some subscriptions map to special cases.
However, if you do it this way, you have a nice general-purpose renewal system that is now under the admin's control, as they can modify the mappings without waiting for you to provide new code.
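A minimal sketch of that parameterization (the IDs, context labels, and wildcard convention are made up for illustration; in practice this would be a database table the admin can edit):

    # Each row maps (subscription definition ID, context) to the order
    # specification the renewal should use.  Ordinary IDs only need the
    # wildcard row; the one troublesome ID gets context-specific rows.
    RENEWAL_MAP = {
        ("SUB-1YR-1999",  "*"):                "ORDER-SPEC-1YR",
        ("SUB-AMBIGUOUS", "plain"):            "ORDER-SPEC-1YR",
        ("SUB-AMBIGUOUS", "gift_alone"):       "ORDER-SPEC-1YR-GIFT-FULL",
        ("SUB-AMBIGUOUS", "gift_with_others"): "ORDER-SPEC-1YR-GIFT-DISCOUNT",
    }

    def renewal_spec(definition_id, context):
        # Try the specific context first, then fall back to the wildcard row.
        return (RENEWAL_MAP.get((definition_id, context))
                or RENEWAL_MAP.get((definition_id, "*")))

    print(renewal_spec("SUB-1YR-1999", "plain"))              # ORDER-SPEC-1YR
    print(renewal_spec("SUB-AMBIGUOUS", "gift_with_others"))  # ORDER-SPEC-1YR-GIFT-DISCOUNT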
It isn't something I would normally suggest, but because there is only one subscription definition ID and this has been the situation for a number of years (therefore this is a stable business rule), I suggest hard-coding the behaviour for this ID.

Real-time synchronization of database data across all the clients

What's the best strategy to keep all the clients of a database server synchronized?
The scenario involves a database server and a dynamic number of clients that connect to it, viewing and modifying the data.
I need real-time synchronization of the data across all the clients - if data is added, deleted, or updated, I want all the clients to see the changes in real-time without putting too much strain on the database engine by continuous polling for changes in tables with a couple of million rows.
Now I am using a Firebird database server, but I'm willing to adopt the best technology for the job, so I want to know if there is any kind of already existing framework for this kind of scenario, what database engine does it use and what does it involve?
Firebird has a feature called EVENT that you may be able to use to notify clients of changes to the database. The idea is that when data in a table is changed, a trigger posts an event. Firebird takes care of notifying all clients who have registered an interest in the event by name. Once notified, each client is responsible for refreshing its own data by querying the database.
The client can't get info from the event about the new or old values. This is by design, because there's no way to resolve this with transaction isolation. Nor can your client register for events using wildcards. So you have to design your server-to-client notification pretty broadly, and let the client update to see what exactly changed.
See http://www.firebirdsql.org/doc/whitepapers/events_paper.pdf
You don't mention what client platform or language you're using, so I can't advise on the specific API you would use. I suggest you google for instance "firebird event java" or "firebird event php" or similar, based on the language you're using.
Since you say in a comment that you're using WPF, here's a link to a code sample of some .NET application code registering for notification of an event:
http://www.firebirdsql.org/index.php?op=devel&sub=netprovider&id=examples#3
Re your comment: yes, the Firebird event mechanism is limited in its ability to carry information. This is necessary because any information it might carry could be canceled or rolled back; for instance, a trigger might post an event and then the operation that spawned the trigger violates a constraint, canceling the operation but not the event. So events can only be a kind of "hint" that something of interest may have happened. The other clients need to refresh their data at that time, but they aren't told what to look for. This is at least better than polling.
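For example (the point is language-independent), registering for an event with the Python fdb driver looks roughly like this; the event name, table, and connection details are assumptions:

    # Server side, a trigger posts the event, e.g. in PSQL:
    #   CREATE TRIGGER orders_changed FOR orders
    #   AFTER INSERT OR UPDATE OR DELETE
    #   AS BEGIN
    #     POST_EVENT 'orders_changed';
    #   END

    import fdb  # pip install fdb

    con = fdb.connect(dsn="localhost:/data/app.fdb", user="SYSDBA", password="masterkey")

    conduit = con.event_conduit(["orders_changed"])
    conduit.begin()
    while True:
        counts = conduit.wait()             # blocks until the event fires
        if counts.get("orders_changed"):
            # The event carries no payload; re-query to find out what changed.
            cur = con.cursor()
            cur.execute("SELECT id, status FROM orders")
            handle_refresh(cur.fetchall())  # placeholder for the client's refresh logic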
So you're basically describing a publish/subscribe mechanism -- a message queue. I'm not sure I'd use an RDBMS to implement a message queue. It can be done, but you're basically reinventing the wheel.
Here are a few message queue products that are well-regarded:
Microsoft MSMQ (seems to be part of Windows Professional and Server editions)
RabbitMQ (free open-source)
Apache ActiveMQ (free open-source)
IBM WebSphere MQ (probably overkill in your case)
This means that when one client modifies data in a way that others may need to know about, that client also has to post a message to the message queue. When consumer clients see the message they're interested in, they know to refresh their copy of some data.
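A hedged sketch of that flow with RabbitMQ and the pika client (the exchange name and payload shape are assumptions): each client publishes a notice after it writes, and every client also consumes so it knows when to refresh.

    import json
    import pika  # pip install pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    # Fanout exchange: every connected client gets a copy of each change notice.
    channel.exchange_declare(exchange="data-changes", exchange_type="fanout")

    def announce_change(table, row_id):
        # Called by the client that just modified the database.
        channel.basic_publish(
            exchange="data-changes",
            routing_key="",
            body=json.dumps({"table": table, "id": row_id}),
        )

    def listen_for_changes(on_change):
        # Each client binds its own temporary queue to the fanout exchange.
        queue = channel.queue_declare(queue="", exclusive=True).method.queue
        channel.queue_bind(exchange="data-changes", queue=queue)
        channel.basic_consume(
            queue=queue,
            on_message_callback=lambda ch, method, props, body: on_change(json.loads(body)),
            auto_ack=True,
        )
        channel.start_consuming()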
SQL Server 2005 and higher support notification-based data source cache expiry (query notifications).
