What is the typical way of handling events such as new customer registered, cart updated, order posted from a cartridge in SFCC B2C?
Here I can see that some resources 'support server-side customization' by providing hooks, but the number of resources that support customization is small, and there are no hooks for 'customer registered' or 'cart updated' events there.
Vitaly, I assume you are speaking specifically about the OCAPI interface and not the traditional Demandware Script storefront interface. However, I will answer for both contexts. There is not a single interface that would grant you the ability to know when an event occurs. Furthermore, there are multiple interfaces that can trigger such events:
Open Commerce API (OCAPI)
If you wish to listen to and/or notify an external service of an event that is triggered using this interface, you must use the appropriate hook for the resource whose creation or modification you want to track. This hook is written in Demandware Script (ECMAScript 5 + custom extensions).
Storefront Interface
Within the storefront interface lies an MVC architecture, which is the most prevalent use case for Commerce Cloud B2C. There are a few versions of this MVC architecture, but all of them sport several Controllers that handle various user interactions on the server side. To track all the various mutations and creations of data objects you would need to add code to each of those Controllers or, perhaps more appropriately, to the Models that the Controllers use to create and mutate those data objects.
Imports
There are two ways to import data into the platform:
XML File Import
OCAPI Data API
Both of these import data with no way to trigger a custom behavior based on the result of their actions. You will be effectively blind to when the data was created or modified in many cases.
An approach to remediating this could be a job that looks for objects missing a custom attribute (one that this job, or customizations to both of the other interfaces, would set) and adds the custom attribute and/or updates another attribute with a timestamp. In addition, this job may need to loop over all objects to determine whether an import changed anything since it last set those custom attributes. This could be achieved with yet another custom attribute containing some sort of hash or checksum. This job would need to run constantly and would probably be split into two parts that run at different intervals. It is neither a performant nor a scalable solution.
Instead, and ideally, all systems sending data through these import mechanisms would pre-set the custom attributes so that those fields are updated upon import.
Getting Data Out
In Salesforce Commerce Cloud you can export data either via synchronous external API calls within the storefront request and response context or via asynchronous batch jobs that run in the background. These jobs can write files, transfer them via SFTP or HTTPS, or make external API calls. There is also the OCAPI Data API, which lets you find out when something is added or modified by polling the API for new data.
In many cases, you are limited by quotas that are in place to help maintain the overall performance of the system.
Approaches
There are a few different approaches that you can use to capture and transmit the data necessary to represent these sorts of events. They are summarized below.
An Export Queue
Probably the most performant option is an export queue. Rather than immediately notifying an external system of an event occurring, you queue up a list of events that have happened and then transmit them to the third-party system in a job that runs in the background. The queue is typically constructed using the system's Custom Object concept. As an event occurs, you create a new Custom Object that contains all the necessary information about the event and how to handle that event in the queue. You craft a job component that is added to a job flow that runs periodically, every 15 minutes for example. This job component iterates over the queue and performs whatever actions are necessary to transmit each event to the third-party system. Once transmitted, the item is removed from the queue.
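To make the queue idea concrete, here is a minimal, platform-neutral sketch in TypeScript. It is not SFCC code: the `ExportQueue` class, the `QueuedEvent` shape, and the `transmit` callback are all hypothetical names, and an in-memory array stands in for the Custom Objects that would hold the queue on the platform.

```typescript
// Hypothetical event shape; on the platform this would live in a Custom Object.
interface QueuedEvent {
  id: number;
  type: string; // e.g. "customer.registered" or "cart.updated"
  payload: unknown;
  attempts: number;
}

class ExportQueue {
  private items: QueuedEvent[] = [];
  private nextId = 1;

  // Called where the event happens (hook, controller, model): cheap, no network I/O.
  enqueue(type: string, payload: unknown): QueuedEvent {
    const event: QueuedEvent = { id: this.nextId, type, payload, attempts: 0 };
    this.nextId += 1;
    this.items.push(event);
    return event;
  }

  // Called by the periodic job. `transmit` returns true when the
  // third-party system accepted the event.
  drain(transmit: (event: QueuedEvent) => boolean): number {
    const remaining: QueuedEvent[] = [];
    let sent = 0;
    for (const event of this.items) {
      event.attempts += 1;
      if (transmit(event)) {
        sent += 1; // transmitted: drop it from the queue
      } else {
        remaining.push(event); // keep it for the next job run
      }
    }
    this.items = remaining;
    return sent;
  }

  size(): number {
    return this.items.length;
  }
}
```

The `attempts` counter is only there so a real job could give up on, or escalate, events that keep failing; what to do with such events is a design decision the sketch leaves open.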
Just in Time Transmission
You must be careful with this approach as it has the greatest potential to degrade the performance of a merchant's storefront and/or OCAPI interface. As the event occurs, you perform a web service call to the third-party system that collects the event notifications. You must set a pretty aggressive timeout on this request to avoid impacting storefront or API performance too much if the third-party system should become unavailable. I would even recommend combining this approach with the Queue approach described above so that you can add failed API calls to the queue for resending later.
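The combination of an aggressive timeout with a queue fallback can be sketched as follows. This is a generic TypeScript illustration of the control flow only, not SFCC code (platform scripts call services differently); `withTimeout`, `notifyOrQueue`, and the injected `send` function are hypothetical names, and a plain array stands in for the export queue.

```typescript
type Send<E> = (event: E) => Promise<void>;

// Rejects if `work` does not settle within `ms` milliseconds.
// Note: this stops waiting, it does not cancel the underlying request.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout after " + ms + " ms")), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Tries to notify the third party immediately. On timeout or error the
// event is pushed onto `fallback` so a background job can resend it later.
async function notifyOrQueue<E>(
  event: E,
  send: Send<E>,
  fallback: E[],
  timeoutMs: number
): Promise<"sent" | "queued"> {
  try {
    await withTimeout(send(event), timeoutMs);
    return "sent";
  } catch {
    fallback.push(event);
    return "queued";
  }
}
```

The caller never sees an exception from the third-party call, so a slow or unavailable collector costs the storefront request at most `timeoutMs`.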
OCAPI Polling
In order to know when something is actually modified or created, you need to implement a custom attribute to track such timestamps. Unfortunately, while there are creationDate and lastModified DateTime stamps on almost every object, they're often not accessible from either OCAPI or the DW Script APIs. Your custom attributes would require modifications to both the OCAPI Hooks and the Storefront Controllers/Models to set those attributes appropriately. Once set, you can query for objects based on those custom attributes using the OCAPI Data API. A third-party system would connect periodically to query for data objects that are new since it last checked. Note that not all data objects are accessible via the OCAPI Data API, and you may be limited in how you can query certain objects, so this is by no means a silver bullet.
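On the polling side, the third-party system essentially keeps a watermark and asks for everything touched after it. A minimal sketch, assuming a hypothetical custom attribute named `c_lastTouched` that holds an ISO 8601 UTC timestamp; the in-memory `filter` stands in for whatever query the API actually allows for the object type in question:

```typescript
interface TrackedRecord {
  id: string;
  // Hypothetical custom attribute maintained by the hooks and controllers.
  c_lastTouched: string;
}

interface PollResult<T> {
  changed: T[];
  nextWatermark: string; // store this and pass it to the next poll
}

function selectChangedSince<T extends TrackedRecord>(records: T[], watermark: string): PollResult<T> {
  const since = Date.parse(watermark);
  const changed = records.filter((r) => Date.parse(r.c_lastTouched) > since);
  // Advance the watermark to the newest timestamp actually seen, so records
  // written between two polls are not skipped by clock differences.
  let next = since;
  for (const r of changed) {
    next = Math.max(next, Date.parse(r.c_lastTouched));
  }
  return { changed, nextWatermark: new Date(next).toISOString() };
}
```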
I wish you the best of luck, and should you need any support in making an appropriate solution, there are a number of System Integrator Partners available in the market. You can find them listed on AppExchange. Filter the Consultants by Salesforce B2C Commerce for a tiered list of partners.
Full disclosure: I work for one such partner: Astound Commerce
Suppose I need to make a component (using React) that displays a devices state table that displays various device attributes such as their IP address, name, etc. Polling each device takes a few seconds, so you need to display a loading indicator for each specific device in the table.
I have several ways to make a similar component:
On the server side, I can create an API endpoint (like GET /device_info) that returns data for only one requested device and make multiple parallel requests from the frontend to that endpoint, one for each device.
I can create an API endpoint (like GET /devices_info) that returns data for the entire list of devices at once and make a single request to it from the frontend for the whole list.
Each method has its pros and cons:
Way one:
Pros:
Easy to make. We make a "table row" component that requests data only for the device whose data it displays. The "device table" component will consist of several "table row" components that execute multiple queries in parallel each for their own device. This is especially true if you are using libraries such as React Query or RTK Query which support this behavior out of the box;
Cons:
Many requests to the endpoint, possibly hundreds (although the number of parallel requests can be limited);
If, for some reason, synchronization of access to some shared resource on the server side is required and the server supports several workers, then synchronization between workers can be very difficult to achieve, especially if each worker is a separate process;
Way two:
Pros:
One request to the endpoint;
There are no problems with access to a shared resource; everything is controlled within a single request on the server (because a single worker is guaranteed to handle that request);
Cons:
It's hard to make. Since polling different devices takes different amounts of time, a single request passes through several intermediate states. The UI has to make periodic requests to get the updated state, and the endpoint has to expose a status for each device, such as "not polled", "in progress", or "done";
With this in mind, my questions are:
What is the better way to make the described component?
Does the better way have a name? Maybe it's some kind of pattern?
Maybe you know a great book/article/post that describes a solution to a similar problem?
that displays a devices state
The component asking for a device state is so... 2010?
If your device knows its state, then have your device send its state to the component
SSE - Server Sent Events, and the EventSource API
https://developer.mozilla.org/.../API/Server-sent_events
https://developer.mozilla.org/.../API/EventSource
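For reference, the wire format behind SSE is plain text: `event:` and `data:` lines, with a blank line ending each event. In the browser, `EventSource` parses this for you; the small parser below is only a sketch to show what the server has to send. The `device` event name and the JSON payload in the usage note are made-up examples.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no `event:` line is sent
  data: string;  // multiple `data:` lines are joined with "\n"
}

function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  let eventName = "message";
  let dataLines: string[] = [];
  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === "") {
      // A blank line dispatches the event collected so far (if it has data).
      if (dataLines.length > 0) {
        events.push({ event: eventName, data: dataLines.join("\n") });
      }
      eventName = "message";
      dataLines = [];
      continue;
    }
    if (line.startsWith(":")) continue; // comment line, often used as keep-alive
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // one optional space after the colon
    if (field === "event") eventName = value;
    else if (field === "data") dataLines.push(value);
  }
  return events;
}
```

A server pushing one device's state would write something like `event: device`, then `data: {"id":"d1","ip":"10.0.0.5"}`, then an empty line, and the table row for `d1` can drop its loading indicator as soon as that event arrives.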
PS. React is your choice; I would go Native JavaScript Web Components, so you have ZERO dependencies on Frameworks or Libraries for the next 30 JavaScript years
Many moons ago, I created a dashboard with PHP backend and Web Components front-end WITHOUT SSE: https://github.com/Danny-Engelman/ITpings
(no longer maintained)
Here is a general outline of how this approach might work:
On the server side, create an API endpoint that returns data for all devices in the table.
When the component is initially loaded, make a request to this endpoint to get the initial data for all devices.
Use this data to render the table, but show a loading indicator for each device that has not yet been polled.
Use a client-side timer to periodically poll each device that is still in a loading state.
When the data for a device is returned, update the table with the new information and remove the loading indicator.
This approach minimizes the number of API requests by polling devices only when necessary, while still providing a responsive user interface that updates in real-time.
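The client-side bookkeeping for the outline above can be sketched as two pure functions, which keeps it independent of React or any data-fetching library. The `DeviceRow` type and the function names are hypothetical; the fields follow the IP address and name mentioned in the question.

```typescript
// Each row is either still loading or carries the polled attributes.
type DeviceRow =
  | { id: string; status: "loading" }
  | { id: string; status: "done"; ip: string; name: string };

// Ids that still need polling; the timer only asks for these.
function pendingIds(rows: DeviceRow[]): string[] {
  return rows.filter((r) => r.status === "loading").map((r) => r.id);
}

// Merges a poll response into the table without touching finished rows.
function applyResults(rows: DeviceRow[], results: DeviceRow[]): DeviceRow[] {
  const byId = new Map<string, DeviceRow>();
  results.forEach((r) => byId.set(r.id, r));
  return rows.map((row) => {
    const update = byId.get(row.id);
    if (row.status === "loading" && update !== undefined && update.status === "done") {
      return update; // data arrived: the row stops showing its loading indicator
    }
    return row;
  });
}
```

Once `pendingIds` returns an empty list, the timer can be stopped.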
As for a name or pattern for this approach, it could be considered a form of progressive enhancement or lazy loading, where the initial data is loaded on the server-side and additional data is loaded on-demand as needed.
Both ways of making the component have their pros and cons, and the choice ultimately depends on the specific requirements and constraints of the project. However, here are some additional points to consider:
Way one:
This approach resembles what the React documentation calls "fetch-on-render", where each component requests its own data when it renders. It's a common pattern in modern web development.
The main advantage of this approach is its simplicity and modularity. Each "table row" component is responsible for fetching and displaying data for its own device, which makes it easy to reason about and test. It also allows for fine-grained caching and error handling at the component level.
The main disadvantage is the potential for network congestion and server overload, especially if there are a large number of devices to display. This can be mitigated by implementing server-side throttling and client-side caching, but it adds additional complexity.
Way two:
This approach has no widely used name of its own; it is closest to "fetch-then-render", where the data is requested in one place and the UI is rendered from the response. Here the client is responsible for tracking the state of each device and updating the table as new responses arrive.
The main advantage of this approach is its efficiency and scalability. With only one request to the server, there's less network overhead and server load. It also allows for more granular control over the UI state and transitions, which can be useful for complex interactions.
The main disadvantage is its complexity and brittleness. Managing the UI state and transitions can be difficult, especially when dealing with asynchronous data fetching and error handling. It also requires more client-side JavaScript and DOM manipulation, which can slow down the UI and increase the risk of bugs and performance issues.
In summary, both approaches have their trade-offs and should be evaluated based on the specific needs of the project. There's no one-size-fits-all solution or pattern, but there are several best practices and libraries that can help simplify the implementation and improve the user experience. Some resources to explore include:
React Query and RTK Query for data fetching and caching in React.
Suspense and Concurrent Mode for asynchronous rendering and data loading in React.
GraphQL and Apollo for server-driven data fetching and caching.
Redux and MobX for state management and data flow in React.
Progressive Web Apps (PWAs) and Service Workers for offline-first and resilient web applications.
Both approaches have their own advantages and disadvantages, and the best approach depends on the specific requirements and constraints of your project. However, given that polling each device takes a few seconds and you need to display a loading indicator for each specific device in the table, the first approach (making multiple parallel requests from frontend to an API endpoint that returns data for only one requested device) seems more suitable. This approach allows you to display a loading indicator for each specific device in the table and update each row independently as soon as the data for that device becomes available.
This approach is commonly known as "concurrent data fetching" or "parallel data fetching", and it is supported by many modern front-end libraries and frameworks, such as React Query and RTK Query. These libraries allow you to easily make multiple parallel requests and manage the caching and synchronization of the data.
To implement this approach, you can create a "table row" component that requests data only for the device whose data it displays, and the "device table" component will consist of several "table row" components that execute multiple queries in parallel each for their own device. You can also limit the number of parallel requests to avoid overloading the server.
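Limiting the number of parallel requests can be done with a small worker-pool helper. The sketch below is library-independent (it is not how React Query or RTK Query implement anything); `mapWithLimit` and the `worker` callback are hypothetical names, and `worker` would be whatever function calls the per-device endpoint.

```typescript
// Runs `worker` for every item, with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none are left.
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index]);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  const runners: Promise<void>[] = [];
  for (let i = 0; i < runnerCount; i++) {
    runners.push(run());
  }
  await Promise.all(runners);
  return results;
}
```

To update rows independently, the `worker` itself can report each device's result to the UI as soon as it resolves, rather than waiting for the whole array.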
To learn more about concurrent data fetching and its implementation using React Query or RTK Query, you can refer to their official documentation and tutorials. You can also find many articles and blog posts on this topic by searching for "concurrent data fetching" or "parallel data fetching" in Google or other search engines.
I have the following scenario, can anyone guide what's the best approach:
Front End —> Rest API —> SOAP API (Legacy applications)
The Legacy applications behave unpredictably; sometimes it's very slow, sometimes fast.
The following is what needs to be achieved:
- As and when data is available to Rest API, the results should be made available to the client
- Whatever info is available, show the intermediate results.
Can anyone share insights in how to design this system?
You have several options for doing that:
polling from the UI - will require some changes to the API: the initial call returns a URL where the results will become available, and the UI checks that URL periodically
websockets - will require changing the api
server-sent events - essentially keeping the http connection open and pushing new results as they are available - sounds the closest to what you want
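Whichever transport you pick, the REST layer needs somewhere to keep intermediate results while the legacy backend is still working. Here is a minimal in-memory sketch for the polling variant; `JobStore` and its method names are hypothetical, and a real service would need expiry and probably shared storage.

```typescript
interface JobSnapshot<T> {
  status: "in_progress" | "done";
  results: T[]; // intermediate results received so far
}

class JobStore<T> {
  private jobs = new Map<string, JobSnapshot<T>>();
  private counter = 0;

  // Initial REST call: register the job and hand back an id the client can poll.
  start(): string {
    this.counter += 1;
    const id = "job-" + this.counter;
    this.jobs.set(id, { status: "in_progress", results: [] });
    return id;
  }

  // Called whenever the slow SOAP backend delivers another piece of data.
  addResult(id: string, result: T): void {
    const job = this.jobs.get(id);
    if (job !== undefined && job.status === "in_progress") {
      job.results.push(result);
    }
  }

  finish(id: string): void {
    const job = this.jobs.get(id);
    if (job !== undefined) job.status = "done";
  }

  // Polling call: returns a copy so callers cannot mutate the stored state.
  snapshot(id: string): JobSnapshot<T> | undefined {
    const job = this.jobs.get(id);
    if (job === undefined) return undefined;
    return { status: job.status, results: job.results.slice() };
  }
}
```

The same store works for server-sent events or websockets: instead of waiting for the next poll, `addResult` would also push the new item to the open connection.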
You want some sort of event-based API that the API consumers can subscribe to.
Event-driven architectures come in many forms, from event notification ('hey, I have new data, come and get it') to message/payload delivery, and from relatively basic solutions to full-on publish/subscribe platforms that allow consumers to subscribe to one or more "topics" and offer event back-up and replay functionality.
If you don't want a full-on eventing platform, you could look at WebHooks.
A great way to get started will be to start familiarizing yourself with some event-based architecture patterns. That last link is for Chris Richardson's website, he's got a lot of great info on such architectures and would be well worth a look.
In terms of defining the event API, if you're familiar with OpenAPI, there's AsyncAPI, which is the async equivalent.
In terms of solutions, there are a few well-known platforms, including open source ones. The big cloud providers (Azure, GCP and AWS) also have async / event-based services you can use.
For more background there's this Wikipedia page (which I have not read, so I can't speak for its quality, but it does look detailed).
Update: Webhooks
Webhooks are a bit like an iceberg: there's more to them than might appear at first glance. A full-on eventing solution will have a very steep learning curve but will solve problems that you'll otherwise have to address separately (by writing your own code, etc.). Two big areas to think about:
Consumer management. How will you onboard new consumers? Is it a small handful of internal systems / URLs that you can manage through some basic config, manually? Or is it external facing for public third parties? If it's the latter, will you need to provide auto-provisioning through a secure developer portal or get them to email/submit details for manual set-up at your end?
Error handling & missed events. Let's say you have an event, you call the subscribing webhook - but there's no response (or an error). What do you do? Do you retry? If so, how often, for how long? Once the consumer is back up what do you do - did you save the old events to replay? How many events? How do you track who has received what?
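For the "how often, for how long" part, one common choice is exponential backoff with a cap. The function below is just a sketch of that schedule; the name `retryDelaysMs` and the numbers in the usage note are illustrative, not a recommendation for any particular system.

```typescript
// Returns the wait time before each retry attempt: the delay doubles
// every attempt, starting at `baseMs`, and never exceeds `capMs`.
function retryDelaysMs(maxAttempts: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, attempt)));
  }
  return delays;
}
```

For example, `retryDelaysMs(5, 1000, 10000)` gives `[1000, 2000, 4000, 8000, 10000]`. Once the attempts are exhausted, the event still has to go somewhere (a stored backlog per consumer, for instance) if you want to be able to replay it later.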
Polling
@Arnon is right to mention polling as an approach, but I'd only do it if you have no other choice, or if you have a very small number of internal systems doing the polling (i.e. it incurs low load) and you control both "ends" of the polling; in such a scenario it's a valid approach.
But if it's for an external API, you'll need to implement throttling to protect your systems, as you'll have limited control over who's calling you and how much. Caching will be another obvious topic to explore in a polling approach.
I am building a GraphQL server to wrap multiple RESTful APIs. Some of the APIs that I will be integrating are third party and some are ones that we own. We use Redis as a caching layer. Is it okay if I implement DataLoader caching on GraphQL? Will it have an effect on my existing Redis caching?
Dataloader does not only serve one purpose. In fact there are three purposes that dataloader serves.
Caching: You mentioned caching. I am assuming that you are building a GraphQL gateway/proxy in front of your REST APIs. Caching in this case means that when you need a specific resource and later need it again, you can fall back on the cached value. This caching happens in the memory of your JavaScript application and usually does not conflict with any other type of caching, e.g. on the network.
Batching: Since queries can be nested quite deeply you will ultimately get to a point where you request multiple values of the same resource type in different parts of your query execution. Dataloader will basically collect them and resolve resources like a cascade. Requests flow into a queue and are kept there until the execution cycle ends. Then they are all "released" at once (and probably can be resolved in batches). Also the delivered Promises are all resolved at once (even if some results come in earlier than others). This allows the next execution level to also happen within one cycle.
Deduplication: Let's say you fetch a list of BlogPost with a field author of type User. In this list multiple blog posts have been written by the same author. When the same key is requested twice, it will only be delivered once to the batch function. Dataloader will then take care of delivering the resources back by resolving the respective promises.
The point is that caching (1) and deduplication (3) can be achieved with a decent HTTP client that caches the requests (and not only the responses, meaning it does not fire another request when one is already running for that resource). This means that the interesting question is whether your REST API supports batch requests (e.g. api/user/1,2 in one request instead of api/user/1 and api/user/2). If so, using dataloader can massively improve the performance of your API.
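To illustrate how the three purposes fit together, here is a deliberately tiny hand-rolled loader. It is not the dataloader library and not its implementation (the real package has more options and a different scheduling mechanism); `TinyLoader` and `BatchFn` are hypothetical names, and the batch function is assumed to return values in the same order as the keys it received.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
}

class TinyLoader<K, V> {
  private batchFn: BatchFn<K, V>;
  private cache = new Map<K, Promise<V>>();
  private queue: Pending<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    // Caching and deduplication: a key seen before reuses the same promise.
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached;

    const promise = new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      if (this.queue.length === 1) {
        // Batching: the first key schedules a single flush that runs after
        // the current synchronous code has finished calling load().
        Promise.resolve().then(() => this.flush());
      }
    });
    this.cache.set(key, promise);
    return promise;
  }

  private flush(): void {
    const batch = this.queue;
    this.queue = [];
    // One call for all collected keys, e.g. api/user/1,2.
    this.batchFn(batch.map((item) => item.key)).then(
      (values) => batch.forEach((item, i) => item.resolve(values[i])),
      (error) => batch.forEach((item) => item.reject(error))
    );
  }
}
```

Note that the cache here lives inside one loader instance in process memory, which is why it sits alongside a Redis layer rather than replacing or interfering with it.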
Maybe you want to look into what Apollo is building right now with their RESTDatasource: https://www.apollographql.com/docs/apollo-server/v2/features/data-sources.html#REST-Data-Source
I need to develop a system that listens for changes to Salesforce objects and transfers them to my end.
Initially I considered the Salesforce Streaming API, which allows exactly that: create a push topic that subscribes to object notifications, and then have a set of clients that read them using long polling.
However, such an approach doesn't guarantee durability and reliable delivery of notifications, which I need.
What architecture would allow me to implement the same functionality in a reliable way?
One approach I have in mind is to create a Force.com application that uses Salesforce triggers to subscribe to notifications and then sends them over HTTPS to the cloud or my data server. Would this be a valid option, or are there any better ones?
I found two very good questions on salesforce.stackexchange.com covering this very topic in detail:
https://salesforce.stackexchange.com/questions/16587/integrating-a-real-time-notification-application-with-salesforce
https://salesforce.stackexchange.com/questions/20600/best-approach-for-a-package-to-respond-to-dml-events-dynamically-without-object
We need to keep our Firebase data in sync with other databases for full-text search (in ElasticSearch) and other kinds of queries that Firebase doesn't easily support.
This needs to be as close to real-time as possible, we can't just export a nightly dump of the Firebase JSON or anything like that, aside from the fact that this will get rather large.
My initial thought was to run a Node.js client which listens to the child_changed, child_added, child_removed etc. events of all the main lists, but this could get a bit unwieldy. Would it also be a reliable way of syncing if the client reconnects after a period of time?
My next thought was to maintain a list of "items changed" events and write to that every time an item is created/updated, similar to the Firebase work queue example. The queue could contain the full path to the data which has changed and the worker just consumes that and updates the local database accordingly.
The problem here is every bit of code which makes updates has to remember to write to this queue otherwise the two systems will get out of sync. Some proxy code shouldn't be too hard to write though.
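The proxy idea could look roughly like this: every write goes through one wrapper that records the changed path, so callers cannot forget the queue. This is a sketch with in-memory stand-ins only (no Firebase or ElasticSearch calls); `TrackedStore`, `applyChanges`, and the paths in the usage note are hypothetical.

```typescript
interface ChangeEvent {
  path: string;
  kind: "set" | "remove";
}

// Stand-in for the primary database: all writes must go through this class.
class TrackedStore {
  private data = new Map<string, unknown>();
  private changes: ChangeEvent[] = [];

  set(path: string, value: unknown): void {
    this.data.set(path, value);
    this.changes.push({ path, kind: "set" });
  }

  remove(path: string): void {
    if (this.data.delete(path)) {
      this.changes.push({ path, kind: "remove" });
    }
  }

  get(path: string): unknown {
    return this.data.get(path);
  }

  // Hands the pending change events to the worker and clears the queue.
  takeChanges(): ChangeEvent[] {
    const taken = this.changes;
    this.changes = [];
    return taken;
  }
}

// Worker: replays queued changes against the secondary index.
function applyChanges(store: TrackedStore, index: Map<string, unknown>): number {
  const events = store.takeChanges();
  for (const event of events) {
    if (event.kind === "set") {
      index.set(event.path, store.get(event.path));
    } else {
      index.delete(event.path);
    }
  }
  return events.length;
}
```

Because the worker re-reads the current value for each path instead of trusting the event payload, replaying the same event twice leaves the index in the same state.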
Has anyone else done anything similar with any success?
For search queries, you can integrate directly with ElasticSearch; there is no need to sync with a secondary database. Firebase has a blog post about integrating and a lib, Flashlight, to make this quick and painless.
Another option is to use the logstash-input-firebase Logstash plugin in order to listen to changes in your Firebase real-time database(s) and forward the data in real-time to Elasticsearch using an elasticsearch output.