Sharing data to client whenever API has results available? - responsive-design

I have the following scenario, can anyone guide what's the best approach:
Front End —> Rest API —> SOAP API (Legacy applications)
The Legacy applications behave unpredictably; sometimes it's very slow, sometimes fast.
The following is what needs to be achieved:
- As and when data is available to Rest API, the results should be made available to the client
- Whatever info is available, show the intermediate results.
Can anyone share insights in how to design this system?

you have several options to do that
polling from the UI - will require some changes to the API, the initial call will return a url where results will be available and the UI will check that out everytime
websockets - will require changing the api
server-sent events - essentially keeping the http connection open and pushing new results as they are available - sounds the closest to what you want

You want some sort of event-based API that the API consumers can subscribe to.
Event-driven architectures come in many forms - from event notification ('hey, I have new data, come and get it') to message/payload delivery; full-on publish/subscribe solutions to that allow consumers to subscribe to one or more "topics", with event back-up and replay functionality to relatively basic ones.
If you don't want a full-on eventing platform, you could look at WebHooks.
A great way to get started will be to start familiarizing yourself with some event-based architecture patterns. That last link is for Chris Richardson's website, he's got a lot of great info on such architectures and would be well worth a look.
In terms of the defining the event API, if you're familiar with OpenAPI, there's AsyncAPI which is the async equivalent.
In terms of solutions, there's a few well known platforms, including open source ones. The big cloud providers (Azure, GCP and AWS) will also have async / event based services you can use.
For more background there's this Wikipedia page (which I have not read - so can't speak for it's quality but it does look detailed).
Update: Webhooks
Webhooks are a bit like an ice-berg, there's more to them than might appear at first glance. A full-on eventing solution will have a very steep learning curve but will solve problems that you'll otherwise have to address separately (write your own code, etc). Two big areas to think about:
Consumer management. How will you onboard new consumers? Is it a small handful of internal systems / URLs that you can manage through some basic config, manually? Or is it external facing for public third parties? If it's the latter, will you need to provide auto-provisioning through a secure developer portal or get them to email/submit details for manual set-up at your end?
Error handling & missed events. Let's say you have an event, you call the subscribing webhook - but there's no response (or an error). What do you do? Do you retry? If so, how often, for how long? Once the consumer is back up what do you do - did you save the old events to replay? How many events? How do you track who has received what?
Polling
#Arnon is right to mention polling as an approach but I'd only do it if you have no other choice, or, if you have a very small number of internal system doing the polling, i.e - incurs low load, and you control both "ends" of the polling; in such a scenario its a valid approach.
But if its for an external API you'll need to implement throttling to protect your systems, as you'll have limited control over who's calling you and how much. Caching will be another obvious topic to explore in a polling approach.

Related

The better way to build a component displaying the states of multiple devices using React

This bounty has ended. Answers to this question are eligible for a +50 reputation bounty. Bounty grace period ends in 16 hours.
Evgen is looking for a canonical answer.
Suppose I need to make a component (using React) that displays a devices state table that displays various device attributes such as their IP address, name, etc. Polling each device takes a few seconds, so you need to display a loading indicator for each specific device in the table.
I have several ways to make a similar component:
On the server side, I can create an API endpoint (like GET /device_info) that returns data for only one requested device and make multiple parallel requests from fontend to that endpoint for each device.
I can create an API endpoint (like GET /devices_info) that returns data for the entire list of devices at once on the server and make one request to it at once for the entire list of devices from frontend.
Each method has its pros and cons:
Way one:
Prons:
Easy to make. We make a "table row" component that requests data only for the device whose data it displays. The "device table" component will consist of several "table row" components that execute multiple queries in parallel each for their own device. This is especially true if you are using libraries such as React Query or RTK Query which support this behavior out of the box;
Cons:
Many requests to the endpoint, possibly hundreds (although the number of parallel requests can be limited);
If, for some reason, synchronization of access to some shared resource on the server side is required and the server supports several workers, then synchronization between workers can be very difficult to achieve, especially if each worker is a separate process;
Way two:
Prons:
One request to the endpoint;
There are no problems with access to some shared resource, everything is controlled within a single request on the server (because guaranteed there will be one worker for the request on the server side);
Cons:
It's hard to make. Since one request essentially has several intermediate states, since polling different devices takes different times, you need to make periodic requests from the UI to get an updated state, and you also need to support an interface that will support several states for each device such as: "not pulled", "in progress", "done";
With this in mind, my questions are:
What is the better way to make described component?
Does the better way have a name? Maybe it's some kind of pattern?
Maybe you know a great book/article/post that describes a solution to a similar problem?
that displays a devices state
The component asking for a device state is so... 2010?
If your device knows its state, then have your device send its state to the component
SSE - Server Sent Events, and the EventSource API
https://developer.mozilla.org/.../API/Server-sent_events
https://developer.mozilla.org/.../API/EventSource
PS. React is your choice; I would go Native JavaScript Web Components, so you have ZERO dependencies on Frameworks or Libraries for the next 30 JavaScript years
Many moons ago, I created a dashboard with PHP backend and Web Components front-end WITHOUT SSE: https://github.com/Danny-Engelman/ITpings
(no longer maintained)
Here is a general outline of how this approach might work:
On the server side, create an API endpoint that returns data for all devices in the table.
When the component is initially loaded, make a request to this endpoint to get the initial data for all devices.
Use this data to render the table, but show a loading indicator for each device that has not yet been polled.
Use a client-side timer to periodically poll each device that is still in a loading state.
When the data for a device is returned, update the table with the new information and remove the loading indicator.
This approach minimizes the number of API requests by polling devices only when necessary, while still providing a responsive user interface that updates in real-time.
As for a name or pattern for this approach, it could be considered a form of progressive enhancement or lazy loading, where the initial data is loaded on the server-side and additional data is loaded on-demand as needed.
Both ways of making the component have their pros and cons, and the choice ultimately depends on the specific requirements and constraints of the project. However, here are some additional points to consider:
Way one:
This approach is known as "rendering by fetching" or "server-driven UI", where the server is responsible for providing the data needed to render the UI. It's a common pattern in modern web development, especially with the rise of GraphQL and serverless architectures.
The main advantage of this approach is its simplicity and modularity. Each "table row" component is responsible for fetching and displaying data for its own device, which makes it easy to reason about and test. It also allows for fine-grained caching and error handling at the component level.
The main disadvantage is the potential for network congestion and server overload, especially if there are a large number of devices to display. This can be mitigated by implementing server-side throttling and client-side caching, but it adds additional complexity.
Way two:
This approach is known as "rendering by rendering" or "client-driven UI", where the client is responsible for driving the rendering logic based on the available data. It's a more traditional approach that relies on client-side JavaScript to manipulate the DOM and update the UI.
The main advantage of this approach is its efficiency and scalability. With only one request to the server, there's less network overhead and server load. It also allows for more granular control over the UI state and transitions, which can be useful for complex interactions.
The main disadvantage is its complexity and brittleness. Managing the UI state and transitions can be difficult, especially when dealing with asynchronous data fetching and error handling. It also requires more client-side JavaScript and DOM manipulation, which can slow down the UI and increase the risk of bugs and performance issues.
In summary, both approaches have their trade-offs and should be evaluated based on the specific needs of the project. There's no one-size-fits-all solution or pattern, but there are several best practices and libraries that can help simplify the implementation and improve the user experience. Some resources to explore include:
React Query and RTK Query for data fetching and caching in React.
Suspense and Concurrent Mode for asynchronous rendering and data loading in React.
GraphQL and Apollo for server-driven data fetching and caching.
Redux and MobX for state management and data flow in React.
Progressive Web Apps (PWAs) and Service Workers for offline-first and resilient web applications.
Both approaches have their own advantages and disadvantages, and the best approach depends on the specific requirements and constraints of your project. However, given that polling each device takes a few seconds and you need to display a loading indicator for each specific device in the table, the first approach (making multiple parallel requests from frontend to an API endpoint that returns data for only one requested device) seems more suitable. This approach allows you to display a loading indicator for each specific device in the table and update each row independently as soon as the data for that device becomes available.
This approach is commonly known as "concurrent data fetching" or "parallel data fetching", and it is supported by many modern front-end libraries and frameworks, such as React Query and RTK Query. These libraries allow you to easily make multiple parallel requests and manage the caching and synchronization of the data.
To implement this approach, you can create a "table row" component that requests data only for the device whose data it displays, and the "device table" component will consist of several "table row" components that execute multiple queries in parallel each for their own device. You can also limit the number of parallel requests to avoid overloading the server.
To learn more about concurrent data fetching and its implementation using React Query or RTK Query, you can refer to their official documentation and tutorials. You can also find many articles and blog posts on this topic by searching for "concurrent data fetching" or "parallel data fetching" in Google or other search engines.

SalesForce Notifications - Reliable Integration

I need to develop a system that is listening to the changes happened with SalesForce objects and transfers them to my end.
Initially I considered SalesForce Streaming API that allows exactly that - create a push topic that subscribes to objects notifications and later have a set of clients that are reading them using long polling.
However such approach doesn't guarantee durability and reliable delivery of notifications - which I am in need.
What will be the architecture allowing to implement the same functionality in reliable way?
One approach I have in mind is create a Force.com applications that uses SalesForce triggers to subscribe to notifications and later just sends them using HTTPS to the cloud or my Data Server. Will this be a valid option - or are there any better ones?
I two very good questions on salesforce.stackexchange.com covering this very topic in details:
https://salesforce.stackexchange.com/questions/16587/integrating-a-real-time-notification-application-with-salesforce
https://salesforce.stackexchange.com/questions/20600/best-approach-for-a-package-to-respond-to-dml-events-dynamically-without-object

State of Map-Reduce on Appengine?

There is appengine-mapreduce which seems the official way to do things on AppEngine. But there seems no documentation besides some hacked together Wiki Pages and lengthy videos. There are statements that the lib only supports the map step. But the source indicates that there are also implementations for shuffle.
A Version of this appengine-mapreduce library seems also to be included in the SDK but it not blessed for public use. So you basically are expected to load the library twice into your runtime.
Then there is appengine-pipeline. "A primary use-case of the API is connecting together various App Engine MapReduces into a computational pipeline." But there also seems pipeline-related code in the appengine-mapreduce library.
So where do I start to find out how this all fits together? Which is the library to call from my project. Is there any decent documentation on appengine-mapreduce besides parsing change logs?
Which is the library to call from my project.
They serve different purposes, and you've provided no details about what you're attempting to do.
The most fundamental layer here is the task queue, which lets you schedule background work that can be highly parallelized. This is fan-out. Let's say you had a list of 1000 websites, and you wanted to check the response time for each one and send an email for any site that takes more than 5 seconds to load. By running these as concurrent tasks, you can complete the work much faster than if you checked all 1000 sites in sequence.
Now let's say you don't want to send an email for every slow site, you just want to check all 1000 sites and send one summary email that says how many took more than 5 seconds and how many took fewer. This is fan-in. It's trickier with the task queue, because you need to know when all tasks have completed, and you need to collect and summarize their results.
Enter the Pipeline API. The Pipeline API abstracts the task queue to make fan-in easier. You write what looks like synchronous, procedural code, but uses Python futures and is executed (as much as possible) in parallel. The Pipeline API keeps track of task dependencies and collects results to facilitate building distributed workflows.
The MapReduce API wraps the Pipeline API to facilitate a specific type of distributed workflow: mapping the results of a piece of work into a set of key/value pairs, and reducing multiple sets of results to one by combining their values.
So they provide increasing layers of abstraction and convenience around a common system of distributed task execution. The right solution depends on what you're trying to accomplish.
There is offical documentation here: https://developers.google.com/appengine/docs/java/dataprocessing/

What's the best way for the client app to immediately react to an update in the database?

What is the best way to program an immediate reaction to an update to data in a database?
The simplest method I could think of offhand is a thread that checks the database for a particular change to some data and continually waits to check it again for some predefined length of time. This solution seems to be wasteful and suboptimal to me, so I was wondering if there is a better way.
I figure there must be some way, after all, a web application like gmail seems to be able to update my inbox almost immediately after a new email was sent to me. Surely my client isn't continually checking for updates all the time. I think the way they do this is with AJAX, but how AJAX can behave like a remote function call I don't know. I'd be curious to know how gmail does this, but what I'd most like to know is how to do this in the general case with a database.
Edit:
Please note I want to immediately react to the update in the client code, not in the database itself, so as far as I know triggers can't do this. Basically I want the USER to get a notification or have his screen updated once the change in the database has been made.
You basically have two issues here:
You want a browser to be able to receive asynchronous events from the web application server without polling in a tight loop.
You want the web application to be able to receive asynchronous events from the database without polling in a tight loop.
For Problem #1
See these wikipedia links for the type of techniques I think you are looking for:
Comet
Reverse AJAX
HTTP Server Push
EDIT: 19 Mar 2009 - Just came across ReverseHTTP which might be of interest for Problem #1.
For Problem #2
The solution is going to be specific to which database you are using and probably the database driver your server uses too. For instance, with PostgreSQL you would use LISTEN and NOTIFY. (And at the risk of being down-voted, you'd probably use database triggers to call the NOTIFY command upon changes to the table's data.)
Another possible way to do this is if the database has an interface to create stored procedures or triggers that link to a dynamic library (i.e., a DLL or .so file). Then you could write the server signalling code in C or whatever.
On the same theme, some databases allow you to write stored procedures in languages such as Java, Ruby, Python and others. You might be able to use one of these (instead of something that compiles to a machine code DLL like C does) for the signalling mechanism.
Hope that gives you enough ideas to get started.
I figure there must be some way, after
all, web application like gmail seem
to update my inbox almost immediately
after a new email was sent to me.
Surely my client isn't continually
checking for updates all the time. I
think the way they do this is with
AJAX, but how AJAX can behave like a
remote function call I don't know. I'd
be curious to know how gmail does
this, but what I'd most like to know
is how to do this in the general case
with a database.
Take a peek with wireshark sometime... there's some google traffic going on there quite regularly, it appears.
Depending on your DB, triggers might help. An app I wrote relies on triggers but I use a polling mechanism to actually 'know' that something has changed. Unless you can communicate the change out of the DB, some polling mechanism is necessary, I would say.
Just my two cents.
Well, the best way is a database trigger. Depends on the ability of your DBMS, which you haven't specified, to support them.
Re your edit: The way applications like Gmail do it is, in fact, with AJAX polling. Install the Tamper Data Firefox extension to see it in action. The trick there is to keep your polling query blindingly fast in the "no news" case.
Unfortunately there's no way to push data to a web browser - you can only ever send data as a response to a request - that's just the way HTTP works.
AJAX is what you want to use though: calling a web service once a second isn't excessive, provided you design the web service to ensure it receives a small amount of data, sends a small amount back, and can run very quickly to generate that response.

.NET CF mobile device application - best methodology to handle potential offline-ness?

I'm building a mobile application in VB.NET (compact framework), and I'm wondering what the best way to approach the potential offline interactions on the device. Basically, the devices have cellular and 802.11, but may still be offline (where there's poor reception, etc). A driver will scan boxes as they leave his truck, and I want to update the new location - immediately if there's network signal, or queued if it's offline and handled later. It made me think, though, about how to handle offline-ness in general.
Do I cache as much data to the device as I can so that I use it if it's offline - Essentially, each device would have a copy of the (relevant) production data on it? Or is it better to disable certain functionality when it's offline, so as to avoid the headache of synchronization later? I know this is a pretty specific question that depends on my app, but I'm curious to see if others have taken this route.
Do I build the application itself to act as though it's always offline, submitting everything to a local queue of sorts that's owned by a local class (essentially abstracting away the online/offline thing), and then have the class submit things to the server as it can? What about data lookups - how can those be handled in a "Semi-live" fashion?
Or should I have the application attempt to submit requests to the server directly, in real-time, and handle it if it itself request fails? I can see a potential problem of making the user wait for the timeout, but is this the most reliable way to do it?
I'm not looking for a specific solution, but really just stories of how developers accomplish this with the smoothest user experience possible, with a link to a how-to or heres-what-to-consider or something like that. Thanks for your pointers on this!
We can't give you a definitive answer because there is no "right" answer that fits all usage scenarios. For example if you're using SQL Server on the back end and SQL CE locally, you could always set up merge replication and have the data engine handle all of this for you. That's pretty clean. Using the offline application block might solve it. Using store and forward might be an option.
You could store locally and then roll your own synchronization with a direct connection, web service of WCF service used when a network is detected. You could use MSMQ for delivery.
What you have to think about is not what the "right" way is, but how your implementation will affect application usability. If you disable features due to lack of connectivity, is the app still usable? If you have stale data, is that a problem? Maybe some critical data needs to be transferred when you have GSM/GPRS (which typically isn't free) and more would be done when you have 802.11. Maybe you can run all day with lookup tables pulled down in the morning and upload only transactions, with the device tracking what changes it's made.
Basically it really depends on how it's used, the nature of the data, the importance of data transactions between fielded devices, the effect of data latency, and probably other factors I can't think of offhand.
So the first step is to determine how the app needs to be used, then determine the infrastructure and architecture to provide the connectivity and data access required.
I haven't used it myself, but have you looked into the "store and forward" capabilities of the CF? It may suit your needs. I believe it uses an Exchange mailbox as a message queue to send SOAP packets to and from the device.
The best way to approach this is to always work offline, then use message queues to handle sending changes to and from the device. When the driver marks something as delivered, for example, update the item as delivered in your local store and also place a message in an outgoing queue to tell the server it's been delivered. When the connection is up, send any queued items back to the server and get any messages that have been queued up from the server.

Resources